
Configuration Reference

Complete reference for Alignmenter configuration files.

Run Configuration

Run configs are YAML files (.yaml) that specify the model, persona, dataset, and evaluation parameters for a run.

Full Example

# configs/brand.yaml
model: "openai:gpt-4o"
persona: "configs/persona/brand.yaml"
dataset: "datasets/test_conversations.jsonl"

evaluation:
  # Score thresholds (fail if below)
  min_authenticity: 0.80
  min_safety: 0.95
  min_stability: 0.85

  # Metric weights (for overall score)
  authenticity_weight: 0.5
  safety_weight: 0.3
  stability_weight: 0.2

generation:
  # Only when --generate-transcripts is used
  temperature: 0.7
  max_tokens: 500
  top_p: 1.0

safety:
  # Keyword patterns to check
  violation_patterns:
    - "hate_speech"
    - "violence"
    - "self_harm"

  # Offline classifier (if installed)
  use_offline_classifier: false

judge:
  # Optional LLM judge
  enabled: false
  provider: "openai:gpt-4o-mini"
  sample_rate: 0.2
  budget: 1.00
  strategy: "random"  # random, on_failure, stratified

output:
  dir: "reports/"
  format: "html"  # html, json, csv
  include_json: true
  include_csv: true
  open_browser: true

reproducibility:
  seed: 42
  cache_responses: true
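
The evaluation section above pairs a minimum threshold with a weight for each metric. A minimal sketch of how the two could combine, assuming the overall score is a plain weighted sum (the reference does not state the formula). The `evaluate` function and the `scores` dict are hypothetical names for illustration, not part of Alignmenter:

```python
# Hypothetical sketch: mirrors the `evaluation` section of configs/brand.yaml.
EVALUATION = {
    "min_authenticity": 0.80,
    "min_safety": 0.95,
    "min_stability": 0.85,
    "authenticity_weight": 0.5,
    "safety_weight": 0.3,
    "stability_weight": 0.2,
}

METRICS = ("authenticity", "safety", "stability")


def evaluate(scores, evaluation=EVALUATION):
    """Return (overall, failures) for a dict of per-metric scores in [0, 1]."""
    # Assumption: the overall score is the weighted sum of the three metrics.
    overall = sum(scores[m] * evaluation[f"{m}_weight"] for m in METRICS)
    # A metric fails when its score is below the configured minimum.
    failures = [m for m in METRICS if scores[m] < evaluation[f"min_{m}"]]
    return overall, failures


overall, failures = evaluate(
    {"authenticity": 0.9, "safety": 0.9, "stability": 0.9}
)
print(round(overall, 3), failures)
```

With every metric at 0.9, only safety falls below its 0.95 threshold, so a run can fail on one metric even when the weighted overall score looks healthy.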

Persona Configuration

Personas define your brand voice.

Full Example

# configs/persona/brand.yaml
id: brand-assistant
name: "Brand Assistant"
version: "1.0.0"
description: "Professional, helpful, evidence-driven support bot"

voice:
  tone:
    - professional
    - helpful
    - precise
    - friendly

  formality: business_casual  # formal, business_casual, casual

  verbosity: balanced  # concise, balanced, detailed

  lexicon:
    preferred:
      - "I'd be happy to"
      - "let me assist you"
      - "based on our analysis"
      - "the data indicates"

    avoided:
      - "no problem"
      - "sure thing"
      - "absolutely"
      - "lol"
      - "hype"

examples:
  - "I'd be happy to help you with that request. Let me look into the details."
  - "Based on our analysis, the baseline performance shows a 15% improvement."
  - "The data indicates strong signal across all test cases."
  - "Let me assist you in finding the right solution for your needs."

traits:
  uses_evidence: true
  cites_sources: true
  asks_clarifying_questions: true
  maintains_context: true

guidelines:
  - "Always acknowledge the user's request before responding"
  - "Use data and examples to support claims"
  - "Avoid slang and overly casual language"
  - "Be precise but approachable"

anti_patterns:
  - "Don't use exclamation marks excessively"
  - "Avoid saying 'I think' or 'I feel'"
  - "Don't make unsupported claims"

Persona Fields

| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier |
| name | string | Display name |
| version | string | Version number (for tracking changes) |
| description | string | Brief summary |
| voice.tone | list | Personality traits |
| voice.formality | enum | formal, business_casual, casual |
| voice.verbosity | enum | concise, balanced, detailed |
| voice.lexicon.preferred | list | Words to use |
| voice.lexicon.avoided | list | Words to avoid |
| examples | list | Reference responses |
| traits | dict | Boolean trait flags |
| guidelines | list | Behavioral rules |
| anti_patterns | list | What not to do |
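
The field table can be turned into a quick pre-flight check on a persona that has already been loaded into a dict. This is a sketch only: `check_persona` is a hypothetical helper, and treating the four string fields as mandatory is an assumption, since the table does not say which fields are required.

```python
# Allowed enum values, taken from the persona field table.
FORMALITY = {"formal", "business_casual", "casual"}
VERBOSITY = {"concise", "balanced", "detailed"}
# Assumption: these string fields are mandatory (the table does not say so).
STRING_FIELDS = ("id", "name", "version", "description")


def check_persona(persona):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    for key in STRING_FIELDS:
        if not isinstance(persona.get(key), str):
            problems.append(f"{key} must be a string")
    voice = persona.get("voice", {})
    if voice.get("formality") not in FORMALITY:
        problems.append("voice.formality must be one of: "
                        + ", ".join(sorted(FORMALITY)))
    if voice.get("verbosity") not in VERBOSITY:
        problems.append("voice.verbosity must be one of: "
                        + ", ".join(sorted(VERBOSITY)))
    # traits is documented as a dict of boolean flags.
    for flag, value in persona.get("traits", {}).items():
        if not isinstance(value, bool):
            problems.append(f"traits.{flag} must be true or false")
    return problems


persona = {
    "id": "brand-assistant",
    "name": "Brand Assistant",
    "version": "1.0.0",
    "description": "Professional, helpful, evidence-driven support bot",
    "voice": {"formality": "business_casual", "verbosity": "balanced"},
    "traits": {"uses_evidence": True, "cites_sources": True},
}
print(check_persona(persona))
```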

Dataset Format

Datasets are JSONL (one JSON object per line).

Basic Format

{"session_id": "001", "turn": 1, "user": "Hello!", "assistant": "Hi! How can I help you today?"}
{"session_id": "001", "turn": 2, "user": "Tell me about your product", "assistant": "Our product is..."}
{"session_id": "002", "turn": 1, "user": "What's the weather?", "assistant": "I can help with that..."}

Required Fields

  • session_id (string) - Groups turns into conversations
  • turn (int) - Order within session
  • user (string) - User message

Optional Fields

  • assistant (string) - AI response to score; if omitted, a fresh response is generated
  • metadata (object) - Custom fields
  • timestamp (string) - ISO 8601 timestamp
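
The required fields above are enough to validate a dataset and rebuild conversations before a run. A minimal sketch using only the standard library. `load_sessions` is a hypothetical helper, not an Alignmenter function, and it accepts any iterable of lines, such as an open file:

```python
import json
from collections import defaultdict

# Required fields and their expected JSON types, per the list above.
REQUIRED = {"session_id": str, "turn": int, "user": str}


def load_sessions(lines):
    """Parse JSONL lines and group records by session_id, sorted by turn."""
    sessions = defaultdict(list)
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        for field, expected in REQUIRED.items():
            if not isinstance(record.get(field), expected):
                raise ValueError(
                    f"line {lineno}: {field} must be {expected.__name__}"
                )
        sessions[record["session_id"]].append(record)
    for turns in sessions.values():
        turns.sort(key=lambda r: r["turn"])
    return dict(sessions)


sample = [
    '{"session_id": "001", "turn": 2, "user": "Tell me about your product"}',
    '{"session_id": "001", "turn": 1, "user": "Hello!"}',
    '{"session_id": "002", "turn": 1, "user": "What is the weather?"}',
]
sessions = load_sessions(sample)
print({sid: [t["turn"] for t in turns] for sid, turns in sessions.items()})
```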

Extended Format

The record below is pretty-printed for readability; in a dataset file it must occupy a single line.

{
  "session_id": "prod_001",
  "turn": 1,
  "user": "What's your refund policy?",
  "assistant": "Our refund policy allows returns within 30 days...",
  "metadata": {
    "user_id": "user_12345",
    "timestamp": "2025-11-06T14:32:00Z",
    "channel": "web_chat"
  }
}

Special Cases

Re-generation mode: Omit assistant to generate fresh responses:

{"session_id": "001", "turn": 1, "user": "Hello!"}
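
To see how many records a run would have to generate, a dataset can be partitioned by whether `assistant` is present. This is an illustrative sketch: `split_for_generation` is a hypothetical name, and treating an empty `assistant` string the same as a missing one is an assumption.

```python
import json


def split_for_generation(lines):
    """Partition JSONL records into (cached, pending) lists."""
    cached, pending = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: a missing or empty "assistant" means "generate one".
        (cached if record.get("assistant") else pending).append(record)
    return cached, pending


cached, pending = split_for_generation([
    '{"session_id": "001", "turn": 1, "user": "Hello!"}',
    '{"session_id": "002", "turn": 1, "user": "Hi", "assistant": "Hello!"}',
])
print(len(cached), "cached,", len(pending), "to generate")
```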

ChatGPT export: Use alignmenter dataset convert:

alignmenter dataset convert chatgpt_export.json dataset.jsonl --from-format chatgpt


Environment Variables

API Keys

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

Cache

export ALIGNMENTER_CACHE_DIR="$HOME/.alignmenter/cache"  # the shell does not expand ~ inside double quotes

Default: ~/.cache/alignmenter/ on Linux/Mac, %LOCALAPPDATA%\alignmenter\ on Windows.
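
The lookup order described above (environment override first, then a per-platform default) can be sketched as follows. `resolve_cache_dir` is a hypothetical helper rather than Alignmenter's actual code, and the fallback used when `LOCALAPPDATA` is unset is an assumption.

```python
import os
import sys
from pathlib import Path


def resolve_cache_dir(env=os.environ, platform=sys.platform):
    """Return the cache directory: env override first, then platform default."""
    override = env.get("ALIGNMENTER_CACHE_DIR")
    if override:
        # expanduser() also handles a literal "~" that the shell left unexpanded.
        return Path(override).expanduser()
    if platform.startswith("win"):
        base = env.get("LOCALAPPDATA")
        if base:
            return Path(base) / "alignmenter"
        # Assumption: fall back to the conventional location if unset.
        return Path.home() / "AppData" / "Local" / "alignmenter"
    return Path.home() / ".cache" / "alignmenter"


print(resolve_cache_dir())
```

Passing `env` and `platform` as parameters keeps the function easy to exercise in tests without touching the real environment.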

Logging

export ALIGNMENTER_LOG_LEVEL="DEBUG"  # DEBUG, INFO, WARNING, ERROR
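
For scripts that wrap the CLI, the same variable can drive Python's standard `logging` module. This is a sketch under assumptions: the `"alignmenter"` logger name and the `INFO` default are guesses, and `configure_logging` is a hypothetical helper.

```python
import logging
import os

# The four levels listed in the reference.
ALLOWED_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR")


def configure_logging(env=os.environ):
    """Set the logger level from ALIGNMENTER_LOG_LEVEL and return it."""
    # Assumption: INFO is used when the variable is unset.
    name = env.get("ALIGNMENTER_LOG_LEVEL", "INFO").upper()
    if name not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported log level: {name}")
    level = getattr(logging, name)
    # Assumption: the package logs under the "alignmenter" logger name.
    logging.getLogger("alignmenter").setLevel(level)
    return level


print(configure_logging({"ALIGNMENTER_LOG_LEVEL": "DEBUG"}))
```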

Models

export ALIGNMENTER_DEFAULT_MODEL="openai:gpt-4o"

Model Identifiers

OpenAI

openai:gpt-4o
openai:gpt-4o-mini
openai:gpt-4-turbo
openai-gpt:custom-model-id  # Custom GPTs / fine-tunes

Anthropic

anthropic:claude-3-5-sonnet-20241022
anthropic:claude-3-5-haiku-20241022
anthropic:claude-3-opus-20240229

Local

local:vllm:localhost:8000/v1/completions
local:ollama:llama2
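
All identifiers above share a `provider:model` shape, with local identifiers carrying an extra backend segment. A minimal parsing sketch follows. `parse_model_id` is a hypothetical helper, and splitting on the first colon only is an assumption made so that host:port targets stay intact.

```python
def parse_model_id(identifier):
    """Split 'provider:model' on the first colon only."""
    provider, sep, model = identifier.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {identifier!r}")
    return provider, model


def parse_local_target(model):
    """Split the remainder of a local identifier into (backend, target)."""
    backend, sep, target = model.partition(":")
    if not sep or not backend or not target:
        raise ValueError(f"expected 'backend:target', got {model!r}")
    return backend, target


provider, model = parse_model_id("local:vllm:localhost:8000/v1/completions")
print(provider, parse_local_target(model))
print(parse_model_id("openai:gpt-4o"))
```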

Next Steps