Overview
Rule-based conversation grouping (time gaps, sender chaining) is a blunt heuristic. An optional LLM validation pass would catch grouping errors, filter low-quality samples, and improve overall fine-tuning data quality.
What the LLM validator should do
- Conversation boundary detection — judge whether adjacent message groups are topically continuous, catching cases where a time gap split a real conversation or merged two unrelated ones
- Sample quality filtering — score each extracted sample and discard noise (one-word replies, pure greetings, out-of-context fragments)
- Conversation coherence check — given a full multi-turn group, verify it reads as a coherent thread before it enters training data
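The first two checks above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Message`, `AskLLM`, `render`, `same_conversation`, and `quality_score` are hypothetical names rather than existing DialogSmith code, the prompts are placeholders, and the LLM is passed in as a plain callable so that no particular client library is assumed. A coherence check would follow the same shape as `same_conversation`.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type: any function that sends a prompt to an LLM and returns its text reply.
AskLLM = Callable[[str], str]


@dataclass
class Message:
    sender: str
    text: str


def render(group: List[Message]) -> str:
    """Format a message group as 'sender: text' lines for inclusion in a prompt."""
    return "\n".join(f"{m.sender}: {m.text}" for m in group)


def same_conversation(ask: AskLLM, left: List[Message], right: List[Message]) -> bool:
    """Boundary detection: are two adjacent groups topically continuous?"""
    prompt = (
        "Do these two message groups belong to the same conversation? "
        "Answer YES or NO.\n\n"
        f"Group A:\n{render(left)}\n\nGroup B:\n{render(right)}"
    )
    # Anything that does not start with YES is treated as "not continuous".
    return ask(prompt).strip().upper().startswith("YES")


def quality_score(ask: AskLLM, sample: List[Message]) -> int:
    """Quality filtering: 1 (noise) to 5 (useful); unparseable replies count as 1."""
    prompt = (
        "Rate this sample from 1 (noise) to 5 (useful training data). "
        "Reply with a single digit.\n\n" + render(sample)
    )
    reply = ask(prompt).strip()
    # Use a set so that an empty reply is not accepted (an empty string is "in" any string).
    return int(reply[0]) if reply[:1] in set("12345") else 1
```

Treating unparseable replies as the lowest score is a conservative choice made for this sketch; a real implementation might retry or log them instead.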
Design
- LLM validation is opt-out: it is enabled by default whenever an API key is present
- Controlled via an environment variable:
DIALOGSMITH_LLM_VALIDATE=true/false
- When disabled, the pipeline falls back to rule-based grouping only (current behaviour)
- To limit cost and latency at scale, rule-based logic does the heavy lifting first; the LLM then runs a filter pass over extracted samples rather than over every raw message
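The two-stage flow described in this section might look something like the sketch below. It assumes the rule-based grouper and the LLM filter are both supplied as callables; `build_samples`, `group_by_rules`, `keep_sample`, and the `Sample` alias are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical alias: one extracted sample, as a list of message texts.
Sample = List[str]


def build_samples(
    raw_messages: Iterable[str],
    group_by_rules: Callable[[Iterable[str]], List[Sample]],
    keep_sample: Optional[Callable[[Sample], bool]] = None,
) -> List[Sample]:
    """Run rule-based grouping first, then an optional LLM filter pass.

    The filter is called once per extracted sample, never per raw message.
    Passing keep_sample=None corresponds to validation being disabled, in
    which case the rule-based output is returned unchanged.
    """
    samples = group_by_rules(raw_messages)
    if keep_sample is None:
        return samples
    return [s for s in samples if keep_sample(s)]
```

Because the filter only ever sees grouped samples, the number of LLM calls scales with the number of samples rather than with the number of raw messages.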
Configuration
DIALOGSMITH_LLM_VALIDATE=true # default: true if API key present
DIALOGSMITH_LLM_MODEL=... # model to use for validation
ANTHROPIC_API_KEY=... # or equivalent
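One possible way to resolve these settings at startup is sketched below. The function name `llm_validation_enabled` is hypothetical. Two behaviours here are assumptions rather than anything the issue specifies: an explicit `true` without an API key still leaves validation off (there is nothing to call), and any value other than `false` is treated as if the variable were unset.

```python
import os
from typing import Mapping, Optional


def llm_validation_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Decide whether the LLM validation pass should run.

    Opt-out semantics: on by default when an API key is present,
    off when DIALOGSMITH_LLM_VALIDATE is explicitly set to 'false'.
    """
    env = os.environ if env is None else env
    has_key = bool(env.get("ANTHROPIC_API_KEY", "").strip())
    flag = env.get("DIALOGSMITH_LLM_VALIDATE", "").strip().lower()
    if flag == "false":
        return False
    # Assumption: without an API key the pass cannot run, even if the flag says 'true'.
    return has_key
```

Accepting the environment as a parameter keeps the function easy to test without mutating `os.environ`.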
Acceptance criteria
DIALOGSMITH_LLM_VALIDATE=false