How `compress()` processes messages from start to finish.
messages
|
v
classify ──> dedup ──> merge consecutive ──> summarize ──> size guard
| | | | |
| | | | v
| | | | preserve original
| | | | if summary >= original
| | | v
| | | LLM or deterministic
| | v
| | same-role groups
| v
| exact + fuzzy
v
T0/T2/T3 + preservation rules
Every message is evaluated against preservation rules in order. Messages that survive all checks are eligible for compression.
The classifier (`classifyAll`) applies rules in this order:
- Role in `preservelist` (default: `['system']`) -> preserved
- Within `recencyWindow` -> preserved
- Has `tool_calls` -> preserved
- Content < 120 chars -> preserved
- Already compressed (`[summary:`, `[summary#`, or `[truncated` prefix) -> preserved
- High importance score (when `importanceScoring: true`, score >= `importanceThreshold`) -> preserved
- Marked as duplicate by dedup analysis -> dedup path
- Superseded by a later correction (when `contradictionDetection: true`) -> contradiction path
- Contains code fences with >= 80 chars of prose -> code-split path
- Has code fences with < 80 chars prose -> preserved
- Classified as hard T0 (code, JSON, SQL, API keys, etc.) -> preserved
- Valid JSON -> preserved
- Everything else -> compress
See Preservation rules for classification tiers and the hard vs. soft T0 distinction.
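The rule ordering above can be sketched as a single decision function. This is an illustrative sketch, not the library's implementation: the message shape, option names beyond those stated in the text, and the elided middle checks are assumptions.

```typescript
// Hypothetical sketch of the classifier's rule ordering; the real
// classifyAll also runs the dedup, contradiction, importance, and T0
// checks elided here.
type Verdict = "preserved" | "code-split" | "compress";

interface Msg {
  role: string;
  content: string;
  tool_calls?: unknown[];
  index: number; // position in the conversation
}

interface Options {
  preservelist: string[]; // default: ['system']
  recencyWindow: number;  // last N messages are preserved
}

function classifyOne(msg: Msg, total: number, opts: Options): Verdict {
  if (opts.preservelist.includes(msg.role)) return "preserved";
  if (msg.index >= total - opts.recencyWindow) return "preserved";
  if (msg.tool_calls && msg.tool_calls.length > 0) return "preserved";
  if (msg.content.length < 120) return "preserved";
  if (/^\[(summary:|summary#|truncated)/.test(msg.content)) return "preserved";
  // ... importance, dedup, and contradiction checks go here ...
  const hasFence = /`{3}/.test(msg.content);
  const prose = msg.content.replace(/`{3}[^]*?`{3}/g, "").trim();
  if (hasFence) return prose.length >= 80 ? "code-split" : "preserved";
  try { JSON.parse(msg.content); return "preserved"; } catch { /* not JSON */ }
  return "compress";
}
```

Because the rules run in a fixed order, a short message with tool calls is preserved by the `tool_calls` rule before the length rule is ever consulted.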
Before compression, messages are scanned for duplicates. See Deduplication for full details.
- Exact dedup (default: on) - djb2 hash grouping, full string comparison
- Fuzzy dedup (opt-in) - fingerprint bucketing + line-level Jaccard similarity
Duplicates are replaced with compact references like [cce:dup of msg_42 - 1234 chars].
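A minimal sketch of the exact-dedup pass, assuming the djb2 hashing and full-comparison confirmation described above; the data shapes are illustrative:

```typescript
// djb2: h = h * 33 + c, kept in uint32 range.
function djb2(s: string): number {
  let h = 5381;
  for (let i = 0; i < s.length; i++) {
    h = ((h << 5) + h + s.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Group candidate messages by hash, then confirm with a full string
// comparison so a hash collision can never produce a false duplicate.
function findExactDuplicates(contents: string[]): Map<number, number> {
  const buckets = new Map<number, number[]>(); // hash -> kept indices
  const dupOf = new Map<number, number>();     // dup index -> kept index
  contents.forEach((c, i) => {
    const h = djb2(c);
    const bucket = buckets.get(h) ?? [];
    const match = bucket.find((j) => contents[j] === c);
    if (match !== undefined) dupOf.set(i, match);
    else { bucket.push(i); buckets.set(h, bucket); }
  });
  return dupOf;
}
```

The hash only narrows the candidate set; the string comparison is what makes the result exact.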
Non-preserved, non-dedup messages with the same role are collected into groups. This merges consecutive messages from the same speaker before summarization, producing tighter summaries.
The `collectGroup` function walks forward from the current position, collecting messages that are:
- Not preserved
- Not code-split
- Not dedup-annotated
- Same role as the first message in the group
Each group (or standalone message) goes through summarization.
Before summarizing, the engine checks if content looks like structured tool output (grep results, test output, status lines). Content is classified as structured when:
- 6+ non-empty lines
- Newline density > 1/80
- More than 50% of lines match structural patterns (file:line references, bullet points, key-value pairs, PASS/FAIL status words)
Structured output gets a specialized summarizer (`summarizeStructured`) that extracts file paths and status lines rather than trying to summarize prose.
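The three criteria above can be combined into one predicate. This is a hedged sketch: the structural patterns shown approximate the ones listed in the text, and the library's actual regexes may differ.

```typescript
// Heuristic: content is "structured" when it has 6+ non-empty lines,
// newline density above 1/80, and more than half of its lines matching
// structural patterns (file:line refs, bullets, key-value pairs, status words).
function looksStructured(text: string): boolean {
  const lines = text.split("\n").filter((l) => l.trim().length > 0);
  if (lines.length < 6) return false;
  const newlineDensity = (text.split("\n").length - 1) / text.length;
  if (newlineDensity <= 1 / 80) return false;
  const structural =
    /^(\S+:\d+|[-*•]\s|\s*\w[\w .-]*:\s|.*\b(PASS|FAIL|OK|ERROR)\b)/;
  const hits = lines.filter((l) => structural.test(l)).length;
  return hits / lines.length > 0.5;
}
```

Grep output and test logs pass all three checks; a paragraph of prose fails the line-count or pattern-ratio check and falls through to the sentence-scoring summarizer.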
The `summarize` function uses sentence scoring:
- Split text into paragraphs, then sentences
- Score each sentence with `scoreSentence`:
  - +3 per camelCase identifier (e.g., `myFunction`)
  - +3 per PascalCase identifier (e.g., `WebSocket`)
  - +3 per snake_case identifier (e.g., `my_var`)
  - +4 for emphasis phrases (`importantly`, `however`, `critical`, `must`, etc.)
  - +2 per number with units (`10 seconds`, `500 MB`, etc.)
  - +2 per vowelless abbreviation (3+ consonants, e.g., `npm`, `ssh`)
  - +3 per status word (`PASS`, `FAIL`, `ERROR`, `WARNING`, `WARN`)
  - +2 per grep-style reference (`src/foo.ts:42:`)
  - +2 for optimal length (40-120 chars)
  - -10 for filler starters (`great`, `sure`, `ok`, `thanks`, etc.)
- Mark the highest-scored sentence per paragraph as "primary"
- Greedy budget packing: primary sentences first (by score), then secondary
- Re-sort selected sentences by original position to preserve reading order
- Join with a `...` separator
Budget scales adaptively: max(200, min(round(length × 0.3), 600)). Short content gets 200 chars, long content up to 600.
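The budget formula transcribes directly:

```typescript
// max(200, min(round(length * 0.3), 600)): 30% of the input length,
// floored at 200 chars and capped at 600.
function adaptiveBudget(contentLength: number): number {
  return Math.max(200, Math.min(Math.round(contentLength * 0.3), 600));
}
```

Anything under ~667 chars gets the 200-char floor, and anything over 2000 chars hits the 600-char cap; in between, the budget is 30% of the input length.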
After summarizing, `extractEntities` pulls out key identifiers from the original text:
- Proper nouns (excluding common sentence starters)
- PascalCase, camelCase, snake_case identifiers
- Vowelless abbreviations
- Numbers with units/context
Entities scale with content length (3–15) and are appended as `| entities: foo, bar, baz`.
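An illustrative sketch of entity extraction; the regexes approximate the identifier classes listed above, and the length-scaling formula here is an assumption (the text only states the 3–15 range):

```typescript
// Collect identifiers in the categories described above, dedupe them,
// and cap the list based on content length (assumed: ~1 entity per 200
// chars, clamped to 3..15).
function extractEntities(text: string): string[] {
  const patterns = [
    /\b[a-z]+[A-Z][A-Za-z0-9]*\b/g,          // camelCase
    /\b[A-Z][a-z0-9]+[A-Z][A-Za-z0-9]*\b/g,  // PascalCase
    /\b[a-z0-9]+_[a-z0-9_]+\b/g,             // snake_case
    /\b[bcdfghj-np-tv-z]{3,}\b/g,            // vowelless abbreviations (npm, ssh)
  ];
  const found = new Set<string>();
  for (const p of patterns) for (const m of text.match(p) ?? []) found.add(m);
  const cap = Math.max(3, Math.min(Math.round(text.length / 200), 15));
  return [...found].slice(0, cap);
}
```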
Messages containing code fences with significant prose (>= 80 chars) get split:
- `splitCodeAndProse` extracts code fences and surrounding prose separately
- Prose is summarized (budget scales adaptively with prose length)
- Code fences are preserved verbatim
- Result: ```` [summary: ...]\n\n```code here``` ````
If the code-split result is longer than the original, the message is preserved as-is.
When a summarizer is provided, the async path uses `withFallback`:
- Call the user's summarizer
- Accept the result only if it's a non-empty string and strictly shorter than the input
- If the summarizer throws or returns longer text, fall back to the deterministic `summarize`
This three-level fallback (LLM -> deterministic -> size guard) ensures compression never makes output worse.
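A sketch of the acceptance check, with a stand-in deterministic fallback (the real `summarize` does sentence scoring; simple truncation is assumed here for brevity):

```typescript
type Summarizer = (text: string) => Promise<string>;

// Stand-in for the deterministic summarizer: truncate to budget.
const deterministicSummarize = (text: string, budget: number): string =>
  text.length <= budget ? text : text.slice(0, budget - 3) + "...";

async function withFallback(
  userSummarizer: Summarizer | undefined,
  text: string,
  budget: number,
): Promise<string> {
  if (userSummarizer) {
    try {
      const out = await userSummarizer(text);
      // Accept only a non-empty string strictly shorter than the input.
      if (typeof out === "string" && out.length > 0 && out.length < text.length) {
        return out;
      }
    } catch {
      // LLM failed: fall through to the deterministic path.
    }
  }
  return deterministicSummarize(text, budget);
}
```

Note that a summarizer returning the empty string, a longer string, or throwing all land on the same deterministic path, so a misbehaving LLM can never degrade the output.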
After summarization, the engine checks if the summary (with formatting, entities, merge count) is shorter than the original. If it isn't, the original message is preserved unchanged.
This check happens for:
- Single compressed messages
- Merged groups
- Code-split messages
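The guard itself is a one-line comparison on the fully formatted replacement; this sketch assumes the formatted string (summary plus merge count and entity suffix) is already built:

```typescript
// Keep the compressed form only when the complete formatted replacement
// is strictly shorter than the original message content.
function applySizeGuard(original: string, formattedSummary: string): string {
  return formattedSummary.length < original.length ? formattedSummary : original;
}
```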
[summary: {text}{merge_suffix}{entity_suffix}]
- `{text}` - the summary text
- `{merge_suffix}` - `(N messages merged)` when multiple messages were combined
- `{entity_suffix}` - `| entities: foo, bar, baz` (omitted for code-split messages)
With `embedSummaryId: true`:
[summary#{cce_sum_abc123}: {text}{merge_suffix}{entity_suffix}]
[cce:dup of {keepTargetId} — {contentLength} chars]
[cce:near-dup of {keepTargetId} — {contentLength} chars, ~{similarity}% match]
When `contradictionDetection: true`, messages superseded by a later correction are rewritten as:
[cce:superseded by {correctionMessageId} ({signal}) — {summaryText}]
If the full format doesn't fit, it falls back to a compact form:
[cce:superseded by {correctionMessageId} — {signal}]
[truncated — {contentLength} chars: {first 512 chars}]
- Preservation rules - classification details
- Deduplication - exact and fuzzy dedup algorithms
- Token budget - budget-driven compression with binary search
- LLM integration - summarizer setup
- Provenance - metadata attached to compressed messages
- API reference - full signatures and types