Compression Pipeline

How compress() processes messages from start to finish.

Pipeline overview

messages
  |
  v
classify ──> dedup ──> merge consecutive ──> summarize ──> size guard
  |            |              |                  |              |
  |            |              |                  |              v
  |            |              |                  |         preserve original
  |            |              |                  |         if summary >= original
  |            |              |                  v
  |            |              |            LLM or deterministic
  |            |              v
  |            |         same-role groups
  |            v
  |       exact + fuzzy
  v
 T0/T2/T3 + preservation rules

1. Classification

Every message is evaluated against the preservation rules in order, and the first rule that matches decides its path. Messages that match none of the preservation rules are eligible for compression.

The classifier (classifyAll) applies rules in this order:

Role in preserve list (default: ['system']) -> preserved
Within recencyWindow -> preserved
Has tool_calls -> preserved
Content < 120 chars -> preserved
Already compressed ([summary:, [summary#, or [truncated prefix) -> preserved
High importance score (when importanceScoring: true, score >= importanceThreshold) -> preserved
Marked as duplicate by dedup analysis -> dedup path
Superseded by a later correction (when contradictionDetection: true) -> contradiction path
Contains code fences with >= 80 chars of prose -> code-split path
Has code fences with < 80 chars prose -> preserved
Classified as hard T0 (code, JSON, SQL, API keys, etc.) -> preserved
Valid JSON -> preserved
Everything else -> compress

See Preservation rules for classification tiers and the hard vs. soft T0 distinction.
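
The ordered, first-match nature of the rule chain can be sketched as follows. This is a simplified illustration, not the actual classifyAll: the message shape, the option object, and the reading of recencyWindow as "the last N messages" are assumptions, and the importance, dedup, contradiction, code-split, and hard T0 rules are omitted.

```typescript
// Hypothetical message shape; the real library's types may differ.
interface ChatMessage {
  id: string;
  role: string;
  content: string;
  tool_calls?: unknown[];
}

type Verdict = "preserved" | "compress";

// Hypothetical option bag using the option names mentioned in this document.
interface ClassifyOptions {
  preserve: string[]; // roles that are always kept
  recencyWindow: number; // assumed: number of trailing messages kept verbatim
}

const COMPRESSED_PREFIXES = ["[summary:", "[summary#", "[truncated"];

function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Simplified rule chain: the first matching rule decides the outcome.
function classifyOne(
  msg: ChatMessage,
  index: number,
  total: number,
  opts: ClassifyOptions,
): Verdict {
  if (opts.preserve.includes(msg.role)) return "preserved";
  if (index >= total - opts.recencyWindow) return "preserved";
  if (msg.tool_calls && msg.tool_calls.length > 0) return "preserved";
  if (msg.content.length < 120) return "preserved";
  if (COMPRESSED_PREFIXES.some((p) => msg.content.startsWith(p))) return "preserved";
  if (isValidJson(msg.content.trim())) return "preserved";
  return "compress";
}
```

Because the rules run in order, a short system message is preserved by the role rule and never reaches the length check.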

2. Deduplication

Before compression, messages are scanned for duplicates. See Deduplication for full details.

Exact dedup (default: on) - djb2 hash grouping, full string comparison
Fuzzy dedup (opt-in) - fingerprint bucketing + line-level Jaccard similarity

Duplicates are replaced with compact references like [cce:dup of msg_42 - 1234 chars].
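
A minimal sketch of the two comparisons named above, assuming a hypothetical message shape. The djb2 function is the classic hash; findExactDuplicates and lineJaccard are illustrative names, the choice to keep the first occurrence is an assumption, and the fingerprint bucketing step of fuzzy dedup is left out.

```typescript
// Hypothetical minimal message shape.
interface DedupMessage {
  id: string;
  content: string;
}

// Classic djb2 string hash (hash * 33 + charCode), kept in unsigned 32-bit range.
function djb2(text: string): number {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Maps each duplicate's id to the id of the message it duplicates.
// Assumption: the first occurrence is the one that is kept.
function findExactDuplicates(messages: DedupMessage[]): Map<string, string> {
  const buckets = new Map<number, DedupMessage[]>();
  const duplicates = new Map<string, string>();
  for (const msg of messages) {
    const key = djb2(msg.content);
    let bucket = buckets.get(key);
    if (!bucket) {
      bucket = [];
      buckets.set(key, bucket);
    }
    // Hashes can collide, so confirm with a full string comparison.
    const original = bucket.find((m) => m.content === msg.content);
    if (original) {
      duplicates.set(msg.id, original.id);
    } else {
      bucket.push(msg);
    }
  }
  return duplicates;
}

// Line-level Jaccard similarity: shared distinct lines / all distinct lines.
function lineJaccard(a: string, b: string): number {
  const linesA = new Set(a.split("\n"));
  const linesB = new Set(b.split("\n"));
  let shared = 0;
  linesA.forEach((line) => {
    if (linesB.has(line)) shared++;
  });
  const union = linesA.size + linesB.size - shared;
  return union === 0 ? 1 : shared / union;
}
```

Hash grouping keeps the scan cheap, and the string comparison inside each bucket rules out false positives from hash collisions.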

3. Merge consecutive

Consecutive messages with the same role that are neither preserved nor dedup-annotated are collected into groups. Merging a speaker's consecutive messages before summarization produces tighter summaries than summarizing each message separately.

The collectGroup function walks forward from the current position, collecting messages that are:

Not preserved
Not code-split
Not dedup-annotated
Same role as the first message in the group
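
The forward walk can be sketched like this. The item shape with its three boolean flags and the returned `next` index are assumptions made for illustration; the real collectGroup may track this state differently.

```typescript
// Hypothetical per-message record carrying the classification outcome.
interface GroupCandidate {
  role: string;
  content: string;
  preserved: boolean;
  codeSplit: boolean;
  dedupAnnotated: boolean;
}

function isMergeable(m: GroupCandidate): boolean {
  return !m.preserved && !m.codeSplit && !m.dedupAnnotated;
}

// Walks forward from `start`, collecting mergeable messages that share the
// role of the first one. `next` is the index just past the group.
function collectGroupSketch(
  items: GroupCandidate[],
  start: number,
): { group: GroupCandidate[]; next: number } {
  const group: GroupCandidate[] = [];
  if (start >= items.length || !isMergeable(items[start])) {
    return { group, next: start };
  }
  const role = items[start].role;
  let i = start;
  while (i < items.length && isMergeable(items[i]) && items[i].role === role) {
    group.push(items[i]);
    i++;
  }
  return { group, next: i };
}
```

A group ends at the first message that changes role or is excluded from merging, so preserved messages always act as boundaries.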

4. Summarize

Each group (or standalone message) goes through summarization.

Structured output detection

Before summarizing, the engine checks if content looks like structured tool output (grep results, test output, status lines). Content is classified as structured when:

6+ non-empty lines
Newline density > 1/80 (more than one newline per 80 characters)
More than 50% of lines match structural patterns (file:line references, bullet points, key-value pairs, PASS/FAIL status words)

Structured output gets a specialized summarizer (summarizeStructured) that extracts file paths and status lines rather than trying to summarize prose.
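
The three conditions can be combined roughly as below. The regular expressions are illustrative stand-ins, since the engine's actual structural patterns are not listed here, and looksStructured is a hypothetical name.

```typescript
// Illustrative patterns only; the engine's real patterns are not shown in this document.
const STRUCTURAL_PATTERNS: RegExp[] = [
  /^\S+:\d+(:|\b)/, // file:line references such as src/foo.ts:42:
  /^\s*[-*]\s+/, // bullet points
  /^\s*[\w.-]+\s*[:=]\s*\S+/, // key-value pairs
  /\b(PASS|FAIL|ERROR|WARNING|WARN)\b/, // status words
];

function looksStructured(content: string): boolean {
  const lines = content.split("\n").filter((l) => l.trim().length > 0);
  // Condition 1: at least 6 non-empty lines.
  if (lines.length < 6) return false;
  // Condition 2: more than one newline per 80 characters.
  const newlines = content.split("\n").length - 1;
  if (newlines / content.length <= 1 / 80) return false;
  // Condition 3: more than half of the lines match a structural pattern.
  const structural = lines.filter((l) =>
    STRUCTURAL_PATTERNS.some((p) => p.test(l)),
  ).length;
  return structural / lines.length > 0.5;
}
```

The cheap line-count and density checks run first so that ordinary prose is rejected before any pattern matching.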

Deterministic summarization

The summarize function uses sentence scoring:

Split text into paragraphs, then sentences
Score each sentence with scoreSentence:
- +3 per camelCase identifier (e.g., myFunction)
- +3 per PascalCase identifier (e.g., WebSocket)
- +3 per snake_case identifier (e.g., my_var)
- +4 for emphasis phrases (importantly, however, critical, must, etc.)
- +2 per number with units (10 seconds, 500 MB, etc.)
- +2 per vowelless abbreviation (3+ consonants, e.g., npm, ssh)
- +3 per status word (PASS, FAIL, ERROR, WARNING, WARN)
- +2 per grep-style reference (src/foo.ts:42:)
- +2 for optimal length (40-120 chars)
- -10 for filler starters (great, sure, ok, thanks, etc.)
Mark the highest-scored sentence per paragraph as "primary"
Greedy budget packing: primary sentences first (by score), then secondary
Re-sort selected sentences by original position to preserve reading order
Join with ... separator
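
To make the scoring concrete, here is a partial sketch of scoreSentence covering a subset of the weights listed above (camelCase, snake_case, status words, emphasis, optimal length, filler starters). The regular expressions and word lists are assumptions; the remaining rules are omitted for brevity.

```typescript
// Assumed word lists, taken from the examples given in the list above.
const FILLER_STARTERS = /^(great|sure|ok|thanks)\b/i;
const EMPHASIS = /\b(importantly|however|critical|must)\b/i;

function countMatches(text: string, pattern: RegExp): number {
  const found = text.match(pattern);
  return found ? found.length : 0;
}

function scoreSentenceSketch(sentence: string): number {
  let score = 0;
  // +3 per camelCase identifier, e.g. myFunction
  score += 3 * countMatches(sentence, /\b[a-z]+(?:[A-Z][a-z0-9]+)+\b/g);
  // +3 per snake_case identifier, e.g. my_var
  score += 3 * countMatches(sentence, /\b[a-z]+(?:_[a-z0-9]+)+\b/g);
  // +3 per status word
  score += 3 * countMatches(sentence, /\b(?:PASS|FAIL|ERROR|WARNING|WARN)\b/g);
  // +4 once if the sentence contains an emphasis phrase
  if (EMPHASIS.test(sentence)) score += 4;
  // +2 for optimal length (40-120 chars)
  if (sentence.length >= 40 && sentence.length <= 120) score += 2;
  // -10 for filler starters
  if (FILLER_STARTERS.test(sentence.trim())) score -= 10;
  return score;
}
```

With these weights, a sentence dense in identifiers and status words easily outranks a conversational opener such as "Sure, sounds good.", which scores -10.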

Budget scales adaptively: max(200, min(round(length × 0.3), 600)). Short content gets 200 chars, long content up to 600.
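
The budget formula and the greedy packing step can be sketched as follows. The ScoredSentence shape, the " ... " spacing of the separator, and the choice to ignore separator length when counting against the budget are assumptions.

```typescript
// Budget formula from the text: max(200, min(round(length * 0.3), 600)).
function summaryBudget(length: number): number {
  return Math.max(200, Math.min(Math.round(length * 0.3), 600));
}

// Hypothetical pre-scored sentence record.
interface ScoredSentence {
  text: string;
  score: number;
  position: number; // index of the sentence in the original text
  primary: boolean; // highest-scored sentence of its paragraph
}

function packSentences(sentences: ScoredSentence[], budget: number): string {
  // Primary sentences come first, then secondary, each by descending score.
  const ordered = sentences
    .slice()
    .sort((a, b) => Number(b.primary) - Number(a.primary) || b.score - a.score);
  const chosen: ScoredSentence[] = [];
  let used = 0;
  for (const s of ordered) {
    // Greedy: take every sentence that still fits in the remaining budget.
    if (used + s.text.length <= budget) {
      chosen.push(s);
      used += s.text.length;
    }
  }
  // Restore original reading order before joining.
  chosen.sort((a, b) => a.position - b.position);
  return chosen.map((s) => s.text).join(" ... ");
}
```

For example, summaryBudget returns 200 for a 100-character input, 300 for 1000 characters, and 600 for 5000 characters.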

Entity extraction

After summarizing, extractEntities pulls out key identifiers from the original text:

Proper nouns (excluding common sentence starters)
PascalCase, camelCase, snake_case identifiers
Vowelless abbreviations
Numbers with units/context

The number of entities kept scales with content length (3 to 15), and they are appended to the summary as | entities: foo, bar, baz.
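
A rough sketch of identifier extraction and the resulting suffix, covering only camelCase, PascalCase, and snake_case. The function names, the regular expression, and the leading space in the suffix are assumptions, and the entity limit is passed in as a parameter because the exact scaling rule is not given here.

```typescript
// Matches camelCase, PascalCase (two or more capitalized parts), and snake_case.
const IDENTIFIER_PATTERN =
  /\b(?:[a-z]+(?:[A-Z][a-z0-9]+)+|(?:[A-Z][a-z0-9]+){2,}|[a-z]+(?:_[a-z0-9]+)+)\b/g;

// Returns distinct identifiers in order of first appearance, up to `limit`.
function extractIdentifiers(text: string, limit: number): string[] {
  const seen = new Set<string>();
  const found = text.match(IDENTIFIER_PATTERN) || [];
  for (const match of found) {
    if (seen.size < limit) seen.add(match);
  }
  return Array.from(seen);
}

// Builds the suffix appended to a summary; empty when there are no entities.
function entitySuffix(entities: string[]): string {
  return entities.length > 0 ? ` | entities: ${entities.join(", ")}` : "";
}
```

Extraction runs on the original text rather than the summary, so identifiers dropped by sentence selection can still appear in the suffix.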

Code-split processing

Messages containing code fences with significant prose (>= 80 chars) get split:

splitCodeAndProse extracts code fences and surrounding prose separately
Prose is summarized (budget scales adaptively with prose length)
Code fences are preserved verbatim
Result: `[summary: ...]`, then a blank line, then the original code fences

If the code-split result is not shorter than the original, the message is preserved as-is.
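
A simplified sketch of the split-and-reassemble flow. The fence-matching regular expression, the function names, and the way multiple fences are joined are assumptions; the prose summarizer is passed in so the example stays self-contained.

```typescript
// Built at runtime so this example contains no literal fence marker.
const FENCE = "`".repeat(3);
const FENCE_PATTERN = new RegExp(FENCE + "[\\s\\S]*?" + FENCE, "g");

interface SplitResult {
  prose: string;
  fences: string[];
}

// Separates fenced code blocks from the surrounding prose.
function splitCodeAndProseSketch(content: string): SplitResult {
  const fences = (content.match(FENCE_PATTERN) || []).slice();
  const prose = content
    .replace(FENCE_PATTERN, "\n")
    .replace(/\n{2,}/g, "\n")
    .trim();
  return { prose, fences };
}

function codeSplitCompress(
  content: string,
  summarizeProse: (prose: string) => string,
): string {
  const { prose, fences } = splitCodeAndProseSketch(content);
  // Summarized prose first, then the code fences verbatim.
  const result = `[summary: ${summarizeProse(prose)}]\n\n${fences.join("\n\n")}`;
  // Keep the original when splitting does not actually save space.
  return result.length < content.length ? result : content;
}
```

Only the prose is ever rewritten; the fenced code comes back byte for byte.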

LLM summarization (async path)

When a summarizer is provided, the async path uses withFallback:

Call the user's summarizer
Accept the result only if it's a non-empty string and strictly shorter than the input
If the summarizer throws, returns an empty or non-string value, or returns text that is not strictly shorter than the input, fall back to deterministic summarize

This three-level fallback (LLM -> deterministic -> size guard) ensures that compression never produces output longer than the original.
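
The acceptance check and fallback can be sketched like this. The signature shown is an assumption (the real withFallback may take different arguments); the deterministic summarizer is injected so the sketch is self-contained.

```typescript
// Assumed summarizer signature: may be sync or async.
type Summarizer = (text: string) => Promise<string> | string;

async function withFallbackSketch(
  text: string,
  summarizer: Summarizer,
  deterministic: (text: string) => string,
): Promise<string> {
  try {
    const result: unknown = await summarizer(text);
    // Accept only a non-empty string that is strictly shorter than the input.
    if (
      typeof result === "string" &&
      result.length > 0 &&
      result.length < text.length
    ) {
      return result;
    }
  } catch {
    // A throwing summarizer is treated like a rejected result.
  }
  return deterministic(text);
}
```

Treating exceptions and unusable results the same way means a failing LLM call degrades the summary quality but never interrupts compression.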

5. Size guard

After summarization, the engine checks if the summary (with formatting, entities, merge count) is shorter than the original. If it isn't, the original message is preserved unchanged.

This check happens for:

Single compressed messages
Merged groups
Code-split messages
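
A minimal sketch of the guard, using the summary format described under Output format. The function names and the exact spacing of the suffixes are assumptions.

```typescript
// Builds the final summary string including merge count and entities.
function formatSummary(
  text: string,
  mergedCount: number,
  entities: string[],
): string {
  const mergeSuffix = mergedCount > 1 ? ` (${mergedCount} messages merged)` : "";
  const entityPart =
    entities.length > 0 ? ` | entities: ${entities.join(", ")}` : "";
  return `[summary: ${text}${mergeSuffix}${entityPart}]`;
}

// Returns the formatted summary only when it is strictly shorter than the original.
function applySizeGuard(original: string, formatted: string): string {
  return formatted.length < original.length ? formatted : original;
}
```

The comparison uses the fully formatted string, so the brackets and suffixes count against the summary.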

Output format

Summary format

[summary: {text}{merge_suffix}{entity_suffix}]

{text} - the summary text
{merge_suffix} - (N messages merged) when multiple messages were combined
{entity_suffix} - | entities: foo, bar, baz (omitted for code-split messages)

With embedSummaryId: true:

[summary#{cce_sum_abc123}: {text}{merge_suffix}{entity_suffix}]

Dedup format

[cce:dup of {keepTargetId} — {contentLength} chars]
[cce:near-dup of {keepTargetId} — {contentLength} chars, ~{similarity}% match]
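
As an illustration, the two reference strings could be produced by small formatters like these. The function names are hypothetical, and passing similarity as a fraction between 0 and 1 is an assumption.

```typescript
// The dash character used in the two formats above, written as an escape.
const DASH = "\u2014";

function formatExactDup(keepTargetId: string, contentLength: number): string {
  return `[cce:dup of ${keepTargetId} ${DASH} ${contentLength} chars]`;
}

function formatNearDup(
  keepTargetId: string,
  contentLength: number,
  similarity: number, // assumed to be a fraction in [0, 1]
): string {
  const percent = Math.round(similarity * 100);
  return `[cce:near-dup of ${keepTargetId} ${DASH} ${contentLength} chars, ~${percent}% match]`;
}
```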

Contradiction format

When contradictionDetection: true, messages superseded by a later correction are replaced with:

[cce:superseded by {correctionMessageId} ({signal}) — {summaryText}]

If the full format doesn't fit, the engine falls back to the compact form:

[cce:superseded by {correctionMessageId} — {signal}]

Force-converge format

[truncated — {contentLength} chars: {first 512 chars}]
