
API Reference

Back to README | All docs

Complete reference for all exports from context-compression-engine.

Exports

// Primary
export { compress, defaultTokenCounter } from './compress.js';
export { uncompress } from './expand.js';
export type { StoreLookup } from './expand.js';

// Helpers (LLM integration)
export { createSummarizer, createEscalatingSummarizer } from './summarizer.js';

// Types
export type {
  CompressOptions,
  CompressResult,
  CreateSummarizerOptions,
  Message,
  Summarizer,
  UncompressOptions,
  UncompressResult,
  VerbatimMap,
} from './types.js';

compress

Deterministic compression by default. Returns a Promise when a summarizer is provided.

Signatures

function compress(messages: Message[], options?: CompressOptions): CompressResult;
function compress(
  messages: Message[],
  options: CompressOptions & { summarizer: Summarizer },
): Promise<CompressResult>;

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| messages | Message[] | Messages to compress |
| options | CompressOptions | Compression options (see below) |

CompressOptions

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| preserve | string[] | ['system'] | Roles to never compress |
| recencyWindow | number | 4 | Protect the last N messages from compression |
| sourceVersion | number | 0 | Version tag for provenance tracking |
| summarizer | Summarizer | - | LLM-powered summarizer. When provided, compress() returns a Promise. See LLM integration |
| tokenBudget | number | - | Target token count. Binary-searches recencyWindow to fit. See Token budget |
| minRecencyWindow | number | 0 | Floor for recencyWindow when using tokenBudget |
| dedup | boolean | true | Replace earlier exact-duplicate messages with a compact reference. See Deduplication |
| fuzzyDedup | boolean | false | Detect near-duplicate messages using line-level similarity. See Deduplication |
| fuzzyThreshold | number | 0.85 | Similarity threshold for fuzzy dedup (0-1) |
| embedSummaryId | boolean | false | Embed summary_id in compressed content for downstream reference. See Provenance |
| forceConverge | boolean | false | Hard-truncate non-recency messages when binary search bottoms out. See Token budget |
| tokenCounter | (msg: Message) => number | defaultTokenCounter | Custom token counter per message. See Token budget |

CompressResult

| Field | Type | Description |
| --- | --- | --- |
| messages | Message[] | Compressed message array |
| verbatim | VerbatimMap | Original messages keyed by ID. Must be persisted atomically with messages |
| compression.original_version | number | Mirrors sourceVersion |
| compression.ratio | number | Character-based compression ratio. >1 means savings |
| compression.token_ratio | number | Token-based compression ratio. >1 means savings |
| compression.messages_compressed | number | Messages that were compressed |
| compression.messages_preserved | number | Messages kept as-is |
| compression.messages_deduped | number \| undefined | Exact duplicates replaced (when dedup: true) |
| compression.messages_fuzzy_deduped | number \| undefined | Near-duplicates replaced (when fuzzyDedup: true) |
| fits | boolean \| undefined | Whether the result fits within tokenBudget. Present when tokenBudget is set |
| tokenCount | number \| undefined | Estimated token count. Present when tokenBudget is set |
| recencyWindow | number \| undefined | The recencyWindow the binary search settled on. Present when tokenBudget is set |

Example

import { compress } from 'context-compression-engine';

// Sync
const syncResult = compress(messages, {
  preserve: ['system'],
  recencyWindow: 4,
  sourceVersion: 1,
});

// Async (with LLM summarizer)
const asyncResult = await compress(messages, {
  summarizer: async (text) => myLlm.summarize(text),
});
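
A token budget can drive the same call. The sketch below is illustrative: the 8000-token budget is an arbitrary figure, and the fits, tokenCount, and recencyWindow fields only appear because tokenBudget is set.

// Token budget (sync without a summarizer; binary-searches recencyWindow)
const budgeted = compress(messages, {
  tokenBudget: 8000,
  minRecencyWindow: 2,
  forceConverge: true,
});

if (!budgeted.fits) {
  console.warn(`Still over budget: ~${budgeted.tokenCount} tokens at recencyWindow ${budgeted.recencyWindow}`);
}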

uncompress

Restore originals from the verbatim store. Always synchronous. See Round-trip for full details.

Signature

function uncompress(
  messages: Message[],
  store: StoreLookup,
  options?: UncompressOptions,
): UncompressResult;

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| messages | Message[] | Compressed messages to expand |
| store | StoreLookup | VerbatimMap object or (id: string) => Message \| undefined function |
| options | UncompressOptions | Expansion options (see below) |

UncompressOptions

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| recursive | boolean | false | Recursively expand messages whose originals are also compressed (up to 10 levels) |

UncompressResult

| Field | Type | Description |
| --- | --- | --- |
| messages | Message[] | Expanded messages |
| messages_expanded | number | How many compressed messages were restored |
| messages_passthrough | number | How many messages passed through unchanged |
| missing_ids | string[] | IDs looked up but not found. Non-empty means data loss |

Example

import { uncompress } from 'context-compression-engine';

const { messages, missing_ids } = uncompress(compressed, verbatim);

// Recursive expansion
const deep = uncompress(compressed, verbatim, { recursive: true });

// Function store (database-backed)
const result = uncompress(compressed, (id) => db.getMessageById(id));
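
Because a non-empty missing_ids signals data loss, it is worth checking before trusting the expanded history. A minimal sketch; the error handling is illustrative and not part of the library:

const expanded = uncompress(compressed, (id) => db.getMessageById(id));
if (expanded.missing_ids.length > 0) {
  // Some originals were not found in the store; decide how to handle the loss.
  throw new Error(`Missing verbatim entries: ${expanded.missing_ids.join(', ')}`);
}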

defaultTokenCounter

The built-in token estimator used when no custom tokenCounter is provided.

Signature

function defaultTokenCounter(msg: Message): number;

Formula

Math.ceil(msg.content.length / 3.5);

The 3.5 chars/token ratio is a typical empirical average for GPT-family BPE tokenizers (cl100k_base, o200k_base) on mixed English text. It sits toward the lower end of the observed ~3.2–4.5 range on purpose: dividing by fewer characters per token yields higher token estimates, so budget calculations stay conservative and over-counting is favored over under-counting. For accurate budgeting, replace it with a real tokenizer via the tokenCounter option. See Token budget.
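
Example

A custom tokenCounter only has to map a Message to a number, so a real tokenizer slots in directly. A minimal sketch, assuming a countWithRealTokenizer helper that wraps whatever tokenizer library you use; it is not part of this package.

import { compress, type Message } from 'context-compression-engine';

// Hypothetical wrapper around a real tokenizer (e.g. a BPE encoder); not provided by this package.
declare function countWithRealTokenizer(text: string): number;

// content is optional on Message, so fall back to an empty string.
const tokenCounter = (msg: Message): number => countWithRealTokenizer(msg.content ?? '');

const result = compress(messages, { tokenBudget: 8000, tokenCounter });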


createSummarizer

Creates an LLM-powered summarizer with an optimized prompt template. See LLM integration for provider examples.

Signature

function createSummarizer(
  callLlm: (prompt: string) => string | Promise<string>,
  options?: CreateSummarizerOptions,
): Summarizer;

CreateSummarizerOptions

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| maxResponseTokens | number | 300 | Hint for maximum tokens in the LLM response |
| systemPrompt | string | - | Domain-specific instructions prepended to the built-in rules |
| mode | 'normal' \| 'aggressive' | 'normal' | 'aggressive' produces terse bullet points at half the token budget |
| preserveTerms | string[] | - | Domain-specific terms appended to the built-in preserve list |

Built-in preserve list

The prompt always preserves: code references, file paths, function/variable names, URLs, API keys, error messages, numbers, and technical decisions. Add domain terms via preserveTerms.

Example

import { createSummarizer, compress } from 'context-compression-engine';

const summarizer = createSummarizer(async (prompt) => myLlm.complete(prompt), {
  maxResponseTokens: 300,
  systemPrompt: 'This is a legal contract. Preserve all clause numbers.',
  preserveTerms: ['clause numbers', 'party names'],
});

const result = await compress(messages, { summarizer });

createEscalatingSummarizer

Three-level escalation summarizer. See LLM integration and Compression pipeline for how the fallback chain works.

Signature

function createEscalatingSummarizer(
  callLlm: (prompt: string) => string | Promise<string>,
  options?: Omit<CreateSummarizerOptions, 'mode'>,
): Summarizer;

Escalation levels

  1. Level 1: Normal - concise prose summary via the LLM
  2. Level 2: Aggressive - terse bullet points at half the token budget (used when Level 1 fails or returns output longer than its input)
  3. Level 3: Deterministic - sentence extraction fallback via the compression pipeline (handled by withFallback in compress)

Options

Same as CreateSummarizerOptions but without mode (managed internally).

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| maxResponseTokens | number | 300 | Hint for maximum tokens in the LLM response |
| systemPrompt | string | - | Domain-specific instructions prepended to the built-in rules |
| preserveTerms | string[] | - | Domain-specific terms appended to the built-in preserve list |
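
Example

Usage mirrors createSummarizer. A minimal sketch; myLlm stands in for your own LLM client and the preserveTerms value is illustrative.

import { createEscalatingSummarizer, compress } from 'context-compression-engine';

const summarizer = createEscalatingSummarizer(async (prompt) => myLlm.complete(prompt), {
  maxResponseTokens: 300,
  preserveTerms: ['ticket IDs'],
});

const result = await compress(messages, { summarizer });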

Types

Message

type Message = {
  id: string;
  index: number;
  role?: string;
  content?: string;
  metadata?: Record<string, unknown>;
  tool_calls?: unknown[];
  [key: string]: unknown;
};

Summarizer

type Summarizer = (text: string) => string | Promise<string>;

VerbatimMap

type VerbatimMap = Record<string, Message>;
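
Because the verbatim map must be persisted atomically with the compressed messages, a common pattern is to write both in a single operation. A minimal sketch; history, conversationId, and saveSnapshot are hypothetical names for your own data and storage layer.

const { messages: compressed, verbatim } = compress(history);

// Hypothetical single atomic write; replace with your own storage layer.
await saveSnapshot(conversationId, JSON.stringify({ compressed, verbatim }));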

StoreLookup

type StoreLookup = VerbatimMap | ((id: string) => Message | undefined);

See also