ContextGuard

ContextGuard is a local-first context management toolkit for AI coding and tool agents. It starts as a Claude Code plugin: install once, apply per project, and roll back when needed.

Its guardrails trim noisy output, prefer symbol-level reads, nudge the agent after repeated failures, redact secret-like patterns, and measure usage. The same guardrails extend to other agents through local helper commands and advisory brief-mode rule snippets.

Korean documentation: README.ko.md
Static landing page: GitHub Pages (source)

TL;DR

Installation and activation are deliberately separate. Installing ContextGuard only makes local helpers or Claude plugin skills available. Configuration changes happen later through an explicit setup command.

If you use...	Install	Activate
Claude Code	`/plugin marketplace add ictechgy/context-guard` then `/plugin install context-guard@context-guard`	Run `/context-guard:setup` inside the project.
Codex CLI or any terminal-first agent	`npm install -g @ictechgy/context-guard` or one-shot `npx @ictechgy/context-guard ...`	`context-guard setup --agent codex --scope project --with-init --with-skill --plan`, then rerun with `--yes`.
Other rule-file agents	npm/npx install above	`context-guard setup --agent gemini,cursor,windsurf,cline,copilot --scope project --with-init --plan`, then apply only the agents you want.
macOS/Homebrew users	release path: `brew install ictechgy/tap/context-guard`	Same `context-guard setup ...` commands after install.

Common commands:

npm install -g @ictechgy/context-guard
npx @ictechgy/context-guard --version
context-guard doctor --root . --json              # read-only health check; no changes made
context-guard setup --agent codex --scope project --with-init --with-skill --plan
context-guard setup --agent claude --scope user --verify --json  # read-only user-scope check
context-guard setup --agent claude --scope user --plan

Project scope is the default. User-level setup is opt-in, requires an explicit agent for writes, records backups and rollback metadata, and never runs during package installation. Before applying setup, run context-guard doctor or context-guard setup --verify for a read-only health check; doctor reports suggested next commands and makes no changes. Setup resolves bundled or checkout-local helpers first and does not trust arbitrary PATH helpers unless you explicitly pass --allow-path-helper-fallback for a known-good install.

ContextGuard is intentionally conservative about savings claims. It reduces common sources of context bloat and provides benchmark tooling so you can measure before/after results on your own tasks. It does not promise a fixed token or cost reduction for every repository.

Claude Code first, other agents too

ContextGuard ships first as a Claude Code plugin, which remains the quickest way to get started. After installation, the same local-first guardrails can be reused by other AI coding and tool agents through:

Local helper commands (context-guard-*) that run as plain shell commands, independent of any specific agent.
Advisory brief-mode rule snippets that you install into an agent's own instruction file (AGENTS.md, GEMINI.md, .cursorrules, Copilot instructions, and similar rule files) and remove by deleting the marker-delimited block.
Dry-run cross-agent setup that writes only local files, backs up before changing anything, and applies only with explicit approval.

Current setup surfaces:

Agent or tool	ContextGuard surface
Claude Code	Native plugin setup for project-local hooks, deny rules, and statusline configuration.
OpenAI Codex CLI	Advisory `AGENTS.md` rule block plus optional project skill at `.agents/skills/context-guard/SKILL.md`.
Gemini CLI	Advisory `GEMINI.md` rule block.
Cursor	Advisory project-rule block, usually `.cursorrules`.
Windsurf	Advisory `.windsurf/rules/contextguard.md` rule block.
Cline	Advisory `.clinerules` rule block, with file/directory handling.
GitHub Copilot Coding Agent	Advisory `.github/copilot-instructions.md` rule block.
OpenCode, ForgeCode, or unknown agents	Manual shell-helper usage with local evidence; no automatic hooks.

How ContextGuard reduces token waste

ContextGuard does not make the model cheaper by itself. It reduces avoidable context before it reaches an AI coding agent, then gives you signals to measure whether the change helped.

Waste path	ContextGuard guardrail
Whole-file reads for one function	Suggest search, symbol slices, bounded outlines, and small line ranges before a full read.
Long test, build, search, or diff output	Trim output, emit structured digests, or store large logs locally and return compact receipts.
Repeated failing commands	Warn after repeated Bash failures so the agent changes strategy before more stale logs enter context.
Secret-like or noisy terminal output	Apply best-effort pattern-based redaction for common credential patterns and sensitive-looking paths before output is copied into context.
Unknown token/cost hotspots	Surface statusline signals, transcript audits, and matched benchmark reports for before/after evidence.
Anthropic API requests that may miss prompt cache	`context-guard cost preflight` estimates input size, breakpoint-level cache risk, and low/mid/high cost ranges before a call; default mode warns only.
Volatile context before stable prompt prefixes	Audit bounded redacted prompt-segment hashes and flag likely cache-unfriendly prompt layouts without exposing raw prompt text.
Large tool/MCP catalogs for one narrow task	Rank a local tool catalog into a bounded top-k schema report while keeping full sanitized schemas retrievable from local receipts.

How it fits with caching and compression tools

ContextGuard complements provider and semantic caches, and sits next to prompt compression. Its main job is simpler: avoid sending unnecessary files, logs, or output in the first place.

Tool category	Saves by	ContextGuard relationship
Provider prompt/context caching	Reusing stable prompt prefixes.	Complementary; ContextGuard helps keep the changing tail of context smaller and cleaner, `context-guard-audit` can flag likely volatile prefix layouts, and `context-guard cost` can warn when an Anthropic request is likely to trigger a cache write (cache creation) instead of a cache read.
Semantic response cache	Reusing answers to identical or similar requests.	Complementary; ContextGuard does not serve cached AI answers.
Prompt/context compression	Shortening text that is already selected for the model.	Adjacent; ContextGuard trims and summarizes local output, but does not promise lossless semantic compression.
Experimental planners and local runtimes	Default-off and explicit-command-only; covers local-proxy plans and gate records plus narrow local runtimes for caller-supplied context-diff, visual evidence-pack, learned-compression, and self-hosted metrics evidence.	The local proxy `record` command starts no listener and forwards no traffic; `serve local-proxy` binds and forwards only literal loopback IPs for one bounded request. No compressor/model execution, OCR/crop service, external forwarding, credential persistence, or hosted-savings claim ships without separate evidence and future PR gates.
ContextGuard	Avoiding unnecessary files, logs, repeated failures, and noisy output before they enter agent context.	Local guardrails, reversible artifacts, and measurement.

Related patterns that informed the design:

Approach	What it emphasizes	ContextGuard relationship
Compression-first	Shortening text already selected for the model, often with lossy transforms.	ContextGuard prefers local artifact storage with exact slice retrieval over lossy one-way compression, so you can get the original back.
Terse-output rulesets across agents	Installing brief-mode output rules into many agents at once.	ContextGuard offers advisory brief-mode snippets and dry-run cross-agent setup; both are opt-in per project, and no guaranteed savings are claimed.
ContextGuard	Avoiding unnecessary files, logs, and output before they enter context, with conservative measurement.	Local guardrails, reversible artifacts and retrieval, and benchmark evidence you measure yourself.

Brief mode (advisory)

Brief mode is a set of agent-neutral, advisory rule snippets that ask a coding agent to cut filler while preserving the evidence a reviewer needs: file paths, commands, command output and errors, code blocks, verification status, changed files, known gaps, and caveats. It is best-effort guidance, not enforcement, and does not guarantee any token or cost savings.

Three deterministic levels ship under plugins/context-guard/brief/: lite, standard, and ultra. Each level is a single marker-delimited block for an agent's rule/instruction file (for example AGENTS.md, CLAUDE.md, a Cursor rules file, or Copilot instructions). Manage it through setup with context-guard setup --agent codex --scope project --brief-mode standard --plan, rerun with --yes to apply, and use --brief-mode off to remove the managed block. See plugins/context-guard/brief/README.md.
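To illustrate what a marker-delimited managed block means in practice, the Python sketch below adds and removes such a block in the text of a rule file. The marker strings and function names are hypothetical. The real markers come from the files under plugins/context-guard/brief/, and context-guard setup remains the supported way to manage the block.

```python
# Hypothetical markers; the shipped brief-mode snippets define their own.
BEGIN = "<!-- context-guard:brief:begin -->"
END = "<!-- context-guard:brief:end -->"


def remove_block(text: str) -> str:
    """Delete the managed block (markers included) and leave other text untouched."""
    start = text.find(BEGIN)
    if start == -1:
        return text
    end = text.find(END, start + len(BEGIN))
    if end == -1:
        return text  # unbalanced markers: change nothing rather than guess
    end += len(END)
    if text[end:end + 1] == "\n":
        end += 1  # also drop the newline that terminated the END marker line
    return text[:start] + text[end:]


def upsert_block(text: str, body: str) -> str:
    """Replace any existing managed block with a fresh one appended at the end."""
    base = remove_block(text)
    if base and not base.endswith("\n"):
        base += "\n"
    return base + f"{BEGIN}\n{body.rstrip()}\n{END}\n"
```

Because everything the tool owns sits between the two markers, removal is a plain text deletion that restores the file's other content byte for byte.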

What to measure

When you need a savings claim, measure it on your own tasks:

full-file reads versus symbol or line-range reads
raw logs versus digest output or artifact receipts
transcript hotspots reported by context-guard-audit, including cache_friendliness prompt-layout signals and cache_layout_advice experiment priorities
statusline cache / reuse as observed transcript/provider-cache signals, not savings caused by ContextGuard
context-guard cost preflight estimates for Anthropic request JSON, followed by context-guard cost observe using provider usage fields (cache_creation_input_tokens, cache_read_input_tokens) after the call
static prompt/request cache layout checks from context-guard-cache-score; its char/4 token estimates and warnings are advisory only until provider usage fields confirm real cache hits
matched successful baseline/variant runs from context-guard-bench
large tool/MCP catalogs versus context-guard-tool-prune top-k reports plus receipt retrieval
optional experimental lanes in research/experimental-token-reduction-radar.md; fixture-only starters in docs/experimental-benchmark-fixtures.md use the same matched-task benchmark gates before any savings claim

What ContextGuard does not do

It does not guarantee a fixed token or cost reduction.
It does not send work to external AI providers to save model tokens.
It does not mutate global Claude settings during install.
It does not replace real before/after measurement when you need a savings claim.
Local RAM/disk receipts can reduce what you send next, but they do not replace Anthropic's provider prompt cache or guarantee cache hits. Recheck Anthropic prompt-caching and pricing docs before release or billing claims: https://docs.anthropic.com/en/build-with-claude/prompt-caching and https://platform.claude.com/docs/en/about-claude/pricing.
Experimental helpers are mostly dry-run checker/planner surfaces, including a design-only external-forwarding opt-in gate. Explicit local runtimes exist only for caller-supplied context-diff replacement payloads, caller-supplied visual crop/OCR evidence packs, caller-supplied learned-compression prose candidates, self-hosted metrics JSONL sidecar records, local-proxy runtime-gate JSONL records, and one-shot serve local-proxy loopback forwarding with a private ready-file nonce plus optional shifted-cost diagnostic JSONL rows for successful forwarded requests.
ContextGuard does not ship learned/synthetic compressor execution, embeddings, rerankers, model calls, generated replacement text, screenshot capture, image cropping, OCR execution, image parsing, external OCR/image services, self-hosted KV/latent inference optimization beyond explicit local metrics recording, or broader proxy forwarding beyond literal-loopback, one-request HTTP forwarding with credential material blocked.
It does not alias the old /claude-token-optimizer:* Claude Code slash-command namespace. Use /context-guard:* after installing this plugin.

Legacy local CLI wrappers (claude-token-*, claude-read-symbol, claude-trim-output, and claude-sanitize-output) still ship in bin/ so existing automation can migrate gradually.

Features

Feature	What it helps with
Claude Code plugin skills	Guided setup, optimization, and transcript usage audits.
Project-local setup wizard	Applies recommended `.claude/settings.json` options without touching global settings.
Context management scanner	Finds missing guardrails, noisy hooks, broad reads, large context files, secret-like files, excessive MCP servers, and expensive defaults.
Structural-waste doctor	Opt-in local diagnostics for duplicate rules, stale imports, unused skill candidates, oversized tool schemas, and repeated read/tool-call loops.
Large-read guard and symbol reader	Nudges the agent toward `rg`, symbol reads, and small line ranges instead of full-file reads.
Output trimming and sanitizing	Keeps test, build, search, and diff output compact while redacting likely secrets before they enter agent context.
Declarative output filter	Opt-in JSON DSL for user-owned command filters with protected failure passthrough and validation before use.
Local artifact store	Saves large sanitized logs outside the conversation and returns compact receipts or exact requested slices.
Anthropic cost guard	`context-guard cost preflight/observe/ledger/compile` estimates cache-risk and cost ranges, stores only keyed HMAC fingerprints, and stays passive unless `--enforce` is explicit.
Budgeted context packer	Assembles prioritized local file evidence into a byte-budgeted Markdown pack, can suggest a build-compatible manifest from local signals, adds `--explain` for compact local selection reasons plus bounded repo-map metadata, and adds opt-in `--adaptive-k` local top-k advisory metadata.
Tool/MCP schema pruner	Emits bounded top-k tool/schema advisory reports from local catalogs with compact receipts and full sanitized payload retrieval.
Conservative stdin compressor	Shrinks selected JSON, diffs, logs, search output, code, and prose with observed byte evidence and estimated token proxies.
Protected-zone policy receipts	Opt-in `context-guard-compress --protected-policy` and `context-guard cost compile` metadata mark code/diff/path/hash/JSON/literal zones as structural-only with exact retrieval guidance.
Repeated-failure nudge	Warns after repeated Bash failures so the agent changes strategy before stale logs fill the context.
Statusline, audit, and benchmarks	Shows context/cache/cost signals, finds usage and cache-friendliness hotspots, and records conservative before/after evidence.

Cost guard key provisioning

Cost guard creates its local HMAC key automatically at .context-guard/cost-ledger/hmac.key. If you provision that file yourself, it must contain exactly one 32-byte key encoded as canonical URL-safe base64 with padding, and no trailing newline or whitespace. Reports never emit the key or raw prompt text, and the local ledger does not replace Anthropic/provider prompt caching.
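If you do provision the key yourself, the format constraint can be checked before writing the file. The Python sketch below generates a conforming key and validates a candidate string. The function names are illustrative, and the round-trip check is one reading of "canonical", not necessarily the check the cost guard itself performs.

```python
import base64
import binascii
import secrets


def generate_key() -> str:
    """Return 32 random bytes as padded URL-safe base64 (44 ASCII characters)."""
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).decode("ascii")


def is_canonical_key(text: str) -> bool:
    """Accept only one padded URL-safe base64 32-byte key with nothing around it."""
    try:
        raw = base64.b64decode(text.encode("ascii"), altchars=b"-_", validate=True)
    except (ValueError, binascii.Error):
        return False
    # Re-encoding must reproduce the input exactly; this rejects missing padding,
    # "+" or "/" characters, and any trailing newline or whitespace.
    return len(raw) == 32 and base64.urlsafe_b64encode(raw).decode("ascii") == text
```

When writing the result to .context-guard/cost-ledger/hmac.key, write the string exactly as returned. For example, use printf rather than echo in a shell so that no trailing newline is appended.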

Install in Claude Code

Add the marketplace and install the plugin:

/plugin marketplace add ictechgy/context-guard
/plugin install context-guard@context-guard

Then run setup from Claude Code in the project you want to protect:

/context-guard:setup

Available plugin skills:

Skill	Purpose
`/context-guard:setup`	First-time project setup wizard.
`/context-guard:optimize`	Inspect and tune context guardrails.
`/context-guard:audit`	Audit local Claude transcript token/cost hotspots.

Setup is explicit, project-local, and reversible. The plugin does not configure external model delegation or offload; all helper commands run locally. See plugins/context-guard/examples/settings.example.json for an example settings file.

Install with npm/npx

The npm package exposes a canonical context-guard command plus backward-compatible context-guard-* helper commands. Package installation is passive: there is no postinstall setup hook and no config write until you run context-guard setup yourself. If setup cannot find bundled or checkout-local helpers, PATH fallback remains disabled by default; use --allow-path-helper-fallback only for trusted helper directories after context-guard doctor or setup --verify confirms the plan.

npm install -g @ictechgy/context-guard
context-guard --version
context-guard doctor --root . --json
context-guard setup --agent codex --scope project --with-init --with-skill --plan
context-guard setup --agent codex --scope project --brief-mode standard --plan

For a one-off run without global installation:

npx @ictechgy/context-guard setup --agent codex --scope project --with-init --with-skill --plan
npx @ictechgy/context-guard setup --agent codex --scope project --brief-mode standard --plan
npm exec @ictechgy/context-guard -- --version

Use --scope project for repository files such as AGENTS.md and .agents/skills/.... Use --scope user only when you intentionally want a user-level path; applying user scope requires --yes plus an explicit --agent, and supported writes record rollback metadata.

Homebrew release path

ContextGuard is available on Homebrew through the shared ictechgy/tap tap:

brew install ictechgy/tap/context-guard
context-guard --version

If you already tapped ictechgy/tap, brew install context-guard also works.

Helper commands

Most users should start with /context-guard:setup. The helper commands below are useful for local testing, automation, or targeted debugging. The canonical command prefix is context-guard-*.

Health check before setup

context-guard doctor --root . --json
context-guard setup --agent claude --scope user --verify --json

Both commands are read-only configuration checks. doctor reports recommended next commands, and setup --verify checks whether setup is complete without applying changes. With --json, the report is written to stdout.

Scan context management

./plugins/context-guard/bin/context-guard-diet scan .

The scanner reports missing guardrails, noisy hooks, broad context paths, large or secret-like instruction/rule files across common AI-agent surfaces, and local context-exclusion recommendations for bulky or sensitive paths. --top caps both the reported context-like files and context-exclusion recommendations. Recommendations are heuristic/advisory unless they are emitted as Claude permissions.deny entries.

Diagnose structural context waste

./plugins/context-guard/bin/context-guard-diet structural-waste . \
  --tool-catalog tools.json \
  --log-path .claude \
  --json

The structural-waste doctor is opt-in and read-only. It reuses the diet scanner's local safety model, then adds advisory findings for duplicate rule units, stale Python imports, unused skill candidates, excessive MCP/tool schema catalogs, and repeated file reads or duplicate tool calls from local JSON/JSONL logs. It does not edit files, disable tools, call the network, or print raw prompt/tool-input text; default output uses relative paths, hashed labels, and redacted secret-shaped path components. Treat low-confidence import/skill findings as review prompts, not deletion instructions.

Read symbols instead of whole large files

./plugins/context-guard/bin/context-guard-read-symbol path/to/file.py TargetSymbol

The optional Read guard uses a progressive path for oversized files: search first, then symbol slices, then small line ranges. When possible, it also returns a bounded top-level outline. Repeated attempts to read the same oversized file in full get a single deduplicated warning instead of triggering the same context-heavy read again.

Store and query large logs locally

long-command 2>&1 | ./plugins/context-guard/bin/context-guard-artifact store --command "long-command" --json
./plugins/context-guard/bin/context-guard-artifact search "ERROR" --json
./plugins/context-guard/bin/context-guard-artifact get <artifact_id> --lines 1:80

Artifact mode is for capture, sandbox search, and retrieval. It stores sanitized output under .context-guard/artifacts by default and can still read legacy .claude-token-optimizer/artifacts receipts from before the rebrand. JSON receipts include line-numbered top-error receipts, duplicate-line groups, and sanitized bounded suggested_queries so an agent can fetch the smallest useful exact slice instead of replaying the full log.

search scans the local sanitized artifact sandbox by literal substring, returns capped match/context records, and includes context-guard-artifact get ... --lines START:END rehydration commands for omitted detail. For custom --dir values, raw private paths stay redacted by default; rerun with the same --dir, or pass search --show-paths when you explicitly want a directly executable local command. The search report is local-only and does not make hosted token/cost savings claims.

When --max-lines accompanies a --lines START:END selector, it caps lines returned within that range; it does not expand the selector. Preserve the producer command's exit code yourself when using shell pipelines in release checks, or use context-guard-trim-output -- ... when exit-code preservation is the primary requirement.
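The interaction between --lines START:END and --max-lines can be pictured with a small Python sketch. It assumes 1-indexed, inclusive ranges and that the cap keeps the first lines of the range; both details are assumptions here, and select_lines is an illustrative name rather than part of the helper.

```python
from typing import List, Optional


def select_lines(lines: List[str], selector: str, max_lines: Optional[int] = None) -> List[str]:
    """Return the START:END slice, optionally capped, never widened."""
    start_text, end_text = selector.split(":")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid selector: {selector}")
    chosen = lines[start - 1:end]  # 1-indexed, inclusive range (assumed)
    if max_lines is not None:
        chosen = chosen[:max_lines]  # the cap only shrinks the selection
    return chosen
```

With a 200-line artifact, select_lines(log, "1:80", max_lines=10) yields 10 lines, while select_lines(log, "1:5", max_lines=10) still yields only the 5 lines the selector asked for.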

Build a budgeted context pack

./plugins/context-guard/bin/context-guard-pack auto \
  --root . \
  --query "review failing tests" \
  --diff HEAD \
  --manifest-out suggested-pack.json \
  --pack-out context-pack.md \
  --budget-bytes 12000 --json --explain --adaptive-k
# Or run the two explicit steps:
./plugins/context-guard/bin/context-guard-pack suggest \
  --root . --query "review failing tests" --diff HEAD \
  --manifest-out suggested-pack.json --budget-bytes 12000 --json --adaptive-k
./plugins/context-guard/bin/context-guard-pack build \
  --root . --manifest suggested-pack.json --budget-bytes 12000 --json
./plugins/context-guard/bin/context-guard-pack slice --root . --path README.md --lines 1:40 --json

context-guard-pack auto is the one-command, local-only path: it runs the suggestion step and immediately builds the budgeted Markdown pack.

A few boundaries are intentional:

Add --explain for compact deterministic local selection/build reasons in JSON or text output.
--explain may include bounded repo_map metadata: sampled byte/token-proxy tree entries, category-only secret-risk counts, signature-first file hints, explain-only graph ranks, and exact slice/symbol retrieval hints.
Explain metadata does not change the manifest, pack body, receipt, or byte budget. It does not use network/model/embedding calls, and token values remain local chars_div_4 proxies rather than provider-token or savings claims.
Add --adaptive-k to suggest or auto for advisory-only shrink/expand top-k metadata derived from local score distribution, byte-budget fit, and score-mass recall/precision proxies. It never applies the recommendation automatically and does not change the manifest, pack body, receipt, or byte budget.
--manifest-out writes a build-compatible manifest; --pack-out saves the rendered pack.
context-guard-pack suggest is the lower-level additive local-only planning step. It ranks candidate files and line ranges from --query, --diff, repeated --files, and optional sanitized --output / --test-output files under --root, then writes a manifest that build --manifest can consume.
context-guard-pack build assembles prioritized local file evidence into a Markdown body whose rendered UTF-8 bytes stay within --budget-bytes. JSON output records included, partial, duplicate, unsafe, missing, and budget-omitted sources.
Bounded receipts are stored under .context-guard/packs. When path/root display is safe, JSON output includes copy-pasteable slice commands for exact sanitized retrieval; otherwise it records retrieval_omitted_reason.

The packer uses deterministic standard-library heuristics only: no network, model calls, embeddings, or provider-cost estimate. Byte counts are observed; token counts remain estimated chars_div_4 proxies, not measured provider-token savings.
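To make the byte-budget idea concrete, here is a minimal Python sketch of greedy packing under a UTF-8 byte budget. It is not the packer's implementation: the (priority, label, text) input shape, the section header format, and the function names are assumptions chosen for illustration. Only the report keys included, partial, and budget_omitted echo terms used above.

```python
from typing import Dict, List, Tuple


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Clip text to at most max_bytes of UTF-8 without splitting a character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text
    # errors="ignore" drops a trailing partial multi-byte sequence left by the cut.
    return data[:max_bytes].decode("utf-8", errors="ignore")


def build_pack(sources: List[Tuple[int, str, str]], budget_bytes: int) -> Tuple[str, Dict[str, List[str]]]:
    """Greedily add sources, highest priority first, within the byte budget."""
    report: Dict[str, List[str]] = {"included": [], "partial": [], "budget_omitted": []}
    parts: List[str] = []
    remaining = budget_bytes
    for _priority, label, text in sorted(sources, key=lambda s: -s[0]):
        header = f"## {label}\n"
        header_bytes = len(header.encode("utf-8"))
        section = f"{header}{text}\n"
        size = len(section.encode("utf-8"))
        bucket = "included"
        if size > remaining:
            # Try a clipped version that leaves room for the header and final newline.
            clipped = truncate_utf8(text, remaining - header_bytes - 1) if remaining > header_bytes + 1 else ""
            if not clipped:
                report["budget_omitted"].append(label)
                continue
            section = f"{header}{clipped}\n"
            size = len(section.encode("utf-8"))
            bucket = "partial"
        parts.append(section)
        remaining -= size
        report[bucket].append(label)
    return "".join(parts), report
```

The budget is counted in encoded bytes rather than characters, so non-ASCII text consumes it faster than its character count suggests.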

Prune a tool/MCP catalog for a task

./plugins/context-guard/bin/context-guard-tool-prune select \
  --catalog tools.json \
  --query "review failing tests" \
  --top 5 --budget-bytes 12000 --json
./plugins/context-guard/bin/context-guard-tool-prune defer-report \
  --catalog tools.json \
  --query "review failing tests" \
  --core-top 3 --deferred-top 20 --json
./plugins/context-guard/bin/context-guard-tool-prune get <receipt_id> --tool read_file --json

context-guard-tool-prune ranks a local tool or MCP catalog with deterministic lexical heuristics and emits a bounded top-k advisory report. Inline selected schemas respect an observed UTF-8 byte budget, and omitted or budget-skipped schemas remain recoverable from a compact local receipt plus a separate sanitized payload under .context-guard/tool-prune. defer-report uses the same receipt path to split a catalog into core inline tools plus deferred tool stubs and namespace summaries. This is advisory only: it does not mutate MCP configuration, does not configure native provider tool search, and token counts remain estimated proxies rather than measured provider savings.
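As a rough picture of a deterministic lexical top-k selection with a byte budget, consider the Python sketch below. The scoring rule (count of query-term hits in the tool name and description), the catalog shape, and the result keys are all assumptions made for illustration; the helper's actual heuristics and report schema may differ.

```python
import json
import re
from typing import Dict, List


def rank_tools(catalog: List[dict], query: str, top: int, budget_bytes: int) -> Dict[str, List[str]]:
    """Rank tools by query-term overlap, then inline schemas while the budget allows."""
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))

    def score(tool: dict) -> int:
        text = f"{tool['name']} {tool.get('description', '')}".lower()
        return sum(1 for word in re.findall(r"[a-z0-9]+", text) if word in terms)

    # Sorting by (-score, name) keeps the order deterministic when scores tie.
    ranked = sorted(catalog, key=lambda t: (-score(t), t["name"]))
    selected, rest = ranked[:top], ranked[top:]
    inline: List[str] = []
    skipped: List[str] = []
    used = 0
    for tool in selected:
        size = len(json.dumps(tool, sort_keys=True).encode("utf-8"))
        if used + size <= budget_bytes:
            inline.append(tool["name"])
            used += size
        else:
            skipped.append(tool["name"])  # would stay retrievable via a receipt
    return {"inline": inline, "budget_skipped": skipped, "omitted": [t["name"] for t in rest]}
```

Nothing in this sketch mutates the catalog; like the helper, it only produces an advisory report.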

Score static prompt cacheability

./plugins/context-guard/bin/context-guard-cache-score --input prompt.json --provider openai --json
./plugins/context-guard/bin/context-guard cache-score --input prompt.txt --provider anthropic --json

context-guard-cache-score is a local static lint for prompt/request layout. It estimates total and cacheable-prefix size with a tokenizer-free char/4 proxy, warns about dynamic-looking values near the prefix, and records provider caveats for OpenAI, Anthropic, Gemini, or a generic threshold. It does not call providers, store raw prompts, estimate prices, observe cache hits, or prove token/cost savings; verify real cache behavior with provider usage telemetry.
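The char/4 proxy is simple enough to show directly. The Python sketch below estimates total and prefix size and compares the prefix against a caller-supplied threshold. Rounding up, the min_cacheable_tokens parameter, and the result keys are assumptions for illustration; real provider minimums should come from the provider's own documentation.

```python
def estimate_tokens(text: str) -> int:
    """Tokenizer-free proxy: one token per four characters, rounded up (assumed)."""
    return -(-len(text) // 4)


def prefix_summary(stable_prefix: str, volatile_tail: str, min_cacheable_tokens: int) -> dict:
    """Summarize estimated sizes; nothing here is a measured provider value."""
    prefix_tokens = estimate_tokens(stable_prefix)
    return {
        "estimated_total_tokens": estimate_tokens(stable_prefix + volatile_tail),
        "estimated_prefix_tokens": prefix_tokens,
        "prefix_meets_threshold": prefix_tokens >= min_cacheable_tokens,
        "measured": False,  # an estimate until provider usage fields confirm it
    }
```

An estimate like this can flag a prefix that is probably too short to cache, but only provider usage telemetry can show whether a cache read actually happened.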

Compress selected local text conservatively

git diff | ./plugins/context-guard/bin/context-guard-compress --json
pytest -q 2>&1 | ./plugins/context-guard/bin/context-guard-compress --type log
cat evidence.txt | ./plugins/context-guard/bin/context-guard-compress --json --protected-policy

context-guard-compress classifies sanitized stdin as JSON, diff, log, search output, code, or prose, then applies deterministic reductions such as JSON compaction, diff context folding, duplicate log/search line collapse, and whitespace normalization. It never claims observed model-token savings; byte counts are observed, token counts are labeled as estimates, and lossy receipts point you back to context-guard-artifact store for exact retrieval.

Add --protected-policy when the input may contain semantic-sensitive zones such as code fences, diffs, identifiers, numeric constants, hashes, paths, stack frames, quoted strings, or JSON keys. The flag does not change default compressor behavior; it adds protected_zone_policy and transform_policy metadata that denies semantic/paraphrase rewrites, allows only structural transforms plus artifact retrieval, and stores only class/count policy metadata rather than raw protected spans.

Trim or summarize command output

./plugins/context-guard/bin/context-guard-trim-output --max-lines 120 -- npm test

Use --digest markdown or --digest json for a compact semantic digest instead of head/tail logs. Digest mode keeps status, exit code, truncation counts, runner failure facts, a sanitized failure signature, duplicate-line groups, representative lines, redaction counts, and suggested next queries while preserving the wrapped command exit code. Add --artifact-receipt with digest mode when you want the exact sanitized full output stored locally as a context-guard-artifact receipt; re-expand with the emitted context-guard-artifact get ... command before relying on omitted details. Wrapped commands time out after 600 seconds by default; tune this with --timeout-seconds.

Sanitize search and diff output

./plugins/context-guard/bin/context-guard-sanitize-output -- rg -n "TOKEN|SECRET" .
./plugins/context-guard/bin/context-guard-sanitize-output -- git diff

The sanitizer reduces the chance that token-like, key-like, password-like, or sensitive path values are copied into agent context.
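Pattern-based redaction of this kind can be sketched in a few lines of Python. The single regular expression below is a hypothetical example that masks values assigned to names containing token, secret, password, or api_key. The shipped sanitizer's actual rules are not listed here, and any pattern-based approach stays best-effort.

```python
import re
from typing import Tuple

# Hypothetical pattern: NAME=value or NAME: value where NAME looks credential-like.
SECRET_ASSIGNMENT = re.compile(
    r"(?i)\b([A-Z0-9_]*(?:token|secret|password|api_key)[A-Z0-9_]*)(\s*[:=]\s*)(\S+)"
)


def redact(text: str) -> Tuple[str, int]:
    """Mask credential-like values and return the text plus a redaction count."""
    return SECRET_ASSIGNMENT.subn(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
```

Keeping the variable name while masking the value preserves enough context for debugging, and the count lets a report state how much was hidden without echoing it.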

Apply an opt-in declarative output filter

cat > .context-guard/filter-dsl.json <<'JSON'
{
  "schema_version": "contextguard.filter-dsl.v1",
  "filters": [
    {
      "id": "git-status-short",
      "match": {"argv_prefix": ["git", "status", "--short"]},
      "include_regex": ["^[ MADRCU?!]"],
      "max_lines": 80
    }
  ]
}
JSON
./plugins/context-guard/bin/context-guard-filter validate --config .context-guard/filter-dsl.json
./plugins/context-guard/bin/context-guard-filter run --config .context-guard/filter-dsl.json -- git status --short

context-guard-filter is an opt-in local helper for user-owned JSON filter files; it does not install default filters or change hooks. Invalid configs, no-match commands, filtering errors, empty filtered output, and protected git/test/lint/gh command failures pass the original command stdout/stderr and exit code through. In filtered mode, line rules apply to combined stdout+stderr and write the filtered result to stdout; passthrough mode preserves stdout/stderr streams. run --json-report writes filter diagnostics to stderr so stdout remains command/filter output; protected nonzero passthrough suppresses that report to keep stderr raw. Treat filtered byte reductions as local presentation changes, not hosted token/cost savings claims.
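The filtered and passthrough paths for a single rule can be approximated with the Python sketch below. It assumes include_regex uses search semantics, that max_lines keeps the first matching lines, and that a rule without include patterns falls through to passthrough. It also ignores exit codes and the stdout/stderr split, so treat it as an illustration of the rule shape rather than the helper's behavior.

```python
import re
from typing import List


def apply_filter(rule: dict, argv: List[str], output: str) -> str:
    """Apply one filter rule, returning the original output on any passthrough case."""
    prefix = rule["match"]["argv_prefix"]
    if argv[:len(prefix)] != prefix:
        return output  # no-match command: passthrough
    patterns = [re.compile(p) for p in rule.get("include_regex", [])]
    kept = [line for line in output.splitlines() if any(p.search(line) for p in patterns)]
    if "max_lines" in rule:
        kept = kept[:rule["max_lines"]]
    if not kept:
        return output  # empty filtered output: passthrough
    return "\n".join(kept) + "\n"
```

Applied to the git-status-short rule above, a line such as " M README.md" or "?? new.txt" is kept, while a line starting with "hint:" is dropped because "h" is not in the character class.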

Audit local transcript usage

./plugins/context-guard/bin/context-guard-audit ~/.claude/projects --top 20 --recommend

The audit command skips oversized transcript files and JSONL records by default (--max-file-bytes, --max-line-bytes) and reports skipped counts. Skipping keeps a corrupt or oversized trace from dominating memory, and the reported counts keep scan gaps visible.

JSON output can include several evidence surfaces:

cache_friendliness and cache_diagnostics: heuristic prompt-layout/cache-read diagnostics built from bounded usage fields, timestamped cache telemetry records, and redacted segment hashes.
cache_layout_advice: ranked checks/experiments such as splitting long sessions or stabilizing early prompt prefixes, with observed issues kept separate from hypothesized or corroborated causes.
--feasibility-json / mac_visibility: a contract for local macOS-visible consumers. Only stable top-level fields are binding targets; summary is not a primary UI binding source.

These fields can flag likely volatile content near the prompt prefix, stable-prefix candidates, cache-miss hypotheses, and TTL/headroom evidence gaps. They do not print raw prompt text, do not prove provider cache hits, and may be missing, partial, hypothesis, or unavailable when transcript schemas do not expose enough evidence.

Watch context and cache health in the statusline

[Sonnet] repo | main | ctx 86% ⚠ | cost $0.123 | cache 80% | reuse 8.0x

cache N% is the cache-read share of observed input-side tokens in the bounded transcript tail and stays hidden until at least one cache read is observed. reuse X.Yx is the ratio cache_read / cache_creation and is shown only when cache read is positive and cache creation is non-zero. The ⚠ marker appears when context usage reaches the warning threshold, which defaults to 80%; set CONTEXT_GUARD_STATUSLINE_CTX_WARN=90 to raise it to 90% for a project or shell.
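The two ratios and the warning threshold can be worked through in a short Python sketch. It assumes that input-side tokens are the sum of uncached input, cache-creation, and cache-read tokens, and that percentages are rounded to the nearest integer. The function names are illustrative and not part of the statusline script.

```python
import os
from typing import Mapping, Optional, Tuple


def cache_signals(input_tokens: int, cache_creation: int, cache_read: int) -> Tuple[Optional[int], Optional[float]]:
    """Return (cache_percent, reuse_ratio); None means the segment stays hidden."""
    # Assumption: input-side total = uncached input + cache creation + cache read.
    total = input_tokens + cache_creation + cache_read
    cache_percent = round(100 * cache_read / total) if cache_read > 0 else None
    reuse = cache_read / cache_creation if cache_read > 0 and cache_creation > 0 else None
    return cache_percent, reuse


def ctx_warning(ctx_percent: float, env: Optional[Mapping[str, str]] = None) -> bool:
    """True when context usage reaches the threshold (80 unless overridden)."""
    source = os.environ if env is None else env
    threshold = float(source.get("CONTEXT_GUARD_STATUSLINE_CTX_WARN", "80"))
    return ctx_percent >= threshold
```

With 100 uncached input tokens, 100 cache-creation tokens, and 800 cache-read tokens, this gives 80% and 8.0x, the values in the sample line above. At 86% context usage the default threshold triggers the marker, while a threshold of 90 does not.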

Run a repeatable benchmark

./plugins/context-guard/bin/context-guard-bench \
  --tasks bench/tasks.json --variants bench/variants.json --csv bench/results.csv \
  --ledger-jsonl bench/cost-shift.jsonl --report-json bench/report.json

Read the report through its claim boundaries before writing any savings statement:

Successful baseline/variant runs are compared by real tokens and cost_usd + external_cost_usd; byte reductions stay proxy evidence.
Token-savings claims require primary_tokens_measured on both sides of a matched task.
matched_pair_evidence links each successful task bucket to the transform, measurement availability, quality gate, and claim boundary.
wall_time_seconds, provider_cached_tokens, and provider_cached_tokens_measured are diagnostic telemetry, not proof of ContextGuard-caused token or cost savings.
Optional self_hosted_metrics from provider payloads are stored as per-row JSONL ledger sidecars, kept out of CSV/report summaries, and must not be folded into hosted API token/cost savings claims.
If cost fields are zero or unavailable, the report can still mark token savings but will not claim shifted-cost savings.
CSV schemas are strict; after upgrading the benchmark helper, start a new --csv file or migrate the header named in the mismatch error.
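
As a rough illustration of the claim boundaries above, the sketch below decides which savings statements a matched baseline/variant pair could support. The row shape is hypothetical: the `success` and `tokens` keys and the claim labels are assumptions made for this example, and only `primary_tokens_measured`, `cost_usd`, and `external_cost_usd` are field names taken from the text. The real report schema is shown in docs/benchmark-report.example.json.

```python
def total_cost(row):
    """Return cost_usd + external_cost_usd, or None when cost is zero or unavailable."""
    cost = row.get("cost_usd")
    if not cost:
        return None
    return cost + (row.get("external_cost_usd") or 0.0)


def allowed_claims(baseline, variant):
    """List the savings claims a matched pair supports (hypothetical row dicts)."""
    # Only successful baseline/variant runs are compared.
    if not (baseline.get("success") and variant.get("success")):
        return []
    claims = []
    # Token-savings claims need measured tokens on both sides of the matched task.
    measured = (baseline.get("primary_tokens_measured")
                and variant.get("primary_tokens_measured"))
    if measured and variant["tokens"] < baseline["tokens"]:
        claims.append("token_savings")
    # Zero or missing cost on either side means no shifted-cost claim.
    base_cost, var_cost = total_cost(baseline), total_cost(variant)
    if base_cost is not None and var_cost is not None and var_cost < base_cost:
        claims.append("shifted_cost_savings")
    return claims
```

Byte reductions never appear in this sketch because the text treats them as proxy evidence only.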

See docs/benchmark-report.example.json for a minimal report-shape example, docs/benchmark-workflow-examples.md for workflow-specific synthetic examples, and docs/experimental-benchmark-fixtures.md for fixture-only experimental task/variant starters.

Manage experimental opt-ins

Experimental lanes are off by default. The registry records project-local intent and metadata only; enabling an experiment does not activate stable runtime behavior by itself. Later helpers must still require explicit experimental flags before using these lanes.

context-guard experiments list
context-guard experiments status --json
context-guard experiments plan context-diff-compaction --json < change.diff
context-guard experiments emit context-diff-compaction --receipt-id <artifact-id> --reexpand-command "context-guard-artifact get <artifact-id> --full" --replacement-file compact-diff.txt --json < change.diff
context-guard experiments plan visual-crop-ocr --json --full-evidence-receipt <id> --crop-label <label> --crop-bounds 0,0,100,100 --image-size 800,600 --missed-context-note "outside crop omitted"
context-guard experiments emit visual-crop-ocr --json --full-evidence-receipt <id> --crop-label <label> --crop-bounds 0,0,100,100 --image-size 800,600 --ocr-text "visible text" --ocr-confidence 0.9 --ocr-error-note "glyph may be uncertain" --missed-context-note "outside crop omitted"
context-guard experiments plan learned-compression --json --sanitized --trusted-source --exact-fallback-receipt <id> --reexpand-command "context-guard-artifact get <id> --full" < sanitized-prose.txt
context-guard experiments emit learned-compression --json --sanitized --trusted-source --exact-fallback-receipt <id> --reexpand-command "context-guard-artifact get <id> --full" --replacement-file compact-prose.txt < sanitized-prose.txt
context-guard experiments plan self-hosted-metrics-ledger --json --latency-ms 123.5 --peak-memory-mb 2048 --quality-score 0.98
context-guard experiments record self-hosted-metrics-ledger --ledger-jsonl .context-guard/self-hosted-metrics.jsonl --latency-ms 123.5 --peak-memory-mb 2048 --quality-score 0.98 --json
context-guard experiments plan local-proxy --json --bind-host 127.0.0.1 --target-host 127.0.0.1 --runtime-gate-ack
context-guard experiments plan local-proxy-external-forwarding --external-forwarding-intent --external-forwarding-design-ack --allow-host api.example.com --allow-scheme https --credential-redaction-policy strip-sensitive-headers --provider-evidence-boundary diagnostic-only-provider-measured-required --threat-model-note "Only user-owned HTTPS endpoint; sensitive headers are stripped before any future forwarding." --json
context-guard experiments record local-proxy-runtime-gate --ledger-jsonl .context-guard/local-proxy-gates.jsonl --bind-host 127.0.0.1 --target-host 127.0.0.1 --runtime-gate-ack --json
context-guard experiments serve local-proxy --bind-host 127.0.0.1 --bind-port 18080 --target-host 127.0.0.1 --target-port 18081 --runtime-gate-ack --forwarding-gate-ack --once --ready-file .context-guard/local-proxy-ready.json --diagnostic-ledger-jsonl .context-guard/local-proxy-diagnostics.jsonl --json
context-guard experiments enable output-receipt-trim --root .
context-guard experiments disable output-receipt-trim --root .

The local-proxy examples are intentionally split by side effect:

plan local-proxy produces advisory metadata only; it does not enable forwarding.
record local-proxy-runtime-gate appends one localhost-only gate row and still starts no listener, forwards no traffic, persists no API keys, and makes no hosted-savings claim.
serve local-proxy is the separate MVP. It requires both the runtime and forwarding acknowledgements, --once, and a private --ready-file nonce handoff for the forwarding client. It binds only to a literal loopback IP and forwards only to a literal loopback IP target, never to a hostname that needs DNS resolution. It blocks credential-bearing requests, enforces byte and time limits, and does not persist API keys. It does not support external forwarding, CONNECT/TLS proxying, or hosted-savings claims.
With --diagnostic-ledger-jsonl, serve appends one shifted-cost diagnostic row only after a successful forwarded request. The row stores hashes/metadata rather than raw headers, request bodies, response bodies, or hosted-savings evidence.
plan local-proxy-external-forwarding is a dry-run design gate only. It requires explicit external intent, design acknowledgement, HTTPS host allowlist, threat model notes, credential redaction policy, and provider-evidence boundary, but starts no listener, performs no DNS lookup, calls no external service, forwards no traffic, persists no credentials, and does not ship an external proxy forwarding runtime.
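
The "literal loopback IP" rule can be illustrated with Python's standard `ipaddress` module. This is a minimal sketch under assumptions, not the helper's actual validation: the function names are hypothetical, and whether the shipped helper accepts IPv6 loopback (`::1`) is not stated in this document.

```python
import ipaddress


def is_literal_loopback(host):
    """True only when host is an IP literal in a loopback range; never resolves DNS."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames such as "localhost" are not IP literals, so they are
        # rejected instead of being resolved.
        return False
    return addr.is_loopback


def check_proxy_endpoints(bind_host, bind_port, target_host, target_port):
    """Return a list of problems for a hypothetical bind/target pair."""
    problems = []
    for label, host, port in (("bind", bind_host, bind_port),
                              ("target", target_host, target_port)):
        if not is_literal_loopback(host):
            problems.append(f"{label} host must be a literal loopback IP")
        if not 1 <= port <= 65535:
            problems.append(f"{label} port must be nonzero and at most 65535")
    return problems
```

Rejecting anything that is not an IP literal keeps the check free of DNS lookups, which matches the no-hostname boundary described above.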

By default, project settings are stored in .context-guard/experiments.json. Use --config <path> only for an explicit project-local override. Experiment metadata includes risk level, gate requirements, explicit command/flag surfaces, and claim boundaries so hosted API token/cost savings are not claimed without provider-measured matched-task evidence. experiments enable records intent only; it does not run helpers, remove the need for their explicit flags, or permit replacing content without exact receipt/re-expand evidence.

Shipped experimental checker/planner surfaces, plus explicit local context-diff, visual evidence, learned-candidate, metrics, and proxy-gate record runtimes, are intentionally narrow:

Planner/checker/runtime	What it emits	Hard boundary
`context-diff-compaction`	Dry-run diff advice plus an explicit `emit ... --receipt-id ... --reexpand-command ...` runtime for caller-supplied compact replacements.	`plan` emits no replacement. `emit` requires reviewable hunks, exact local artifact re-expand metadata whose stored content matches the input diff, and a smaller caller-supplied replacement; ContextGuard does not generate semantic compression or support hosted token/cost savings claims.
`visual-crop-ocr`	Dry-run visual evidence advice plus an explicit `emit visual-crop-ocr` runtime for caller-supplied evidence packs.	`emit` requires a full visual evidence receipt, missed-context note, and complete user-supplied crop and/or OCR evidence; ContextGuard does not capture screenshots, crop images, run OCR, parse images, call external services, write files, or support hosted token/cost savings claims.
`learned-compression`	Deny-by-default policy checks plus an explicit `emit learned-compression` runtime for caller-supplied compact prose candidates with verified exact fallback content.	`emit` requires sanitized trusted prose, protected-signal denial, a verified local fallback artifact matching the input, and a smaller caller-supplied prose candidate; ContextGuard does not run compressors, embeddings, rerankers, model calls, subprocesses, or external services, and it neither generates replacement text nor supports hosted savings claims.
`self-hosted-metrics-ledger`	Dry-run preview plus an explicit `record ... --ledger-jsonl` runtime for local/model-server latency, memory, quality, energy, throughput, and local-cost metrics.	The dry-run preview does not write a ledger; the explicit record command writes only local JSONL sidecars and still does not support hosted API token/cost savings claims.
`local-proxy`	Localhost-only advisory metadata, design-only `plan local-proxy-external-forwarding` review for future external forwarding, an explicit `record local-proxy-runtime-gate --ledger-jsonl` runtime for one local gate row, an explicit one-shot `serve local-proxy` loopback forwarding MVP, and optional `--diagnostic-ledger-jsonl` shifted-cost diagnostics for successful forwarded requests.	`plan` writes no ledger. `record` writes only after localhost-only metadata and `--runtime-gate-ack`; it starts no listener, forwards no traffic, and performs no DNS lookup. `serve` additionally requires `--forwarding-gate-ack --once`, a private `--ready-file` nonce handoff, literal loopback bind/target IPs, nonzero ports, bounded bytes/timeouts, and credential-free requests; it performs no external forwarding, no CONNECT/TLS proxying, no API-key persistence, and no hosted-savings claim. `--diagnostic-ledger-jsonl` writes only successful-forward diagnostics with no raw headers/bodies and no hosted-savings claim. `plan local-proxy-external-forwarding` emits threat-model/allowlist/redaction/provider-evidence design metadata only and still performs no DNS lookup, external service call, traffic forwarding, credential persistence, or hosted-savings claim.
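
Two of the `emit` boundaries in the table, an exact fallback that matches the input and a replacement that is smaller than the input, can be sketched as follows. The function name and the choice of SHA-256 and byte length as the comparison are assumptions for illustration; the document does not say how the helper compares content or measures size, and the reviewable-hunks requirement is not modeled here.

```python
import hashlib


def emit_gate_problems(input_bytes, stored_artifact, replacement):
    """Return reasons a caller-supplied replacement should be refused (hypothetical)."""
    problems = []
    # The stored fallback must match the input exactly so the original
    # content can always be re-expanded.
    if hashlib.sha256(stored_artifact).digest() != hashlib.sha256(input_bytes).digest():
        problems.append("stored artifact does not match input")
    # Assumption: "smaller" is measured in bytes.
    if len(replacement) >= len(input_bytes):
        problems.append("replacement is not smaller than input")
    return problems
```

An empty list means both checks passed; the caller still supplies the replacement text, since ContextGuard does not generate it.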

What is not yet shipped

These are directions the project has tracked, not committed features. Nothing here ships unless documented elsewhere in the repository.

ContextGuard does not yet ship:

learned/synthetic compressor execution or generated replacement text beyond the caller-supplied learned candidate emitter
generated crop/OCR or visual-token pruning runtime beyond the caller-supplied visual evidence-pack emitter
self-hosted KV/latent optimization beyond explicit local metrics recording
external, daemon, or credential-bearing proxy forwarding beyond the one-shot literal-loopback local proxy MVP

See the experimental token-reduction radar and fixture-only experimental benchmark starters. Those lanes remain experimental/non-shipped under the later-roadmap gate until matched successful tasks, failure-rate guardrails, human-correction tracking, shifted-cost accounting, provider-measured token/cost evidence, and separate future PR gates justify any hosted API savings claim or broader runtime feature claim.

Repository layout

.claude-plugin/marketplace.json — Claude Code marketplace manifest.
plugins/context-guard/ — installable Claude Code plugin package.
context-guard-kit/ — checkout-local Python/Bash helper sources. npm packages ship synchronized plugins/context-guard/bin and plugins/context-guard/lib copies instead of duplicating this source tree.
docs/index.html — static landing page for the project.
tests/ — regression tests for helper behavior.

Local development

Run Claude Code with the plugin directory:

claude --plugin-dir ./plugins/context-guard

Test marketplace installation from the repository root:

/plugin marketplace add ./
/plugin install context-guard@context-guard

Plugin helper binaries are not added to PATH by default. For local testing, invoke them by full path:

./plugins/context-guard/bin/context-guard-setup --plan
./plugins/context-guard/bin/context-guard-setup --agent codex --brief-mode standard --plan
./plugins/context-guard/bin/context-guard-setup --yes

To use shorter commands during local development, add the plugin bin directory to your shell's PATH:

export PATH="$PWD/plugins/context-guard/bin:$PATH"
context-guard-setup --plan

Do not rely on PATH lookup for generated hooks by default. The setup wizard records explicit bundled or checkout-local helper paths; --allow-path-helper-fallback is only for trusted external installs and validates the resolved helper before writing commands.

Release checks

Before publishing or merging release-sensitive changes, run the copy check and both gates:

python3 scripts/sync_plugin_copies.py --check
python3 scripts/prepublish_check.py
python3 scripts/release_smoke.py

When a helper under context-guard-kit/ changes, run python3 scripts/sync_plugin_copies.py --write before the gates. sync_plugin_copies.py --check verifies the maintainer-facing exact-copy contract up front. npm packages intentionally ship only the synchronized plugin-local plugins/context-guard/bin entrypoints and plugins/context-guard/lib helpers to avoid duplicate implementation payloads. prepublish_check.py verifies package invariants, synchronized plugin binaries, manifests, diagnostic redaction, and the regression suite. release_smoke.py executes representative packaged entrypoints from plugins/context-guard/bin in a temporary project so broken CLI wiring is caught before publish. See docs/release-runbook.md for the full release workflow, evidence checklist, quad-review requirement, and rollback checklist.

Versioned release notes live in CHANGELOG.md; the prepublish gate requires an entry matching the plugin manifest version before publishing.
