denfry · Jun 14, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -18,8 +18,37 @@ All notable changes to this project are documented here. The format is based on
 - **`docs/RELEASE_CHECKLIST.md`**: a repeatable release checklist (version sync,
   tests, benchmarks, doctor, install/plugin/MCP smoke, changelog) with signed
   checksums + SBOM tracked as future hardening.
+- **MCP contract hardening (M11.5)**: every MCP tool payload — success *and* the
+  no-index/error path — is now wrapped in a stable envelope (`schema_version`: 1,
+  `tool`: <name>). Golden snapshots lock every tool's output
+  (`tests/golden/mcp_*.json` via `tests/test_mcp_golden.py`), and the contract
+  values are asserted explicitly so a golden can't freeze a wrong version. Closes
+  the long-standing `docs/MCP.md` follow-ups and makes the `schema_version` claim
+  in `docs/ARCHITECTURE.md` §8 true.
+- **Config / IaC language labeling**: Dockerfile, Containerfile, `*.tf`/`*.tfvars`
+  (terraform), `*.hcl`, `*.ini`/`*.cfg`/`*.conf`/`*.properties` (ini), and
+  Makefiles now get a real language label. These files were already FTS-indexed as
+  unknown text; labeling surfaces infra files in `stats` and lets agents scope
+  searches to config. They stay in Tier C (line chunks + FTS only; no Tree-sitter spec).
+- **Typed framework edges — design doc**
+  (`docs/superpowers/specs/2026-06-14-typed-framework-edges-design.md`): the
+  documented-first deliverable for the M13 code-intelligence graph
+  (route→handler→service→model, test→impl, config→consumer, …) with a schema,
+  confidence/provenance model, resolver architecture, and a benchmark gate.
+- **"Trust model in 60 seconds"** callout, shared by `README.md` and
+  `docs/SECURITY.md` (the SECURITY copy adds section cross-references).
 
 ### Changed
+- **Reranker: dampened the god-class `in_degree` tiebreak** (`retrieval/rerank.py`).
+  The graph-centrality bonus is now logarithmic with a lower cap. The old linear
+  bonus saturated by `in_degree` 10, so 100-caller "god classes" got the full bonus
+  and floated above genuinely relevant low-degree matches on stray-term ties.
+  Validated as no-regression on the public benchmark (Recall@k / MRR / nDCG
+  unchanged) with a targeted regression test; the real-repo gain on the honest Java
+  misses is tracked under M12.5. CLI/MCP `search` goldens regenerated accordingly.
+- **`docs/ROADMAP.md`**: M10 MCP bridge marked shipped (was "planned"); reconciled
+  the technical-vs-product milestone numbering instead of claiming one is canonical.
+
 - **README**: added "Who Is It For?" and a "How Is This Different?" section that
   answers why-not-grep / Cursor / Aider repo-map / Sourcegraph / Codebase-Memory
   MCP on the first screen, plus a proven-today-vs-roadmap table.
@@ -30,6 +59,12 @@ All notable changes to this project are documented here. The format is based on
   TODO-friendly benchmark task checklist with a no-overclaim procedure.
 
 ### Fixed
+- **MCP server failed to import on `mcp>=1.27` + `pydantic>=2.10`**: newer FastMCP
+  auto-built a structured-output schema from each tool's `-> str` return annotation
+  and raised `PydanticUserError` at import time, breaking the server and its test
+  suite. Tools now register as unstructured (`structured_output=False` where the
+  kwarg exists; older `mcp` is detected and unaffected), preserving the existing
+  text-content wire contract.
 - `docs/FAQ.md`: removed a dangling/duplicated sentence in "Is it
   production-ready?" and documented the real `clean` / `clean --all` behavior.
 

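Note on the import fix above: one way to pass `structured_output=False` only "where the kwarg exists" is to inspect the decorator's signature before calling it. This is a minimal sketch under assumptions, not the code in the repository; `tool_kwargs`, `old_tool`, and `new_tool` are hypothetical names, and the two stand-in decorators only mimic an older and a newer `tool()` signature.

```python
import inspect
from typing import Any, Callable


def tool_kwargs(tool_decorator: Callable[..., Any]) -> dict[str, Any]:
    """Return kwargs requesting unstructured output, if the decorator accepts them."""
    try:
        params = inspect.signature(tool_decorator).parameters
    except (TypeError, ValueError):
        # Signature not introspectable: pass nothing extra.
        return {}
    if "structured_output" in params:
        return {"structured_output": False}
    return {}


# Hypothetical stand-ins for an older and a newer tool() decorator factory.
def old_tool(name=None, description=None):
    return lambda fn: fn


def new_tool(name=None, description=None, structured_output=None):
    return lambda fn: fn
```

With a real server object the call would presumably look like `server.tool(**tool_kwargs(server.tool))`, assuming `server.tool` is the registration decorator; that usage is an assumption, not something shown in this diff.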
diff --git a/README.md b/README.md
@@ -429,6 +429,14 @@ Answer with precise file:line citations
 
 ## Safety and Privacy
 
+> **Trust model in 60 seconds**
+> 1. **Offline by default** — the base install has zero network dependencies; nothing leaves your machine.
+> 2. **One opt-in exit, triple-gated** — external embeddings require `allow_external` **and** an env API key **and** a printed endpoint warning, or they are refused.
+> 3. **Secrets never get in** — `.env`, keys, certs, and credential files are excluded before parsing (multi-gate ignore pipeline).
+> 4. **Secrets never get out** — every snippet is redacted (AWS keys, private keys, JWTs, bearer tokens, connection strings) before it reaches the agent.
+> 5. **No telemetry, ever** — no analytics, no phone-home, no usage data.
+> 6. **Verify it yourself** — `codebase-index doctor --strict` audits all of the above and exits non-zero in CI on any high-severity finding.
+
 `codebase-index` is designed with privacy as a first principle:
 
 - **No telemetry** — No usage data, analytics, or crash reports are collected or transmitted.

diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
@@ -199,7 +199,8 @@ Current implementation:
 - `src/codebase_index/mcp/server.py` is a thin adapter over `retrieval/`, `storage/`, and
   `indexer/freshness.py`.
 - `codebase-index mcp --root <repo>` runs the stdio server.
-- JSON payloads include `schema_version`.
+- Every JSON payload (including the error path) carries a `schema_version` + `tool` envelope,
+  locked by golden snapshots (`tests/golden/mcp_*.json`).
 - [MCP.md](MCP.md) provides config templates for Claude Desktop, Claude Code, Cursor, VS Code,
   Zed, and Windsurf.
 - `healthcheck` lets MCP clients distinguish "server running", "index missing",

diff --git a/docs/LANGUAGES.md b/docs/LANGUAGES.md
@@ -6,7 +6,7 @@
 |---|---|---|
 | Tier A | Language-specific Tree-sitter `LangSpec` with definition, call, and import/inheritance patterns | Python, JavaScript, TypeScript, Java, Go, Rust, C, C++, C#, Ruby, PHP, Kotlin |
 | Tier B | Generic Tree-sitter path when a loadable grammar exists, without language-specific graph semantics | Lua |
-| Tier C | Line chunks + FTS5 lexical search only | Markdown, JSON, YAML, TOML, SQL and other text/config files |
+| Tier C | Line chunks + FTS5 lexical search only | Markdown, JSON, YAML, TOML, SQL; config/IaC: Dockerfile, Terraform (`.tf`/`.tfvars`), HCL, INI (`.ini`/`.cfg`/`.conf`/`.properties`), Makefiles; and other text/config files |
 
 Tier A is the only tier that should be advertised as symbol-aware. Tier B can
 surface useful definitions, but it is intentionally weaker and should be called
@@ -45,7 +45,11 @@ High-priority code languages:
 - Objective-C
 - Vue and Svelte component structure
 
-High-priority non-code and framework-aware extraction:
+High-priority non-code and framework-aware extraction (config/IaC files are now
+**Tier-C labeled** — indexed, language-tagged, and FTS-searchable; the items below
+are the deeper *structured* extraction still on the roadmap, and the framework
+graph part is designed in
+`docs/superpowers/specs/2026-06-14-typed-framework-edges-design.md`):
 
 - SQL schema-aware parsing: tables, columns, migrations, model/query consumers
 - Terraform/HCL: resources, modules, variables, outputs
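
Note on Tier-C labeling: the filename-to-label mapping described in this change can be sketched as a pair of lookup tables. This is an illustration under assumptions, not the indexer's code. The `terraform` and `ini` labels come from the CHANGELOG entry; the `dockerfile`, `hcl`, and `makefile` label strings, the table names, and `label_for` are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

# Labels keyed by exact filename (files that carry no extension).
FILENAME_LABELS = {
    "Dockerfile": "dockerfile",      # label string is an assumption
    "Containerfile": "dockerfile",   # label string is an assumption
    "Makefile": "makefile",          # label string is an assumption
}

# Labels keyed by lowercase extension.
EXTENSION_LABELS = {
    ".tf": "terraform",
    ".tfvars": "terraform",
    ".hcl": "hcl",                   # label string is an assumption
    ".ini": "ini",
    ".cfg": "ini",
    ".conf": "ini",
    ".properties": "ini",
}


def label_for(path: str) -> Optional[str]:
    """Return a language label for a config/IaC file, or None if unrecognized."""
    p = PurePosixPath(path)
    if p.name in FILENAME_LABELS:
        return FILENAME_LABELS[p.name]
    return EXTENSION_LABELS.get(p.suffix.lower())
```

A label only tags the file for `stats` and scoped search; it implies nothing about structured parsing of the file's contents.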

diff --git a/docs/MCP.md b/docs/MCP.md
@@ -41,30 +41,45 @@ The MCP server exposes the same retrieval contract as the CLI.
 
 ## Output contract
 
-Tool responses are JSON strings returned through MCP content blocks. The
-intended stable shape for retrieval responses is:
+Tool responses are JSON strings returned through MCP content blocks. **Every**
+payload — success or error — is wrapped in a stable envelope so clients can
+branch on the contract without sniffing the shape:
 
 ```json
 {
+  "schema_version": 1,
+  "tool": "search_code",
   "index": {
     "exists": true,
     "stale": false,
     "built_at": "2026-05-29T12:00:00Z",
     "files_changed_since_build": 0
   },
   "results": [],
-  "recommended_reads": [],
-  "warnings": []
+  "recommended_reads": []
 }
 ```
 
+- `schema_version` (int) — the payload contract version. Bumped only on a
+  breaking change (field removal or type change); additive fields keep the same
+  version. The current version is **1**.
+- `tool` (string) — the emitting tool name (`search_code`, `find_symbol`,
+  `find_refs`, `impact_of`, `explain_code`, `index_stats`, `healthcheck`).
+- The no-index / error path carries the same envelope plus an `"error"` field.
+
 Rules:
 
-- Additive fields are allowed within a tool output version.
-- Field removal or type changes should be treated as a protocol change.
+- Additive fields are allowed within a `schema_version`.
+- Field removal or type changes bump `schema_version`.
 - Tool descriptions should include examples and expected failure modes.
 - Errors should fail closed: no partial unsafe result when config or index state is unsafe.
 
+Every tool's enveloped output is locked by golden snapshots in
+`tests/golden/mcp_*.json` (regenerate intentionally with
+`UPDATE_GOLDEN=1 pytest tests/test_mcp_golden.py`), and the `schema_version` /
+`tool` values are asserted explicitly so a golden can never silently freeze a
+wrong contract version.
+
 ## Client config templates
 
 ### Claude Desktop
@@ -143,8 +158,12 @@ same trust boundaries:
 - Done: `healthcheck`, `search_code`, `find_symbol`, `find_refs`, `impact_of`, `explain_code`,
   and `index_stats` tools.
 - Done: focused tests for tool registration, missing-index behavior, config resolution, and run entrypoint.
-- Follow-up: explicit schema/version field in every structured tool payload.
-- Follow-up: golden snapshots for every tool output.
+- Done: explicit `schema_version` + `tool` envelope on every structured tool payload (including the
+  error path), asserted by `tests/test_mcp_server.py` and `tests/test_mcp_golden.py`.
+- Done: golden snapshots for every tool output (`tests/golden/mcp_*.json`).
+- Done: unstructured-output registration (`structured_output=False` where supported) so the server
+  loads on `mcp>=1.27` + `pydantic>=2.10`, where auto-detecting a structured schema from the `-> str`
+  return annotation otherwise raises at import time.
 - Follow-up: verified client-specific docs for Claude Desktop, Claude Code, Cursor, VS Code, Zed,
   and Windsurf.
 - Follow-up: paging or progressive result support.
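
Note on the envelope: the contract above (every payload carries `schema_version` and `tool`, and the error path adds `"error"`) can be sketched as a single wrapper. This is a minimal sketch assuming a plain-dict payload; `envelope` and `SCHEMA_VERSION` are hypothetical names, not the server's actual helpers.

```python
import json
from typing import Any, Optional

SCHEMA_VERSION = 1  # current contract version per the text above


def envelope(
    tool: str,
    payload: Optional[dict[str, Any]] = None,
    error: Optional[str] = None,
) -> str:
    """Wrap a tool payload (or an error) in the schema_version/tool envelope."""
    body: dict[str, Any] = {"schema_version": SCHEMA_VERSION, "tool": tool}
    for key, value in (payload or {}).items():
        # Contract keys always win, so a payload cannot overwrite them.
        if key not in body:
            body[key] = value
    if error is not None:
        body["error"] = error
    return json.dumps(body)
```

Routing success and error responses through the same function is what lets a client branch on `schema_version` and `tool` before looking at anything else.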
diff --git a/docs/PRODUCT_UPGRADE_PLAN.md b/docs/PRODUCT_UPGRADE_PLAN.md
@@ -89,7 +89,7 @@ transparent Python implementation, a strict privacy model, and honest benchmarks
 | Weakness | Impact | Plan |
 |---|---|---|
 | No large-scale real-repo benchmark | Can't claim 100k/1M LOC quality | Benchmark tasks §8; recruit public repos |
-| Graph is import/call/ref only | `impact` misses framework wiring | ARCHITECTURE §9 typed-edge roadmap |
+| Graph is import/call/ref only | `impact` misses framework wiring | ARCHITECTURE §9 + design doc `specs/2026-06-14-typed-framework-edges-design.md`; implementation behind §8 benchmark |
 | GitHub-only distribution | No `pip install codebase-index` / `uvx` | Distribution tasks §9 |
 | MCP client docs unverified | Templates may be wrong per client version | Verify against each client, add per-client docs |
 | Single-repo only | No monorepo/fleet context | Out of scope near-term; documented as non-goal |
@@ -101,12 +101,15 @@ transparent Python implementation, a strict privacy model, and honest benchmarks
    logs. Highest credibility lever.
 2. **Typed framework edges** (route→handler→service→model, test→impl, config→consumer)
    with source spans + confidence. Biggest product-quality lever for `impact`.
+   *Design approved this pass* (`specs/2026-06-14-typed-framework-edges-design.md`);
+   implementation gated on the §8 graph benchmark.
 3. **Distribution hardening**: PyPI publish, `uvx`/`pipx` story, signed checksums,
    SBOM. Lowers adoption friction and raises supply-chain trust.
-4. **MCP contract hardening**: `schema_version` on every payload, golden
-   snapshots per tool, verified client docs, paging/progressive results.
-5. **Retrieval tuning**: dampen the god-class `in_degree` tiebreak (the 3 honest
-   misses in the Java run), per-intent weights review.
+4. **MCP contract hardening**: ✅ `schema_version` on every payload + golden
+   snapshots per tool (this pass). Remaining: verified client docs, paging/progressive results.
+5. **Retrieval tuning**: ✅ dampened the god-class `in_degree` tiebreak this pass
+   (log curve + lower cap, validated no-regression on the public suite). Remaining:
+   confirm the real-repo gain on the 3 honest Java misses (needs M12.5), per-intent weights review.
 6. **Language reach**: config/IaC awareness (Dockerfile, Terraform, migrations,
    CI), plus Swift/Dart/Scala/Vue/Svelte gaps called out in FAQ.
 
@@ -119,7 +122,7 @@ transparent Python implementation, a strict privacy model, and honest benchmarks
 - [x] `docs/BENCHMARKS.md` "claims not to make yet" + TODO benchmark checklist.
 - [x] `docs/RELEASE_CHECKLIST.md`.
 - [ ] Verified per-client MCP setup docs (after testing each client version).
-- [ ] A short "trust model in 60 seconds" callout reused across README/SECURITY.
+- [x] A short "trust model in 60 seconds" callout reused across README/SECURITY.
 
 ## 8. Benchmark tasks
 
@@ -150,14 +153,19 @@ Track in [BENCHMARKS.md](BENCHMARKS.md); none may be reported until run with log
 
 | # | Improvement | Impact | Risk | Status |
 |---|---|---|---|---|
-| 1 | Implement `clean` (documented but was a stub) | Fixes doc/reality gap | Low | **Shipped this pass** |
-| 2 | Dampen god-class `in_degree` tiebreak in rerank | +recall on real repos | Medium (retune) | Planned |
-| 3 | `schema_version` on every MCP payload | Stable contract | Low | Partly (architecture claims it) — verify+test |
-| 4 | Golden snapshots for each MCP tool output | Regression safety | Low | Planned |
-| 5 | Typed framework edges in the graph | Better `impact` | High | Roadmap (ARCHITECTURE §9) |
-| 6 | Config/IaC parsers (Dockerfile, Terraform, migrations) | Coverage | Medium | Roadmap |
+| 1 | Implement `clean` (documented but was a stub) | Fixes doc/reality gap | Low | **Shipped (1.3.0 line)** |
+| 2 | Dampen god-class `in_degree` tiebreak in rerank | +recall on real repos | Medium (retune) | **Shipped this pass** — log dampening + lower cap; no-regression on the public suite + a targeted regression test. Real-repo gain still needs M12.5. |
+| 3 | `schema_version` on every MCP payload | Stable contract | Low | **Shipped this pass** — `schema_version` + `tool` envelope on every payload (incl. errors), asserted + golden-locked. |
+| 4 | Golden snapshots for each MCP tool output | Regression safety | Low | **Shipped this pass** — `tests/golden/mcp_*.json` via `tests/test_mcp_golden.py`. |
+| 5 | Typed framework edges in the graph | Better `impact` | High | Design doc shipped this pass (`docs/superpowers/specs/2026-06-14-typed-framework-edges-design.md`); implementation behind the §8 benchmark. |
+| 6 | Config/IaC parsers (Dockerfile, Terraform, migrations) | Coverage | Medium | **Partly shipped this pass** — Tier-C labeling for Dockerfile/Terraform/HCL/INI/Make (already FTS-indexed, now language-labeled); tree-sitter parsing of these still roadmap. |
 | 7 | Paging/progressive MCP results | Big-repo UX | Medium | Roadmap (MCP.md) |
 
+Also fixed this pass (not previously tracked): the MCP server failed to import on
+`mcp>=1.27` + `pydantic>=2.10` (FastMCP auto-built a structured-output schema from
+the `-> str` return annotation and raised `PydanticUserError`). Tools now register as unstructured
+(`structured_output=False` where supported), so the server loads on current `mcp`.
+
 Rule for this repo: small, safe, tested changes land directly; anything that
 risks destabilizing retrieval quality or the security model is documented here
 first and lands behind a benchmark.
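
Note on item 2 (the `in_degree` tiebreak): the difference between a linear bonus that saturates by `in_degree` 10 and a logarithmic bonus with a lower cap can be sketched as follows. This is an illustration only, not the code in `retrieval/rerank.py`; both function names and all constants (`LINEAR_CAP`, `LOG_CAP`, `LOG_SATURATION`) are assumptions.

```python
import math

LINEAR_CAP = 1.0       # hypothetical full bonus under the old linear scheme
LOG_CAP = 0.5          # hypothetical lower cap under the log scheme
LOG_SATURATION = 1000  # hypothetical in_degree at which the log bonus hits its cap


def linear_bonus(in_degree: int) -> float:
    """Old shape: grows linearly and is already saturated at in_degree 10."""
    return min(max(in_degree, 0) / 10.0, 1.0) * LINEAR_CAP


def log_bonus(in_degree: int) -> float:
    """New shape: grows logarithmically toward a lower cap."""
    scaled = math.log1p(max(in_degree, 0)) / math.log1p(LOG_SATURATION)
    return min(scaled, 1.0) * LOG_CAP
```

Under these assumed constants a 10-caller and a 100-caller symbol get the same linear bonus, while the log curve still separates them and keeps both below the old maximum, so the bonus is less likely to outweigh a genuine lexical match on a tie.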
diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md
@@ -1,8 +1,13 @@
 # Roadmap & First Implementation Tasks
 
 Milestones are vertical-ish slices: each ends with something runnable and testable.
-This numbering is canonical — the product-level [ROADMAP.md](../ROADMAP.md) and the
-`(Mx)` tags in [CHANGELOG.md](../CHANGELOG.md) follow it.
+This is the **technical-milestone** view (M0–M10). The product-level
+[ROADMAP.md](../ROADMAP.md) tells the same story at a finer grain and carries it
+further (it splits the MCP server into M11 and adds M11.5/M12/M12.5/M13 for MCP
+hardening, benchmarks, and the typed-edge graph). Where the two disagree on a
+number, the product roadmap is the current product view; this file tracks the
+original implementation slices. The `(Mx)` tags in [CHANGELOG.md](../CHANGELOG.md)
+follow this technical numbering through M10; later tags (M11.5, M12.5, M13) are product-roadmap numbers.
 
 ## M0 — Architecture & scaffold ✅ (this repo)
 - Repo tree, docs (ARCHITECTURE/RETRIEVAL/SCHEMA/SECURITY/INSTALLATION), SKILL.md draft.
@@ -77,11 +82,15 @@ release with the built artifacts (GitHub-only distribution — no PyPI publish).
 "git+https://github.com/denfry/codebase-index.git@v1.2.0"` -> `init` -> `index` -> ask a question is
 verified end-to-end by `scripts/release_smoke.py`.*
 
-## M10 — Optional MCP bridge (planned)
-- Model Context Protocol server exposing `search`, `symbol`, `refs`, `impact` as tools for
-  MCP-compatible clients (Claude Desktop, Cursor, etc.). An optional addition, not a replacement
-  for the Skill/CLI interface.
-- **Exit:** `codebase-index` can be used as an MCP tool by any MCP-compatible client.
+## M10 — MCP bridge ✅ (product roadmap M11)
+- Shipped: a stdio Model Context Protocol server (`codebase-index mcp --root <repo>`, or the
+  `codebase-index-mcp` entry point) exposing `healthcheck`, `search_code`, `find_symbol`,
+  `find_refs`, `impact_of`, `explain_code`, and `index_stats` over the same `service.py` layer the
+  CLI uses — an optional addition, not a replacement for the Skill/CLI interface. Every payload
+  carries a `schema_version` + `tool` envelope, locked by golden snapshots (`tests/golden/mcp_*.json`).
+- **Exit:** `codebase-index` can be used as an MCP tool by any MCP-compatible client. See
+  [MCP.md](MCP.md).
+- Follow-up (product roadmap M11.5): verified per-client setup docs and paging/progressive results.
 
 ---
 

diff --git a/docs/SECURITY.md b/docs/SECURITY.md
@@ -3,6 +3,16 @@
 `codebase-index` is **local-first and offline by default**. Its threat model assumes the indexed
 repository may contain secrets and that a skill must not exfiltrate code or run dangerous commands.
 
+> **Trust model in 60 seconds**
+> 1. **Offline by default** — the base install has zero network dependencies; nothing leaves your machine (§1, §4).
+> 2. **One opt-in exit, triple-gated** — external embeddings require `allow_external` **and** an env API key **and** a printed endpoint warning, or they are refused (§4).
+> 3. **Secrets never get in** — `.env`, keys, certs, and credential files are excluded before parsing (§2).
+> 4. **Secrets never get out** — every snippet is redacted before it reaches the agent (§3).
+> 5. **No telemetry, ever** — no analytics, no phone-home, no usage data.
+> 6. **Verify it yourself** — `codebase-index doctor --strict` audits all of the above and gates CI (§6).
+>
+> The same callout appears in the README so the trust story is identical wherever a reader lands.
+
 ## 1. Principles
 
 1. **Local-first** — index, query, and storage all happen on the user's machine.