MCP hardening, rerank god-class dampening, config/IaC labeling + roadmap sync#9
Conversation
Roadmap chunk C (large features, landed behind the benchmark gate). - rerank: replace the linear in_degree bonus (saturated by in_degree=10, gave 100-caller "god classes" the full bonus) with a logarithmic curve + lower cap, so centrality stays a tiebreak instead of floating god classes above relevant low-degree matches. Validated as no-regression on the public benchmark (Recall@k/MRR/nDCG unchanged) plus a targeted regression test. - discovery/classify: label Dockerfile/Containerfile, Terraform (.tf/.tfvars), HCL, INI (.ini/.cfg/.conf/.properties) and Makefiles (Tier-C, FTS-only). These were already FTS-indexed as unknown text; labeling surfaces them in stats and lets agents scope searches to config. - docs: typed-framework-edges design spec (M13 documented-first deliverable); LANGUAGES.md Tier-C row updated. - regenerate tests/golden/search_token.json (one score shifted; order unchanged). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… on mcp>=1.27
Roadmap chunk B (MCP hardening).
- wrap every tool payload (success and error) in {schema_version: 1, tool: <name>};
closes the docs/MCP.md follow-ups and makes the ARCHITECTURE.md schema_version
claim true.
- golden snapshots for all 7 tools (tests/golden/mcp_*.json + test_mcp_golden.py);
schema_version/tool asserted explicitly so a golden can't freeze a wrong version.
- fix: MCP server failed to import on mcp>=1.27 + pydantic>=2.10 (FastMCP built a
structured-output schema from the `-> str` return annotation and raised). Register
tools as unstructured (structured_output=False where supported; older mcp detected).
- golden_utils: mask package_version so the healthcheck golden survives version bumps.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Roadmap chunk A (roadmap sync + docs). - docs/ROADMAP.md: M10 MCP bridge marked shipped (was "planned"); reconcile the technical vs product milestone numbering instead of claiming one is canonical. - "Trust model in 60 seconds" callout, identical in README.md and docs/SECURITY.md. - PRODUCT_UPGRADE_PLAN.md: mark shipped items (schema_version, golden snapshots, in_degree dampening, config/IaC labeling, trust-model doc). - CHANGELOG.md: Unreleased entries for chunks A/B/C and the MCP import fix. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 44e03d4a98
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| # Config / IaC (Tier C: line-chunk + FTS, no tree-sitter spec). These were already | ||
| # indexed as unknown-language text; labeling them surfaces infra files in `stats` | ||
| # and lets agents scope searches to config without a tree-sitter grammar. | ||
| ".tf": "terraform", |
There was a problem hiding this comment.
Include line-parser labels in stats
Adding these Tier-C labels does not make them visible in codebase-index stats/MCP index_stats: stats_payload() builds its per-language list only from repo.treesitter_coverage(), whose SQL filters WHERE f.parser = 'treesitter', while parser_for('terraform') and the other new config labels return line. In a repo with .tf/Dockerfile/INI files, the files are counted only in the aggregate and the new languages are omitted from the advertised per-language stats, so the shipped config/IaC labeling is incomplete unless stats also aggregates line-parser languages.
Useful? React with 👍 / 👎.
Roadmap follow-through across three chunks (Distribution intentionally out of scope — needs maintainer accounts/creds). No commit touches
.skill_versionstamps.A — Roadmap sync + docs
docs/ROADMAP.md: M10 (MCP bridge) marked shipped (was planned); reconciled the technical (M0–M10) vs product (M0–M13) milestone numbering instead of claiming one is canonical.README.mdanddocs/SECURITY.md(closesPRODUCT_UPGRADE_PLAN §7).docs/ARCHITECTURE.md §8: theschema_versionclaim is now true (was asserted but unimplemented).PRODUCT_UPGRADE_PLAN.md+CHANGELOG.mdupdated to mark only what actually shipped.B — MCP hardening
{schema_version: 1, tool: <name>}.tests/golden/mcp_*.json+tests/test_mcp_golden.py);schema_version/toolasserted explicitly so a golden can't freeze a wrong contract version.mcp>=1.27+pydantic>=2.10(FastMCP auto-built a structured-output schema from the-> strreturn annotation and raised at import time). Tools now register as unstructured (structured_output=Falsewhere supported; oldermcpdetected and unaffected) — same text-content wire contract.C — Large features (behind the benchmark gate)
in_degreetiebreak — linear bonus (saturated by in_degree 10) → logarithmic curve + lower cap. Validated no-regression on the public benchmark (Recall@k / MRR / nDCG unchanged) plus a targeted regression test that fails under the old linear rule. Real-repo gain still tracked under M12.5..tf/.tfvars), HCL, INI (.ini/.cfg/.conf/.properties), Makefiles now get a real language label (Tier-C, FTS-only). Already indexed as unknown text; labeling surfaces them instatsand lets agents scope to config.docs/superpowers/specs/2026-06-14-typed-framework-edges-design.md— the documented-first deliverable the repo requires before this risky graph work can land (schema, confidence/provenance, resolver architecture, benchmark gate, phasing). Implementation not started.Verification
360 passed, 6 skipped, 1 failed— the single failure is the pre-existinglibtorchcodec/FFmpeg environment issue (fails identically onmain, confirmed viagit stash).ruffclean.🤖 Generated with Claude Code