feat(v2.1): map structural skeleton + search metadata surface#95
feat(v2.1): map structural skeleton + search metadata surface#95PatrickSys merged 4 commits intomasterfrom
Conversation
- Add Key Interfaces section: top types/interfaces/classes by import centrality with signature hints - Add API Surface section: top 5 named exports per entrypoint file - Add Dependency Hotspots section: top 5 files by combined import+importedBy count - Enrich Architecture Layers with hub file + top exports per layer - Add --export flag: writes CODEBASE_MAP.md to project root - Update fixture relationships.json with exports field - Full test coverage for new sections + graceful degradation + --export flag - Gitignore CODEBASE_MAP.md (generated) and repos/ (benchmark fixtures) All data derived from existing index.json + relationships.json — no new I/O.
…y regression search-codebase: compact results now include symbol, symbolKind, scope, signaturePreview; full results include chunk imports/exports/complexity. Surfaces reranker health in searchQualityBlock when unavailable. reranker: add RerankerStatus type + getRerankerStatus() export. Add cache-corruption detection (Protobuf/parse errors trigger cache clear). Fix retry regression: replace initPromise=null reset with initFailed guard so failed loads fast-fail on subsequent calls instead of retrying the expensive model download — restoring test suite stability.
Greptile SummaryThis PR adds a structural skeleton to
Confidence Score: 4/5Safe to merge after fixing the full-mode imports/exports cast; all other changes are well-implemented and tested. One P1 defect: the full-mode chunk imports/exports feature is silently broken because SearchResult objects are never populated with those fields from CodeChunk. All other changes (reranker fix, map skeleton, compact metadata, --export flag) are correct and backed by 439/439 passing tests. src/tools/search-codebase.ts — lines 1134–1135 (dead import/export cast)
|
| Filename | Overview |
|---|---|
| src/core/reranker.ts | Adds initFailed guard that fast-fails after a permanent load failure, preventing repeated 15s timeout retries; logic is correct and handles concurrency via shared initPromise. |
| src/core/codebase-map.ts | Adds deriveKeyInterfaces, deriveApiSurface, deriveHotspots, enrichLayers — logic is sound; minor non-determinism in enrichLayers hub-file tie-breaking when importerCounts are equal. |
| src/tools/search-codebase.ts | Compact metadata fields (symbol, symbolKind, scope, signaturePreview) work correctly; full-mode chunk imports/exports are always undefined because SearchResult objects are never populated with those CodeChunk fields. |
| src/cli-map.ts | Adds --export flag that writes CODEBASE_MAP.md to rootPath, taking precedence over --json and --pretty; straightforward and well-tested. |
| src/types/index.ts | Adds CodebaseMapKeyInterface, CodebaseMapApiSurface, CodebaseMapHotspot types and updates CodebaseMapSummary.architecture — all clean and complete. |
| tests/cli-map-export.test.ts | New test covers --export writing, --export+--json precedence, and --export+--pretty precedence with proper tmp-dir isolation and cleanup. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[buildCodebaseMap] --> B[Read intelligence.json]
A --> C[Read relationships.json]
A --> D[Read index.json / chunks]
B --> E[activePatterns/bestExamples/graphStats]
C --> F[graphImports/graphImportedBy/graphExports]
D --> G[CodeChunk array]
F --> H[enrichLayers hub file + hubExports]
F --> I[deriveApiSurface exports per entrypoint]
F --> J[deriveHotspots top 5 by combined count]
G --> K[deriveKeyInterfaces top 10 by import centrality]
F --> K
H --> L[CodebaseMapSummary]
I --> L
J --> L
K --> L
E --> L
L -->|--export| M[CODEBASE_MAP.md]
L -->|--json| N[stdout JSON]
L -->|default| O[renderMapMarkdown stdout]
subgraph search [search_codebase full mode]
P[SearchResult r] -->|r.metadata.cyclomaticComplexity| Q[complexity works]
P -->|cast as unknown| R[r.imports and r.exports always undefined]
end
Reviews (1): Last reviewed commit: "chore: fix prettier formatting in codeba..." | Re-trigger Greptile
| const chunkImports = (r as unknown as { imports?: string[] }).imports?.slice(0, 5); | ||
| const chunkExports = (r as unknown as { exports?: string[] }).exports?.slice(0, 5); |
There was a problem hiding this comment.
Chunk
imports/exports always undefined in full mode
SearchResult objects are constructed in scoreAndSortResults without copying imports or exports from the source CodeChunk, so these type casts will always evaluate to undefined at runtime. The imports/exports fields silently never appear in full-mode output, making the stated feature a no-op.
To fix, either propagate these fields through SearchResult (add them to the type and populate them in scoreAndSortResults), or read them from r.metadata if that's where they're stored. r.metadata.cyclomaticComplexity already works because it is stored in ChunkMetadata.
There was a problem hiding this comment.
Verified: SearchResult has no top-level imports/exports fields, and scoreAndSortResults never copies them from CodeChunk. The cast is dead, and full-mode output silently drops both keys on every call.
The fix has structural implications beyond a review fixup — SearchResult.relationships.imports already exists as a file-level field, so adding chunk-level imports/exports requires naming decisions, a type audit across rescue/rerank/dedup paths, and proper regression test coverage. Deferring to Phase 15.5 in the v2.1 roadmap rather than patching in-review. Tracking as a confirmed gap; merging with this known limitation.
| return layers.map((layer) => { | ||
| let bestFile: string | undefined; | ||
| let bestCount = 0; | ||
| for (const [file, importers] of Object.entries(graphImportedBy)) { | ||
| if (file.split('/')[0] !== layer.name) continue; | ||
| if (importers.length > bestCount) { | ||
| bestCount = importers.length; | ||
| bestFile = file; | ||
| } | ||
| } | ||
| if (!bestFile) return layer; | ||
| const exps = graphExports[bestFile]; | ||
| const hubExports = exps | ||
| ? exps | ||
| .map((e) => e.name) | ||
| .filter((n) => n && n !== 'default') | ||
| .slice(0, 3) | ||
| : []; | ||
| return { | ||
| ...layer, | ||
| hubFile: bestFile, | ||
| ...(hubExports.length > 0 ? { hubExports } : {}) | ||
| }; | ||
| }); | ||
| } |
There was a problem hiding this comment.
Non-deterministic hub-file tie-breaking in
enrichLayers
When two files in the same layer share the same importerCount, the current > bestCount comparison preserves whichever candidate happens to be visited last (JS object key iteration order). A secondary alphabetical comparison would make the output deterministic across environments and rebuild cycles.
| return layers.map((layer) => { | |
| let bestFile: string | undefined; | |
| let bestCount = 0; | |
| for (const [file, importers] of Object.entries(graphImportedBy)) { | |
| if (file.split('/')[0] !== layer.name) continue; | |
| if (importers.length > bestCount) { | |
| bestCount = importers.length; | |
| bestFile = file; | |
| } | |
| } | |
| if (!bestFile) return layer; | |
| const exps = graphExports[bestFile]; | |
| const hubExports = exps | |
| ? exps | |
| .map((e) => e.name) | |
| .filter((n) => n && n !== 'default') | |
| .slice(0, 3) | |
| : []; | |
| return { | |
| ...layer, | |
| hubFile: bestFile, | |
| ...(hubExports.length > 0 ? { hubExports } : {}) | |
| }; | |
| }); | |
| } | |
| if (importers.length > bestCount || | |
| (importers.length === bestCount && bestFile !== undefined && file < bestFile)) { | |
| bestCount = importers.length; | |
| bestFile = file; | |
| } |
There was a problem hiding this comment.
Good catch. When two files in a layer share equal importerCount, JS object-iteration order determines the winner, which varies across JSON serializers and rebuild cycles. Bundling the alphabetic tie-break fix into Phase 15.5 alongside the P1 SearchResult shape work so both land with consistent test coverage.
Transient load failures (network, timeout, etc.) now reset initPromise=null so the next call can retry. Only Protobuf/parse/corrupt errors are marked permanently failed — those require a cache re-download in a new session. Long-lived MCP servers can now recover from transient load failures without requiring a restart. Addresses grey-area identified during PR #95 Greptile audit.
Summary
codebase-context mapnow outputs Key Interfaces, API Surface, and Dependency Hotspots sections alongside enriched Architecture Layers (hub file + top exports per layer). New--exportflag writesCODEBASE_MAP.mdto the project root. All data wired from existingindex.json+relationships.json— no new I/O.symbol,symbolKind,scope, andsignaturePreview. Full results add chunkimports/exports/complexity. Reranker health is surfaced insearchQualityBlockwhen unavailable.initPromise = nullon load failure was causing repeated slow model-load retries in tests (15s timeout × N calls). Replaced with aninitFailedguard that fast-fails subsequent calls after a permanent load failure.Test plan
npx tsc --noEmit— zero errorsnode dist/index.js map— Key Interfaces, API Surface, Dependency Hotspots sections presentnode dist/index.js map --export— CODEBASE_MAP.md written with all 10 sections