feat(topology): recover code duplication per sub-package with per-child engine routing #122
Merged
…ld engine routing (#78)

On an undeclared fan-out, `duplication` ran once over the whole tree, routed by whole-tree engine selection. Recover it per assessment root with the engine re-routed per child: a node package (own package.json + JS/TS-dominant source, read from the one cached scc walk sliced to the subtree) runs jscpd *in* the package (npm); otherwise lizard runs over the inventory file list sliced to the subtree (`inventory_paths_under`, CWD-relative). Records are namespaced via SLUG_NS (`svc/duplication`) and the console is labelled per package.

`inventory_paths_under "."` is the identity (whole inventory, TARGET-relative) and the loop runs once at "." with SLUG_NS empty, reusing the whole-tree engine, so a single package / declared workspace is byte-identical to before (verified end-to-end: all parsed records unchanged on a single-package repo).

This is #78 Phase 2b, the duplication slice. Complexity (the eslint/lizard/scc merge) still runs whole-tree; recovering it per package is the next slice.

Refs #78

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_012oHR4g8pH7Ui242SRycFzw
maudlin added a commit that referenced this pull request on Jun 25, 2026
…ine routing (#78) (#123)

The last whole-tree measurement arm. On an undeclared fan-out, complexity ran once over the whole tree, routed by whole-tree engine selection. Recover it per assessment root with the FULL engine ladder re-routed per child (`route_complexity_child`, a scoped mirror of the whole-tree routing): ESLint on the JS/TS slice using the child's OWN flat config + local bin, lizard on the non-JS slice (inventory sliced to the subtree), scc fallback (keep-set sliced). Findings are namespaced (`backend/complexity`) and labelled per package.

The single git-hotspots CSV is accumulated across packages: truncated once before the loop, appended per arm, always in TARGET-relative (namespaced) paths (scc/lizard/eslint findings re-prefixed; the standalone-lizard CSV's file column namespaced via awk), so the churn × complexity join stays whole-tree. If nothing is measured, the (empty) CSV is dropped, matching the pre-#78 absent state.

Single package / declared workspace → one iteration at "." reusing the whole-tree routing, no cd, SLUG_NS empty, $PWD == $TARGET → byte-identical to before, CSV included (verified end-to-end on the lizard, scc and merged arms + full parsed-set diff). A fan-out routes each child to its own engine and accumulates a namespaced CSV.

With stats (#121), duplication (#122) and now complexity, every measurement arm recovers per sub-package; #78 is complete.

Closes #78

Claude-Session: https://claude.ai/code/session_012oHR4g8pH7Ui242SRycFzw
Co-authored-by: Mark Ridley <210189+maudlin@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
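The CSV accumulation described in this commit message (truncate once, append per arm, namespace the file column via awk) might look roughly like the sketch below. It is an illustration only: the function name `append_namespaced_csv`, the assumption that the file path is the first column, and the assumption of header-less rows without quoted commas are all hypothetical and not taken from the repository.

```shell
# Hypothetical sketch, not the project's code.
# Appends the rows of a per-child CSV to one accumulated CSV, prefixing the
# file column (ASSUMED to be column 1) with the child's namespace.
#   $1 = namespace (empty for a single package), $2 = source CSV, $3 = accumulated CSV
append_namespaced_csv() {
  ns="$1"
  src="$2"
  dest="$3"
  if [ -z "$ns" ]; then
    # Single package: rows pass through unchanged.
    cat "$src" >> "$dest"
  else
    # Fan-out: rewrite column 1 as "<ns>/<path>" so paths stay TARGET-relative.
    awk -F, -v OFS=, -v ns="$ns" '{ $1 = ns "/" $1; print }' "$src" >> "$dest"
  fi
}
```

In a loop over children, the accumulated file would be truncated once up front (`: > "$out"`) and then appended to on each iteration, so an empty namespace reproduces the single-package rows unchanged.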
What
On an undeclared fan-out, `duplication` ran once over the whole tree, routed by whole-tree engine selection. This recovers it per assessment root, with the engine re-routed per child:

- a node package (own `package.json` + JS/TS-dominant source, read from the one cached scc walk sliced to the subtree) runs jscpd in the package (npm, `cwd` = the child);
- otherwise lizard runs over the inventory file list sliced to the subtree (`inventory_paths_under`, CWD-relative).

Records are namespaced via `SLUG_NS` (`svc/duplication`); the console is labelled per package (`📦 Sub-package: svc`). Score accumulates per package, matching the increment-1 cluster precedent.

This is #78 Phase 2b, the duplication slice. Complexity (the eslint/lizard/scc merge) still runs whole-tree; recovering it is the next, larger slice.
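As a rough illustration of the per-child routing and namespacing described above, the decision could be sketched as follows. The helper names `pick_dup_engine` and `record_slug` are hypothetical, and "JS/TS-dominant" is simplified here to "more JS/TS files than other source files", which is an assumption; the real check reads the cached scc walk.

```shell
# Hypothetical sketch, not the project's code.
# Choose a duplication engine for one assessment root.
#   $1 = child directory
#   $2 = number of JS/TS source files in the child (assumed precomputed)
#   $3 = number of other source files in the child (assumed precomputed)
pick_dup_engine() {
  child="$1"
  js_count="$2"
  other_count="$3"
  if [ -f "$child/package.json" ] && [ "$js_count" -gt "$other_count" ]; then
    echo jscpd    # node package: run jscpd inside the package
  else
    echo lizard   # anything else: lizard over the sliced inventory
  fi
}

# Build a record slug; an empty namespace leaves the slug unchanged,
# which is what keeps the single-package case identical to before.
#   $1 = namespace (e.g. "svc", or empty at "."), $2 = slug (e.g. "duplication")
record_slug() {
  if [ -n "$1" ]; then
    echo "$1/$2"
  else
    echo "$2"
  fi
}
```

For example, `record_slug svc duplication` prints `svc/duplication`, while `record_slug "" duplication` prints plain `duplication`.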
Why this is safe
`inventory_paths_under "."` is the identity (whole inventory, TARGET-relative), and the loop runs once at `.` with `SLUG_NS` empty, reusing the whole-tree engine and probes (no `cd`), so a single package / declared workspace is byte-identical to before (the gate).

Verification

- `test/source-inventory.test.sh`: added `inventory_paths_under` cases (`.` identity, prefix-strip CWD-relative, extension filter, identity matches `inventory_paths`, no-match): 40 passed. Full CI suite green locally (12 suites).
- … `package.json` + lockfile) → lizard arm (`svc/duplication`); a node child with no `quality:duplicates` script → jscpd arm, honest skip (`web/duplication`); both labelled.
- … `.` lizard path): the `duplication` record + console block + all 27 parsed records identical between `main` and this branch (only the absolute `CHECKUP_OUT_DIR` path inside one `git-hotspots` message string differs).

Refs #78
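The identity and prefix-strip behaviour that the `inventory_paths_under` test cases exercise can be pictured with a minimal sketch. This is not the real function: `paths_under_sketch` is a hypothetical stand-in that reads TARGET-relative paths from stdin (the real helper presumably reads the cached inventory) and omits the extension filter.

```shell
# Hypothetical sketch, not the project's inventory_paths_under.
# Reads TARGET-relative paths on stdin; prints the ones under subtree $1
# with the "<subtree>/" prefix stripped (i.e. relative to the child).
paths_under_sketch() {
  sub="${1%/}"                      # tolerate a trailing slash
  if [ -z "$sub" ] || [ "$sub" = "." ]; then
    cat                             # "." is the identity: whole inventory, unchanged
    return
  fi
  # Keep only lines that start with "<sub>/" and drop that prefix.
  awk -v p="$sub/" 'index($0, p) == 1 { print substr($0, length(p) + 1) }'
}
```

Matching on `sub/` rather than `sub` avoids treating a sibling such as `svcx/` as part of `svc`, and a subtree with no files simply yields no output.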
🤖 Generated with Claude Code