A deterministic localiser of codebase-health problems: it tells you (or an agent) where the effort should go, before a line of code is read. A single shell entrypoint runs ~20 checks across code, dependencies, security, containers, CI and git history, and produces a machine-readable JSON stream (primary) plus a human-readable markdown report (always), yielding two signals: an overall health read and the biggest problems, ranked.
It is not a deploy gate; that's CI's job. Its value is front-loading, deterministically and up front, the gestalt a smart agent would otherwise spend tokens inferring ("this is Classic ASP", "there are no tests", "this module is hot, complex and bug-prone"). Cheaper, certain, reproducible, pre-token. See ADR-0009.
Four contexts, one job ("here's where the health problems are," never "may I ship?"):
- Prime an AI agent: a deterministic "start here" before the expensive, non-deterministic agent runs.
- Team prioritisation: find the highest-leverage fix for today's pains and tomorrow's failures.
- Tech due diligence: a fast read of a product's code hygiene.
- Periodic safety-net sweep: a coarse-cadence catch of what slipped past CI.
Health is read across four pillars: maintainability (complexity, duplication, coupling, hotspots), safety/maturity (tests, coverage, mutation, docs; absence is a loud signal), currency & technology-viability (dependency rot, EOL runtimes, dead platforms), and correctness (does it build / pass; lowest weight, often unrunnable on a target you don't own).
It is tool-agnostic and portable: every check degrades gracefully when its tool is absent, and the contract documented below lets you swap the language-specific checks for your own stack's equivalents without touching the helpers, the renderer, or the report format. A grade is fine; a gate it is not.
Released under MIT (see LICENSE). Forks, ports, and ports-back welcome.

    # Run against the current project (resolves the enclosing git repo):
    cd /path/to/your-project
    /path/to/checkup/bin/checkup.sh

    # …or scan an explicit target without cd-ing into it:
    CHECKUP_TARGET=/path/to/your-project /path/to/checkup/bin/checkup.sh

Output lands under the scanned project: docs/reports/checkup-report.md
(committable "latest") plus reports/parsed/*.json (machine-readable, one file
per check). Add an alias (alias checkup=/path/to/checkup/bin/checkup.sh) or
symlink bin/checkup.sh onto your PATH. To pin it into a project, vendor the
repo (e.g. as a git submodule) and call bin/checkup.sh from an npm/make task.
The checkup-core image bakes the cross-stack tools (gitleaks, semgrep,
shellcheck, yamllint, hadolint, scc) so you can examine any repository with
nothing installed but Docker, which suits ad-hoc audits and due diligence:

    docker build -t checkup-core .        # one-off, from this repo

    # Scan a project: source mounted READ-ONLY, report written to ./checkup-out
    docker run --rm \
      -v "/path/to/project:/src:ro" \
      -v "$PWD/checkup-out:/out" \
      checkup-core
    # → ./checkup-out/checkup-report.md (+ parsed/*.json, by-file.json)

The source is mounted read-only: checkup writes nothing into it; everything
goes to /out. Mount a full clone (not a shallow/exported tree) so the
git-forensics checks have history.
For sensitive or due-diligence scans, run it sealed (--network none + a
minimal sandbox) so a compromised tool can't exfiltrate the code; see
SECURITY.md.
What runs in checkup-core: the cross-stack security, hygiene and forensics
checks (secrets, SAST, shell/YAML/Dockerfile lint, stats, churn × complexity).
Language- and build-specific checks (typecheck, test, build, coverage) belong
to per-stack images; see ROADMAP.md. On a repo without the Node
toolchain they skip honestly (they don't fail or false-pass); read the
cross-stack sections for the core signal.
The checkup-dotnet image is FROM checkup-core plus the .NET SDK, Microsoft
DevSkim and PMD CPD. It runs every core check, then adds four .NET / legacy-ASP
passes:
- asp-classic: semgrep ruleset for Classic ASP/VBScript.
- devskim: source SAST, no build, reaches .NET Framework source.
- dotnet-vuln: dotnet list package --vulnerable; skips honestly on legacy packages.config.
- duplication: PMD CPD, language-aware copy-paste detection for C# and other CPD languages (Classic ASP has no CPD tokeniser).

New findings flow into the same report automatically (the renderer is
tool-agnostic).

    docker build -t checkup-core .                           # base first
    docker build -f Dockerfile.dotnet -t checkup-dotnet .    # overlay
    docker run --rm -v "/path/to/app:/src:ro" -v "$PWD/out:/out" checkup-dotnet

The report location is controlled by CHECKUP_OUT_DIR (set to /out in
the image): set it in any context to write outputs outside the scanned tree.
Unset, checkup keeps the committed docs/reports/checkup-report.md convention.
checkup's first-class use is front-loading an AI coding agent: run it, then
hand the result to the agent as a briefing so it starts in the right place,
the right way, before it spends a token reading code. The agent-first artefact
is reports/checkup.json (a single versioned bundle; see architecture).
A prompt that turns the report into safe, prioritised action:

    A checkup health report exists for this codebase. Start with reports/checkup.json
    (the bundled signal) before reading source:
      • overall: the headline health read
      • headlineAlarms: the loudest whole-codebase risks
      • pillars: health by axis (maintainability / safety / currency / correctness) + security
      • focusTop: the highest-risk files (hot × complex × bug-prone)

    Use it to decide where and how to start:
      1. Headline alarms first, explicitly. A leaked secret → rotate & purge before
         anything else. A dead/declining platform → flag and discuss; don't sink
         refactor effort into a rewrite candidate. No test safety net → write
         characterisation tests before you change behaviour.
      2. Let the safety/maturity pillar set your method. If tests are absent or weak,
         work in small verifiable steps and add coverage as you go; don't refactor blind.
      3. Take the highest-leverage item from focusTop, open those files to confirm,
         and propose a short plan before changing anything.
      4. Treat skipped / "no data" checks as "not assessed", not "fine"; state what
         you couldn't determine.

    checkup tells you WHERE and HOW SAFELY to start; you read the code to decide WHAT to do.

This is a starting template; tailor it to your agent and stack. The same
bundle drives non-agentic uses too (a human reads checkup-report.md; CI/trend
consumers read the JSON). checkup is a localiser and a briefing, not a gate
(ADR-0009).
For a CTO-level read (what is this, what's the real risk posture, what's the dominant unknown, what would de-risk it), see the executive-summary recipe: a portable prompt that synthesises a run into a one-page brief. Note its privacy trade-off: feeding source to a hosted LLM breaks the deterministic scan's deny-egress property (ADR-0008); the recipe flags it and offers a report-only mode.

| Doc | What |
|---|---|
| docs/architecture.md | How it works: contract, schema, layering |
| docs/build-your-own.md | Run on a host, slim images, extract tools |
| docs/tools.md | Bundled tools, versions, verification |
| docs/decisions/ | ADRs: why it's built this way |
| ROADMAP.md + Issues | What's next (milestone v0.2.0) |
| AGENTS.md · CONTRIBUTING.md | Agent guidance · engagement model |

| Script | Purpose |
|---|---|
| bin/checkup.sh | Orchestrator. Sources lib/run-tool.sh, runs every check, emits the normalised stream. |
| bin/checkup-dotnet.sh | .NET / legacy-ASP overlay. Runs core, then appends asp-classic + devskim + dotnet-vuln + duplication. |
| bin/checkup-report.sh | Tool-agnostic markdown renderer. Reads reports/parsed/*.json and writes the report. |
| lib/run-tool.sh | Shared helpers (run_tool, write_parsed, write_skipped, write_failed, is_valid_json, slug). |

    ./bin/checkup.sh        # runs all checks, then renders the report

The substrate cannot run without these. Most are already on any modern dev box; listed for forker completeness.

| Tool | Why |
|---|---|
| bash (4+) | Orchestrator + helpers |
| jq | All parsed-JSON emission and the cross-tool renderer |
| git | Required by git-hotspots + the git-smells trio |
| node / npm | Every npm-script-driven check |
| POSIX find, grep, sort, awk, sed | Helpers in lib/run-tool.sh and section parsers |

Tier 2: per-check graceful-degrade (each check skips with a documented reason if its tool is absent)

Every section in checkup.sh follows the contract: if its tool is missing
(LAST_EXIT == 127), the section emits a skip parsed JSON with a human
reason. No check is mandatory; missing tools never block the run.

| Tool | Used by | Install |
|---|---|---|
| shellcheck | shellcheck section | apt install shellcheck / brew install shellcheck / static binary on GitHub releases |
| yamllint | yamllint section | pipx install yamllint (recommended) / apt install yamllint |
| hadolint | hadolint section | brew install hadolint / Linux static binary on GitHub releases (arch-mapped: x86_64/arm64) |
| gitleaks | gitleaks section | brew install gitleaks / Linux static binary on GitHub releases (arch-mapped: x64/arm64) |
| scc | codebase-stats section | brew install scc / Linux static binary on GitHub releases |
| madge, jscpd, knip, semgrep, stryker | various npm-script-driven sections | npm install (devDependencies; provided by the host project) |

The substrate is config-driven for the linters that can be tuned:
- .shellcheckrc: disabled rules
- .yamllint.yml: line-length, truthy keywords, comment style
- .hadolint.yaml: ignored rules
- .gitleaks.toml: allowlist (paths, regexes, stopwords)

Without these, each tool runs on defaults; the substrate doesn't depend on the config files existing.
The substrate is more language-agnostic than the npm-script defaults suggest. The following all work unchanged on any stack:
- Complexity is auto-routed by the detected stack (no manual swap): ESLint (complexity + sonarjs/cognitive-complexity, AST-aware via typescript-eslint) on the JS/TS slice of any repo with a resolvable ESLint config, whether node-dominant or a Python/Java/Go-dominant polyglot that also carries a JS/TS slice (#73); lizard (true per-function CCN across C, C++, Java, JS, Python, Ruby, Rust, Go, Swift, Kotlin, Lua, Scala, PHP, Objective-C, …) on every other stack, and on the non-JS slice of a polyglot repo, merged with the ESLint slice into one record; scc's heuristic as the universal fallback (e.g. Classic ASP). All emit the same per-function CSV the git-hotspots join consumes, so churn × complexity works on any stack. ESLint is preferred for TS because lizard's state-machine TS parser mis-attributes class-method CCN; when ESLint can't run, that slice is reported as unmeasured rather than silently downgraded.
- Stats: scc covers ~150 languages.
- Security: gitleaks is content-based (not language-aware); semgrep has community rulesets for most major languages.
- Config-lint: yamllint, hadolint are language-neutral.
- Git-axis: git-hotspots, change-coupling, bug-fix-density, branch-hygiene are pure git; identical on every stack.
- Contract, helpers, renderer: language-agnostic by design.

The language-specific work is concentrated in the ten npm-script-driven sections. Swap those for your build system's equivalents and the rest of the substrate works as-is.

| Section | TS / Node (default) | Java / Kotlin | Python | Go | Rust | C# / .NET |
|---|---|---|---|---|---|---|
| build | npm run build | gradle build / mvn package | pip install -e . | go build ./... | cargo build | dotnet build |
| typecheck | tsc --noEmit | (compile-time) | mypy / pyright | (compile-time) | (compile-time) | (compile-time) |
| test | vitest / jest | JUnit (gradle / maven) | pytest | go test ./... | cargo test | dotnet test |
| lint | ESLint | SpotBugs + Checkstyle + PMD | ruff | golangci-lint | clippy | built-in analyzers |
| format:check | prettier | google-java-format / ktlint | ruff format / black | gofmt -l | cargo fmt --check | dotnet format |
| coverage | vitest --coverage | JaCoCo | coverage.py | go test -cover (native) | tarpaulin / llvm-cov | coverlet |
| unused | knip | (reflection-limited) | vulture / ruff F841 | go vet / deadcode | cargo-udeps | R# CLI / IDE |
| duplication | jscpd | jscpd | jscpd | dupl / jscpd | jscpd | jscpd |
| security:audit | npm audit | OWASP dep-check / Snyk | pip-audit / safety | govulncheck | cargo-audit | dotnet list package --vulnerable |
| mutation | Stryker | PIT (Pitest) | mutmut / cosmic-ray | go-mutesting | cargo-mutants | Stryker.NET |
| circular-deps | madge | jdeps (built-in) | pydeps | go list / staticcheck | cargo-modules | NDepend (commercial) |

The mapping is approximate: many of these tools cover different surface area
than their JS-ecosystem equivalents (e.g. golangci-lint wraps ~10 linters;
clippy is more conservative than ESLint by default). The point is that the
shape is portable: parse the tool's output into the standard top[]
finding shape and the rest of the substrate carries it through unchanged.

| Variable | Used by | Default | Purpose |
|---|---|---|---|
| CHECKUP_TARGET | path resolution (checkup.sh + renderer) | enclosing git repo, else $PWD | Explicit project root to scan, instead of auto-detecting from the git top level. For one service in a monorepo, see Scanning a monorepo subdirectory. |
| CHECKUP_MODE | closing verdict (checkup.sh + renderer) | tailored | tailored (a repo you own & tune): verdict framed for your own codebase ("where to focus next"); a low score exits non-zero as a quality signal you may act on, not a deploy gate. audit (a repo you don't own / due diligence): informational only, framed as "where to invest", always exits 0. checkup never gates (ADR-0009). |
| CHECKUP_SRC_ROOTS | complexity + git-axis sections | whole tree (VCS-tracked source) | NARROWS the complexity + git-forensics scan to specific space-separated roots (e.g. app cmd). By default checkup assesses all VCS-tracked source (honest coverage); set this only to focus the scan or speed up a very large monorepo. |
| CHECKUP_FORENSIC_SINCE | git-axis sections | 6.months.ago | git log --since window for hotspots / change-coupling / bug-fix-density. Widen (e.g. 2.years.ago) for repos with sparse recent history; an empty window degrades to skip, never a false pass. |
| CHECKUP_EXCLUDE | source inventory (all scanners + scc/identity/stats) | unset | Extra space-separated fnmatch globs excluded from the whole inventory (complexity, duplication and the scc-based stats/identity, #109) on top of the built-in generated/vendored defaults (node_modules, migrations, snapshots, *.min.*, …). Also settable as a top-level exclude: list in .checkup.yml. |
| CHECKUP_EXCLUDE_GENERATED | source inventory (generated-marker pass) | on (set 0 to disable) | Drop files carrying a banner-shaped generated marker (Go // Code generated … DO NOT EDIT., comment-leader @generated, C# <auto-generated>). Default-on since v0.2.0; the dropped set is enumerated to raw/…generated and announced by a loud banner. CHECKUP_EXCLUDE_GENERATED=0 keeps the whole tree. |
| CHECKUP_CONCENTRATION_PCT | coverage banner (single-dir concentration) | 25 | Threshold (% of all first-party code) at which one directory is flagged as a possible vendored tree; the banner names the directory and prints the exact CHECKUP_EXCLUDE snippet. Advisory only (never auto-excluded). |
| CHECKUP_SHELL_DIRS | shellcheck section | scripts .husky .githooks .claude/hooks | Space-separated dirs to search for shell scripts. Missing dirs are skipped silently. |
| HADOLINT_DOCKERFILE | hadolint section | auto-detect Dockerfile* at root | Override the Dockerfile filename when it is named non-conventionally. |
| MUTATION_TEST | mutation section | unset (skipped) | Set to 1 to enable Stryker; opt-in because mutation testing is slow (~2 min). |
| RAW_DIR | every section (via run_tool) | reports/raw | Where each section's stdout/stderr capture is written. |
| PARSED_DIR | every section (via run_tool) | reports/parsed | Where each section's normalised JSON is written. |

Forks adding new env-overridable knobs should follow the same <TOOL>_<NOUN>
naming convention and document them here in one table.
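
As a rough illustration of how the defaults in the table could be honoured, a section might resolve its knobs with plain parameter expansion. This is a sketch rather than checkup's actual code: the `resolve_knobs` function and the `LIZARD_MAX_CCN` knob (with its default of 10) are hypothetical, included only to show the `<TOOL>_<NOUN>` pattern; the other three defaults are the ones listed in the table.

```shell
#!/usr/bin/env bash
# Sketch: resolve env-overridable knobs, falling back to the documented defaults.
# resolve_knobs and LIZARD_MAX_CCN are hypothetical names used for illustration.
resolve_knobs() {
  RAW_DIR="${RAW_DIR:-reports/raw}"
  PARSED_DIR="${PARSED_DIR:-reports/parsed}"
  CHECKUP_FORENSIC_SINCE="${CHECKUP_FORENSIC_SINCE:-6.months.ago}"
  # A fork's own knob, named <TOOL>_<NOUN>; the default of 10 is an assumption.
  LIZARD_MAX_CCN="${LIZARD_MAX_CCN:-10}"
}

resolve_knobs
echo "raw=$RAW_DIR parsed=$PARSED_DIR since=$CHECKUP_FORENSIC_SINCE ccn=$LIZARD_MAX_CCN"
```

Because `${VAR:-default}` treats an empty value like an unset one, exporting an empty variable also falls back to the default in this sketch.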
Point CHECKUP_TARGET at one service inside a larger repo and scope the source
roots to it:

    CHECKUP_TARGET=/path/to/monorepo/services/api \
    CHECKUP_SRC_ROOTS="src" \
    CHECKUP_OUT_DIR=/tmp/checkup-out \
      bin/checkup.sh

Caveats:

- Churn / coupling / bug-fix density scope correctly: git pathspecs are cwd-relative, so the git-forensics scans see only the subtree.
- Paths are target-relative: the file-based scanners and git-forensics share one namespace (e.g. src/app.ts, not services/api/src/app.ts), so the by-file hotspot aggregate joins correctly.
- branch-hygiene is repo-wide: branches can't be scoped to a subtree, so its counts cover the whole monorepo, not just the service.
- One stack per run: a monorepo mixing stacks wants one run per service with the matching overlay; there's no built-in cross-service roll-up.
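
To make the first two caveats concrete, here is a minimal sketch of subtree-scoped churn counting with plain git. It is not checkup's implementation: `churn_by_file` is a hypothetical helper, and using `--relative` to get target-relative paths is an assumption about how such a scan could be written.

```shell
#!/usr/bin/env bash
# Sketch: count commits per file inside one subtree, with paths relative to it.
# churn_by_file is a hypothetical helper, not part of checkup.
churn_by_file() {
  local target="$1" since="${2:-6.months.ago}"
  # -C runs git inside the subtree; "-- ." limits history to that directory,
  # and --relative prints paths relative to it (src/app.ts, not services/api/src/app.ts).
  git -C "$target" log --since="$since" --relative --name-only --pretty=format: -- . \
    | grep -v '^$' | sort | uniq -c | sort -rn
}

# Usage (prints "<commit count> <path>" lines, busiest file first):
#   churn_by_file /path/to/monorepo/services/api 2.years.ago
```

Files outside the target directory never appear in the output, which is why the forensics counts stay scoped to the service.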
These commands are the default Node profile (profiles/node.sh). Each is
overridable per check, via a .checkup.yml commands: block or a
CHECKUP_CMD_<NAME> environment variable, so adapting checkup to another stack
is "set the commands", not "fork the orchestrator" (see Overrides). On a Node
repo with no overrides the defaults below apply unchanged.
Every npm run <script> the orchestrator invokes must satisfy a small
contract. If the host project's package.json deviates, the corresponding
section may break. This table is the surface area a forker has to wire up.

| npm script | Section | Required behaviour |
|---|---|---|
| typecheck | typecheck | Exit 0 if zero errors. Stderr/stdout should list TS errors in path:line:col form (consumed by the section parser). |
| test | unit-tests | Exit 0 if all tests pass. Vitest summary on stdout (the parser strips ANSI then regex-matches the summary line). |
| format:check | code-quality | Exit 0 if all files formatted. Non-zero with the list of unformatted files when drift exists. |
| lint | code-quality | Run ESLint with the default text formatter; the section parser anchors on the ✖ N problems summary line. |
| build | build | Exit 0 if production build succeeds. Output captured to reports/raw/production-build.txt. |
| quality:security | semgrep | Run Semgrep with the project ruleset. Write reports/semgrep-report.json (Semgrep's native JSON). |
| quality:deps | circular-deps | Run madge --circular --json and write to reports/madge-circular.json (the section reads from that path). |
| quality:duplicates | duplication | Run jscpd and write its report to reports/jscpd/jscpd-report.json. |
| quality:unused | unused-code | Run knip and emit findings on stdout (default text format). |
| test:coverage:report | coverage | Run vitest with coverage; coverage tooling must write coverage/coverage-summary.json (the section reads from there). |

Direct (non-npm) invocations (no npm run indirection, but listed for completeness):

| Command | Section | Notes |
|---|---|---|
| npm audit --json | npm-audit | Native npm audit JSON |
| npm outdated --json | deps-freshness | Native npm outdated JSON |
| npx eslint --rule ... --format json | complexity | ESLint JSON: parsed into per-function {file, line, code: "CCN-N"/"COG-N", severity, message}. Cyclomatic findings also appended to reports/complexity-full.csv in lizard-compatible columns for the git-hotspots section. On a node-dominant polyglot repo, lizard additionally measures the non-JS slice (Python/C#/Go/…) and the two are merged into one record + CSV, partitioned by extension (#68). |
| scc … --no-cocomo | codebase-stats | Default text output, parsed by the section |
| shellcheck -f json -x … | shellcheck | Native JSON |
| yamllint -f parsable … | yamllint | path:line:col: [level] message (rule) format |
| hadolint --no-color … | hadolint | Text output with Dockerfile:line code: message lines |
| gitleaks dir --report-format=json | gitleaks | Native JSON |
| npx stryker run | mutation | Stryker writes its own HTML/JSON reports under reports/mutation/ |

If a forker doesn't have one of these wired up, the section either skips (tool
not present on PATH) or writes fail with a parse-error reason; it never breaks
the whole orchestrator.
checkup auto-detects your stack (detection.json) and
picks the default commands above. A repo-local .checkup.yml at the scan
target overrides those deliberately, with no install step: this is the
agent-tailoring seam. It is consulted first; an absent file changes nothing
(broad, unconfigured runs are the norm for an audit of a repo you don't own).
Copy .checkup.yml.example and prune. Keys (all optional):

    stack:
      force: dotnet             # treat this as the primary stack (overrides detection)
      suppress: [node]          # treat a detected stack as absent (e.g. a tooling-only package.json)
    checks:
      disable: [mutation]       # skip a project-built check (reports as skip: "disabled in .checkup.yml")
      enable: [mutation]        # opt in to an off-by-default check
    commands:
      test: "dotnet test"       # override a check's command ("" disables it); or set CHECKUP_CMD_TEST
    thresholds:                 # tune warn/fail banding (status only, never the score)
      complexity_ccn_warn: 10   # report functions at/above this CCN (ESLint + lizard engines)
      complexity_ccn_fail: 30   # any function at/above this → the complexity record fails
      duplication_warn_pct: 3   # duplication % at/above this → warn; …_fail_pct → fail

It's a small YAML subset (inline lists [a, b], # comments); yq is used if
present but is not required. Unknown keys or malformed input are warned about and
ignored: never fatal, never a false pass.
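
The environment-variable half of the override seam can be sketched as follows. This is an assumption-laden illustration, not the orchestrator's code: `resolve_cmd` is a hypothetical helper, the rule for turning a check name into the `CHECKUP_CMD_<NAME>` suffix (uppercase, non-alphanumerics to `_`) is a guess beyond the documented `CHECKUP_CMD_TEST`, and the precedence between the environment variable and `.checkup.yml` is not modelled.

```shell
#!/usr/bin/env bash
# Sketch: pick a check's command from CHECKUP_CMD_<NAME>, else a profile default.
# resolve_cmd and the name-mangling rule are hypothetical.
resolve_cmd() {
  local name="$1" default="$2" var
  # "format:check" -> CHECKUP_CMD_FORMAT_CHECK (assumed mapping)
  var="CHECKUP_CMD_$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]' | tr -c 'A-Z0-9' '_')"
  if [ -n "${!var+set}" ]; then
    # Variable is set, possibly to "": an empty command means "disabled".
    printf '%s' "${!var}"
  else
    printf '%s' "$default"
  fi
}

CHECKUP_CMD_TEST="dotnet test"
echo "test  -> $(resolve_cmd test 'npm run test')"
echo "build -> $(resolve_cmd build 'npm run build')"
```

Distinguishing "set but empty" from "unset" is what lets an empty string disable a check while an absent variable leaves the default in place.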
Every check in checkup.sh MUST satisfy four properties. These are the
invariants that make the script modular, reference-quality, LLM-consumable, and
graceful-degrading. They are checked by code review, not by tooling.
Adding a check is one section in checkup.sh. No edits to the markdown
writer are needed; it discovers new parsed JSONs automatically. The skeleton:

    # N. <Human Label> - <ticket>
    # section:    <slug>
    # purpose:    <one sentence: what is this measuring?>
    # pass_means: <what good looks like, and WHY this threshold>
    # fail_means: <what bad looks like, and the suggested response>
    print_section "<Human Label>"
    echo "Command: <whatever the user would run by hand>"
    echo ""

    <SLUG>_INTENT=$(jq -n '{
      purpose: "...",
      pass_means: "...",
      fail_means: "..."
    }')

    MAX_SCORE=$((MAX_SCORE + <points>))   # keep the 168-pt trend score
    run_tool "<Human Label>" <tool> [args…]

    if [ "$LAST_EXIT" = "127" ]; then
      write_skipped "<slug>" "<tool> not installed (install: …)"
    elif [ ! -s "$LAST_RAW" ]; then
      write_skipped "<slug>" "<tool> produced no output"
    else
      # …derive status, count, summary, top[] from $LAST_RAW…
      <SLUG>_TOP=$(jq -c '...' "$LAST_RAW")   # see severity vocabulary below
      if [ "$<count>" -eq 0 ]; then
        echo -e "${GREEN}✅ Passed${NC}"
        HEALTH_SCORE=$((HEALTH_SCORE + <points>))
        STATUS="pass"; SUMMARY="..."
      elif # …warn case…
        STATUS="warn"; …
      else
        STATUS="fail"; …
      fi
      write_parsed "<slug>" "$STATUS" "$COUNT" "$SUMMARY" "$<SLUG>_TOP" "$<SLUG>_INTENT"
    fi
    echo ""

In practice, sections are inline (not wrapped in check_<slug>() functions);
extracting to functions is a worthwhile future cleanup but not required by
the contract. The comment block at the top, the intent heredoc, the run_tool
call, and the write_parsed/write_skipped are the load-bearing parts.
Every check declares intent: {purpose, pass_means, fail_means} so that anyone
(human or LLM) reading the parsed JSON can understand what the check is for
without reading the source.
The comment block at the top of the check function is the source of truth. The
JSON intent field is a copy for downstream consumers. The markdown report
renders the intent under each check.
Each check writes reports/parsed/<slug>.json:
| status | meaning | summary headline |
|---|---|---|
| pass | check ran and the codebase met the threshold | β |
| warn | check ran, threshold breached but non-blocking | |
| fail | check ran, threshold breached and treated as serious | β |
| skip | check did not run (tool missing, prereq absent) | βοΈ |
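
Because every check writes the same record shape, a consumer can triage a run without knowing which tools produced it. The following is a minimal sketch assuming `jq` is installed and that each record carries the `slug`, `status`, `count` and `summary` fields described in this document; `list_attention` is a hypothetical helper, and the two demo records are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list every check that did not pass (warn / fail / skip) as TSV.
# list_attention is a hypothetical helper; $1 is the parsed-records directory.
list_attention() {
  jq -r 'select(.status != "pass")
         | [.status, .slug, (.count | tostring), .summary] | @tsv' "$1"/*.json
}

# Demo with two made-up records (skipped quietly when jq is absent).
if command -v jq >/dev/null 2>&1; then
  demo_dir=$(mktemp -d)
  printf '%s\n' '{"slug":"complexity","status":"warn","count":77,"summary":"77 hotspots"}' \
    > "$demo_dir/complexity.json"
  printf '%s\n' '{"slug":"build","status":"pass","count":0,"summary":"ok"}' \
    > "$demo_dir/build.json"
  list_attention "$demo_dir"
fi
```

Keeping skip in the output matches the document's stance that an unassessed check is a gap to report, not a pass.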
Ordered by triage weight:
| severity | weight | use for |
|---|---|---|
| critical | 0 | security CVEs, secrets, RLS gaps |
| error | 0 | hard failures (typecheck errors, test failures) |
| high | 0 | high-severity findings (semgrep ERROR, CVEs) |
| warning | 1 | non-blocking but actionable |
| medium | 1 | mid-severity findings |
| low | 2 | minor issues |
| style | 2 | formatting, style |
| info | 3 | informational |
The "Top Problems" markdown aggregate uses the weight column to sort across tools.
Every check writes a parsed JSON, even when skipped. The report shows "what was supposed to run vs. what did". The convention:

    if [ "$LAST_EXIT" = "127" ]; then
      write_skipped "$slug" "tool-name not installed (install: …)"
      return
    fi

A missing tool is never a script failure. The orchestrator continues; the final report shows the gap explicitly.
run_tool runs <cmd>, captures stdout to $RAW_DIR/<slug>.txt and stderr to
$RAW_DIR/<slug>.stderr.txt (deleted if empty), and sets these globals:

| global | meaning |
|---|---|
| LAST_LABEL | the human label passed in |
| LAST_SLUG | computed slug, used for the parsed filename |
| LAST_RAW | absolute path to the stdout capture |
| LAST_STDERR | absolute path to the stderr capture (or empty) |
| LAST_EXIT | exit code of the tool; 127 if not on PATH |

It always returns 0 so that set -e callers do not abort on expected non-zero
exits (e.g. lint with warnings, typecheck with errors). The tool's real exit
code is in $LAST_EXIT; 127 means the tool is not on $PATH.
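
The capture-and-never-abort behaviour described above can be sketched in a few lines of bash. This is an illustration under assumptions, not the contents of lib/run-tool.sh: `run_tool_sketch` is a hypothetical name, the slug derivation is inlined, and writing an empty stdout capture for a missing tool is a choice made here for the demo.

```shell
#!/usr/bin/env bash
# Sketch of a run_tool-style wrapper (hypothetical; not lib/run-tool.sh itself).
set -e
RAW_DIR="$(mktemp -d)"   # absolute path, so LAST_RAW is absolute too

run_tool_sketch() {
  LAST_LABEL="$1"; shift
  LAST_SLUG=$(printf '%s' "$LAST_LABEL" | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')
  LAST_RAW="$RAW_DIR/$LAST_SLUG.txt"
  LAST_STDERR="$RAW_DIR/$LAST_SLUG.stderr.txt"
  if ! command -v "$1" >/dev/null 2>&1; then
    : > "$LAST_RAW"; LAST_STDERR=""; LAST_EXIT=127   # tool not on PATH
    return 0
  fi
  LAST_EXIT=0
  "$@" >"$LAST_RAW" 2>"$LAST_STDERR" || LAST_EXIT=$?  # keep the real exit code
  if [ ! -s "$LAST_STDERR" ]; then rm -f "$LAST_STDERR"; LAST_STDERR=""; fi
  return 0   # never abort a set -e caller
}

run_tool_sketch "Demo Lint" sh -c 'echo "1 finding"; exit 3'
echo "exit=$LAST_EXIT raw=$(cat "$LAST_RAW")"     # exit=3 raw=1 finding
run_tool_sketch "Missing Tool" checkup-no-such-tool-xyz
echo "exit=$LAST_EXIT"                            # exit=127
```

The script runs under `set -e` on purpose: both calls complete even though one tool exits 3 and the other does not exist, which is the property the contract relies on.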
write_parsed emits the parsed JSON. top-json and intent-json are passed
verbatim into jq --argjson so they must be valid JSON; defaults are [] and {}.
write_skipped is sugar for write_parsed <slug> skip 0 "<reason>" [] <intent>:
always-write discipline for the "deliberately not run" case (tool not
installed, opt-in not set, prereq genuinely absent). Pass the check's intent
heredoc so the report still explains why this check matters even when it
didn't run; a reader can then decide whether to enable it.
write_failed is for the "tool ran but we can't interpret the output" case:
missing/malformed report file, or non-zero exit without a parseable
diagnostic. Status is fail with empty top[]. Use this in preference to
write_skipped when the tool actually executed; the distinction matters because
skipped implies "deliberately bypassed" whereas this case is "we tried and got
nothing usable, can't claim safety". The wrong choice silently downgrades
a real failure.
is_valid_json returns 0 if the file exists, is non-empty, and parses as JSON.
Use it as a guard before jq against tool output you didn't construct yourself.
slug is the lowercase kebab-case derivation: a stable identifier for filenames
and the slug field. slug "Code Quality (Formatting + Linting)" → code-quality-formatting-linting.
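
One way to get that kebab-case derivation is a lowercase pass followed by collapsing every non-alphanumeric run into a single hyphen. The sketch below is an assumption about the approach, not the helper's real source; `slug_sketch` is a hypothetical name chosen to avoid implying it matches lib/run-tool.sh line for line.

```shell
#!/usr/bin/env bash
# Sketch: derive a lowercase kebab-case slug from a human label.
# slug_sketch is a hypothetical stand-in for the slug helper.
slug_sketch() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

slug_sketch "Code Quality (Formatting + Linting)"; echo
# code-quality-formatting-linting
```

Trimming leading and trailing hyphens keeps labels that end in punctuation, such as a closing parenthesis, from producing filenames like `code-quality-.json`.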

    reports/                              # gitignored (whole tree)
    ├── raw/                              # one file per check: tool stdout/stderr
    │   ├── complexity-hotspots.txt       # ESLint JSON output (one finding per line)
    │   ├── eslint.txt
    │   ├── eslint.stderr.txt             # only present when stderr was non-empty
    │   └── …
    ├── parsed/                           # one file per check: normalised JSON
    │   ├── complexity.json
    │   ├── eslint.json
    │   └── …
    ├── focus.json                        # derived "where to focus" ranking: files
    │                                     # ranked by how many health axes they land
    │                                     # on (hotspot / coupling / bug-fix density /
    │                                     # complexity), each with a per-axis "why".
    ├── by-file.json                      # derived cross-cut: files ranked by
    │                                     # severity-weighted finding count across
    │                                     # every check. Combined with the
    │                                     # `git-hotspots` check this completes
    │                                     # the Tornhill bug-hotspot triangle.
    ├── checkup-report-<utc-ts>.md        # timestamped history (kept for trend)
    ├── checkup-summary.json              # score + max for the trend headline
    └── complexity-full.csv               # complexity findings in lizard-CSV format
                                          # (consumed by git-hotspots; written by the
                                          # complexity section)

    docs/reports/                         # committed
    └── checkup-report.md                 # always = "latest", overwritten on each run

The markdown writer computes three cross-tool views automatically: no check contributes to them directly; they emerge from the standardised parsed JSON.
The report's headline "where should this team focus first?" view. Fuses the
four per-file health axes by file: git-hotspots (churn × complexity),
change-coupling, bug-fix-density, and complexity. A file landing on several
axes (hot × complex and coupled and bug-dense) rises to the top. Ranking is
axis-count first (multi-signal concentration is the point), then a
severity-weighted focus score; each row carries a one-line why (the strongest
message per axis). Renderer-only, so it works on any stack whose run produced
those checks, and is simply empty when none did. It is a focus signal, never a
gate. The full ranking goes to reports/focus.json; the report shows the top 10.
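
The two-key ordering (axis count first, focus score second) can be sketched with `sort` alone. The input layout here, a TSV of file, axis count and focus score, is an assumption made for the example, as are the `rank_focus` name and the sample numbers; the real ranking is computed by the renderer.

```shell
#!/usr/bin/env bash
# Sketch: rank files by axis count (desc), then focus score (desc), top 10.
# rank_focus and the TSV layout (file, axis_count, focus_score) are hypothetical.
rank_focus() {
  sort -t "$(printf '\t')" -k2,2nr -k3,3nr | head -n 10
}

printf 'src/a.ts\t1\t9\nsrc/b.ts\t3\t4\nsrc/c.ts\t3\t7\n' | rank_focus
# src/c.ts  3  7
# src/b.ts  3  4
# src/a.ts  1  9
```

Note that src/a.ts has the highest score yet ranks last: landing on more axes outweighs a strong single-axis score, which is the point of the view.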
A single triage list across every check. Max 3 entries per tool (so a wide
check like code-quality with 472 warnings can't drown everything else),
total cap 30. Severity-weighted: critical/error/high → 0, warning/medium → 1, low/style → 2, info → 3. Sort ascending by weight.
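
The weighting and capping rules can be sketched with awk and a stable sort. This is an illustration, not the renderer's code: the `top_problems` name, the TSV input (tool, severity, message) and the choice to treat an unknown severity as weight 3 are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch: severity-weighted triage list, max 3 per tool, 30 overall.
# top_problems and its TSV input (tool, severity, message) are hypothetical.
top_problems() {
  awk -F'\t' -v OFS='\t' '
    BEGIN {
      w["critical"]=0; w["error"]=0; w["high"]=0
      w["warning"]=1;  w["medium"]=1
      w["low"]=2;      w["style"]=2
      w["info"]=3
    }
    { wt = ($2 in w) ? w[$2] : 3    # unknown severity -> 3 (assumption)
      print wt, $0 }
  ' | sort -s -t "$(printf '\t')" -k1,1n | awk -F'\t' -v OFS='\t' '
    seen[$2]++ < 3 && kept < 30 { kept++; print $2, $3, $4 }
  '
}

printf '%s\t%s\t%s\n' \
  lint warning a  lint warning b  lint warning c  lint warning d \
  gitleaks critical secret  scc info note | top_problems
```

On this sample the gitleaks finding comes first, three of the four lint warnings follow, and the scc note is last; the stable sort (`-s`) keeps each tool's findings in their original order within a weight band.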
Joins every check's top[] by file. The ranking is severity-weighted
(critical=4 … info=1) so a file with one critical finding outranks one
with three info-only findings. A file appearing across multiple checks
(e.g. complexity + lint + semgrep) is a likely bug hotspot: the spatial
axis of the Tornhill triangle. The temporal axis lives in the dedicated
git-hotspots check, which joins six-month commit churn against per-file
max CCN. Together they cover both legs of the bug-prediction signal.
The full ranking goes to reports/by-file.json for LLM consumption; the
markdown report shows the top 10.
checkup is built to be forked and adapted. The full fork-and-modify guide (what to keep verbatim, what to swap, a worked example, "how to tell you're done", and guidance for AI agents driving a port) lives in docs/build-your-own.md.
Why the dual stream, per-check JSON, status-vs-score, and in-band intent: see docs/architecture.md. The deeper decisions (pinning, layering, contribution model) are recorded as ADRs in docs/decisions/.

    {
      "slug": "complexity",
      "status": "warn",                      // pass | warn | fail | skip
      "count": 77,
      "summary": "77 hotspots over CCN 10 / cognitive 15 (top 20 reported)",
      "top": [                               // optional; most checks populate
        {
          "file": "src/services/example-service.ts",
          "line": 347,
          "code": "CCN-37",                  // short tag for grouping/dedup
          "severity": "warning",             // see vocabulary below
          "message": "handleRequest: CCN 37"
        }
      ],
      "intent": {
        "purpose": "Identify functions whose complexity makes them bug-incubators.",
        "pass_means": "No functions over CCN 10 or cognitive 15.",
        "fail_means": "CCN/cognitive > 30 should be refactored or covered with dedicated tests."
      }
    }