Know what your AI wrote. Track AI-generated code provenance in git repos.
ghostwrite analyzes your git history and reports how much of your codebase was written by AI tools: which tool, which author, which files, and whether that code survived. It works with any git repo, requires zero configuration, and runs entirely offline.
$ ghostwrite scan && ghostwrite report
ghostwrite — Know what your AI wrote.
╭────────────────────────────────────────────────────────────╮
│ Repository: my-project │
│ Period: 2025-01-01 → 2026-03-22 │
│ Commits: 847 total │ 312 AI-assisted (36.8%) │
╰────────────────────────────────────────────────────────────╯
📊 AI Code Breakdown
Tool Commits +Added Confidence
───────── ─────── ─────── ──────────────
claude 198 +41,203 ● 94% confirmed
cursor 89 +18,552 ● 72% confirmed
copilot 25 +4,109 ◐ 60% heuristic
🧬 AI Code Survival 312 / 312 AI commits reachable from HEAD (100%)
⚡ Code Churn AI: 8.3% Human: 12.1%
- 8 detection strategies: co-author emails, git trailers, `[ai:tool]` tags, cursor-style commit messages, structured commit bodies, file-count spikes, labels, and session markers
- 9 tools tracked: Claude, Cursor, GitHub Copilot, Codex, Gemini, Aider, Windsurf, Devin, Augment
- Survival analysis: how much AI code is still reachable from HEAD vs. churned
- Churn analysis: compare rewrite rates between AI and human code
- CI/CD mode: fail builds when AI% exceeds a threshold; outputs JSON or SARIF
- Git hooks: auto-tag AI commits at commit time via a `commit-msg` hook
- Zero network calls: all analysis is local; no telemetry, no API keys
- Fast: parallel scanning with SQLite cache; incremental rescans skip already-analyzed commits
brew install puvaan/tap/ghostwrite

go install github.com/puvaan/ghostwrite@latest

Grab the latest release for your platform from Releases.
git clone https://github.com/puvaan/ghostwrite
cd ghostwrite
make build # output: bin/ghostwrite

# 1. Initialize (creates .ghostwrite.yml, updates .gitignore)
ghostwrite init
# 2. Scan git history (last 6 months by default)
ghostwrite scan
# 3. View report
ghostwrite report
# 4. (Optional) Install git hooks to auto-tag future commits
ghostwrite hook install

| Command | Description |
|---|---|
| `ghostwrite init` | Initialize config and directories |
| `ghostwrite scan` | Analyze git history and cache results |
| `ghostwrite report` | Display AI provenance report |
| `ghostwrite ci` | CI mode: threshold checks, JSON/SARIF output |
| `ghostwrite diff` | Compare AI vs human code metrics side-by-side |
| `ghostwrite hook install` | Install commit-msg and pre-push git hooks |
| `ghostwrite hook uninstall` | Remove git hooks |
| `ghostwrite hook status` | Show installed hook status |
| `ghostwrite config show` | Print current configuration |
| `ghostwrite config reset` | Reset config to defaults |
| `ghostwrite version` | Print version info |

ghostwrite scan --since 6m # last 6 months (default)
ghostwrite scan --since 2024-01-01 # since a specific date
ghostwrite scan --since 1y # last year
ghostwrite scan --force # re-analyze all commits (ignore cache)
ghostwrite scan --branch main # specific branch
ghostwrite scan --all # all branches

ghostwrite report # terminal output (default)
ghostwrite report --format json # JSON output
ghostwrite report --format markdown # Markdown output
ghostwrite report --output report.md # write to file
ghostwrite report --section summary # single section only
ghostwrite report --since 30d # filter by date range
ghostwrite report --author alice@acme.com # filter by author

# Warn if AI% exceeds 80% (exit 0)
ghostwrite ci --threshold 80
# Fail if AI% exceeds 80% (exit 1)
ghostwrite ci --threshold 80 --fail
# Output SARIF for GitHub Code Scanning
ghostwrite ci --format sarif --output results.sarif
# Compare against a baseline report
ghostwrite ci --threshold 80 --baseline previous-report.json

ghostwrite diff # last 30 days
ghostwrite diff --since 3m # last 3 months

Running `ghostwrite init` creates `.ghostwrite.yml`:

detection:
  tools:
    - name: claude
      enabled: true
    - name: cursor
      enabled: true
    # ... more tools
report:
  top_n: 10 # max items per section
  since: 6m # default scan window
cache:
  path: .ghostwrite/cache.db

ghostwrite uses multiple strategies to identify AI-generated commits, each producing either a confirmed or heuristic confidence level:

| Strategy | How it works | Confidence |
|---|---|---|
| CoAuthor | Matches `Co-Authored-By:` lines against known AI tool emails | confirmed |
| Trailer | Matches `AI-Tool:`, `AI-Agent:`, `Generated-By:` git trailers | confirmed |
| Tag | Matches `[ai:toolname]` in commit subject | confirmed |
| CursorStyle | Lowercase, no conventional prefix, imperative verb, 30–100 chars | confirmed* |
| ConventionalRich | Conventional prefix + 4+ bullet body OR structured metadata | heuristic |
| FilesMultiplier | Files changed > 1.8× author's personal average | heuristic |
* confirmed when ~/.cursor/ exists locally, otherwise heuristic.
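
The checks in the table can be approximated with standard tools, which may help when reasoning about why a commit was or was not flagged. The following is a minimal sketch and not ghostwrite's actual implementation: the function names `classify_commit` and `is_file_spike`, the `AI_EMAILS` list, and the order in which strategies are tried are all assumptions made for illustration.

```shell
# Hypothetical sketch of three "confirmed" checks and one heuristic check.
# Names, ordering, and the address list are assumptions, not ghostwrite internals.

# Illustrative list of AI tool addresses (one per line); extend as needed.
AI_EMAILS='noreply@anthropic.com'

# Reads a full commit message on stdin and prints the first matching strategy.
classify_commit() {
  msg=$(cat)
  subject=$(printf '%s\n' "$msg" | head -n 1)

  # Tag: [ai:toolname] anywhere in the commit subject.
  if printf '%s\n' "$subject" | grep -Eq '\[ai:[A-Za-z0-9_-]+\]'; then
    echo "Tag"
  # Trailer: AI-Tool:, AI-Agent:, or Generated-By: at the start of a line.
  elif printf '%s\n' "$msg" | grep -Eiq '^(AI-Tool|AI-Agent|Generated-By):'; then
    echo "Trailer"
  # CoAuthor: a Co-Authored-By: line containing a listed address.
  elif printf '%s\n' "$msg" | grep -Ei '^Co-Authored-By:' | grep -Fiq "$AI_EMAILS"; then
    echo "CoAuthor"
  else
    echo "none"
  fi
}

# FilesMultiplier: succeeds when files changed ($1) exceeds 1.8x the
# author's average files per commit ($2).
is_file_spike() {
  awk -v n="$1" -v avg="$2" 'BEGIN { exit !(n > 1.8 * avg) }'
}
```

For example, `git log -1 --format=%B | classify_commit` would print the strategy matched by the most recent commit, and `is_file_spike 10 5` succeeds because 10 is greater than 1.8 × 5 = 9.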
- name: Check AI provenance
  run: |
    ghostwrite scan --since 1y --force
    ghostwrite ci --threshold 90 --fail --format sarif --output ghostwrite.sarif
- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: ghostwrite.sarif

ghostwrite:
  script:
    - ghostwrite scan --since 1y --force
    - ghostwrite ci --threshold 90 --fail --format sarif --output ghostwrite.sarif
  artifacts:
    reports:
      sast: ghostwrite.sarif

ghostwrite tracks itself. The CI pipeline runs ghostwrite on its own git history on every push to main and every Monday morning. The JSON report is uploaded as a build artifact.
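
For CI systems other than the two shown above, the same scan-then-gate sequence can be wrapped in a small script. This is a sketch under assumptions: `ai_gate` is a hypothetical helper name, it uses only flags documented in this README, and it treats any non-zero exit from `ghostwrite ci --fail` as "threshold exceeded", which may also cover other failures.

```shell
# Hypothetical CI wrapper; ai_gate is an illustrative name, not part of ghostwrite.
# Usage: ai_gate [threshold]   (threshold defaults to 90)
ai_gate() {
  threshold=${1:-90}

  # Rebuild the cache first; return 2 if the scan itself fails.
  ghostwrite scan --since 1y --force || return 2

  # With --fail, the README documents exit 1 when AI% exceeds the threshold.
  if ghostwrite ci --threshold "$threshold" --fail; then
    echo "AI share within ${threshold}% threshold"
  else
    echo "AI share above ${threshold}% threshold" >&2
    return 1
  fi
}
```

Calling `ai_gate 80` as the last command of a CI step would then pass or fail the step based on its return code.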
MIT — see LICENSE.