ci-rootcause

Deterministic CI root-cause analysis for failed CI runs.

Proven Results (MVP Suite)

From the curated MVP benchmark (13 cases):

Classification accuracy: 100% (13/13)
Baseline classification accuracy: 69.23% (9/13)
Improvement: +30.77 percentage points (about 44.4% relative lift vs baseline)
Top-1 root-cause accuracy: 100% (12/12 applicable cases)
Agentic proposal validity: 100% (6/6 exercised cases)
Guarded validation pass rate: 50% (3/6; three good fixes passed and three bad fixes were blocked)
Artifact hash reproducibility: 100%
Confidence reproducibility: 100%
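
The improvement figures follow directly from the two accuracy counts. A minimal check of the arithmetic, using only the counts reported above (the function name `accuracy_lift` is illustrative and not part of the project):

```python
from fractions import Fraction


def accuracy_lift(correct: int, baseline_correct: int, total: int) -> tuple[float, float]:
    """Return (absolute lift in percentage points, relative lift in percent)."""
    accuracy = Fraction(correct, total)
    baseline = Fraction(baseline_correct, total)
    absolute_pp = float((accuracy - baseline) * 100)
    relative_pct = float((accuracy - baseline) / baseline * 100)
    return round(absolute_pp, 2), round(relative_pct, 1)


# 13/13 for ci-rootcause versus 9/13 for the baseline.
print(accuracy_lift(13, 9, 13))  # (30.77, 44.4)
```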

Benchmark source:

docs/reports/mvp-benchmark-report.md
docs/reports/mvp-benchmark-report.json
docs/limitations.md

Reproduce locally:

python scripts/run_benchmark.py \
  --suite fixtures/benchmarks/mvp-suite.json \
  --output-root artifacts/benchmark-mvp \
  --report-json docs/reports/mvp-benchmark-report.json \
  --report-md docs/reports/mvp-benchmark-report.md

Compare local Ollama against the fixture-backed benchmark:

python scripts/run_ollama_comparison.py \
  --suite fixtures/benchmarks/mvp-suite.json \
  --llm-model qwen2.5-coder:7b \
  --report-json artifacts/ollama-comparison/latest.json

App-First Quickstart (No YAML)

Primary path (recommended):

Install the GitHub App for your target repository.
Configure app runtime with safe defaults:
- enabled=true
- post_comment=true
- enable_pr_mode=false
- create_fix_pr=false
Trigger a failed workflow_run and verify:
- RCA comment appears on PR/commit context
- ci-rca.json and ci-rca.md paths are returned
- Outcome status/reason codes are machine-readable
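
The safe defaults above keep the app in comment-only mode. As a rough illustration, assuming the runtime settings are available as a plain dictionary (how the app actually stores them is not shown in this README, and `unsafe_overrides` is a hypothetical helper), a rollout check could flag any setting that departs from those defaults:

```python
# Safe defaults as listed in the quickstart; the dict representation is an assumption.
SAFE_DEFAULTS = {
    "enabled": True,
    "post_comment": True,
    "enable_pr_mode": False,
    "create_fix_pr": False,
}


def unsafe_overrides(settings: dict) -> dict:
    """Return the settings whose values differ from the safe defaults."""
    return {
        key: value
        for key, value in settings.items()
        if key in SAFE_DEFAULTS and value != SAFE_DEFAULTS[key]
    }


print(unsafe_overrides({**SAFE_DEFAULTS, "create_fix_pr": True}))
# {'create_fix_pr': True}
```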

Setup references:

Reference artifact examples:

Agentic Modes (Optional)

Recommended default for new users: deterministic.

Mode	Autonomy	Key requirement	Cost profile	Risk profile
`deterministic`	Rule-based only	None	Lowest	Lowest
`agentic_assist`	LLM proposes candidate fix steps, deterministic pipeline validates/falls back	Hosted providers require API key; `local` does not	Medium	Low-medium
`agentic_full`	Highest autonomy path (explicit opt-in gate required)	Hosted providers require API key; `local` does not	Highest	Highest

Provider support:

Hosted: openai, gemini, anthropic (require provider_api_key in agentic modes).
Local: local (compatible with an Ollama endpoint; no paid vendor API key required).

Action secret examples:

provider: openai + provider_api_key: ${{ secrets.OPENAI_API_KEY }}
provider: gemini + provider_api_key: ${{ secrets.GEMINI_API_KEY }}
provider: anthropic + provider_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
provider: local + no provider_api_key

Purpose

ci-rootcause analyzes CI failures and produces:

Structured failure graph
Deterministic root-cause ranking
Deterministic confidence score
Evidence-backed fix plan
Deterministic patch plan operations (modify/create/delete/rename)
Optional guarded fix PR (never auto-merged)
ci-rca.json and ci-rca.md artifacts
ci-rca-observability.json run telemetry artifact (trace/timing/failure taxonomy)

The primary runtime target is GitHub Actions. The default provider adapters resolve metadata for both GitHub Actions and GitLab CI.

When To Use ci-rootcause

Ideal use cases:

CI failed and you need deterministic root-cause ranking with evidence, not just a generic summary.
You want machine-readable RCA artifacts (ci-rca.json) for automation/reporting.
You want safe, guardrailed fix PR proposals with explicit confidence thresholds.
You need consistent behavior across repeated runs on the same inputs.

Not a fit (non-goals):

Running arbitrary autonomous repo-wide refactors.
Replacing your normal test/lint/build workflows.
Auto-merging remediation changes without human review.

Comparison with formatter-only autofix workflows:

Capability	ci-rootcause	Formatter/Linter autofix flow
Works from failed CI logs + diff	Yes	Usually no
Root-cause classification	Yes	No
Ranked RCA with confidence	Yes	No
Structured RCA artifact (`ci-rca.json`)	Yes	No
Guardrailed optional fix PRs	Yes	Yes (tool-dependent)
Designed for deterministic replay	Yes	Varies

Architecture Overview

flowchart LR
  A[CI Logs + Diff] --> B[Log Ingest Agent]
  A --> C[Diff Analysis Agent]
  B --> D[Failure Classification Agent]
  C --> E[Root Cause Ranker Agent]
  D --> E
  E --> F[Fix Planner Agent]
  E --> G[Reporter Agent]
  F --> H[PR Creation Agent]
  G --> I[Artifacts ci-rca.json + ci-rca.md]
  H --> J[Guarded Fix PR]
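
One way to read the flowchart is as a dependency graph over the stage identifiers listed under Architecture Details. The sketch below transcribes the edges by hand (the mapping from agent names to identifiers is an assumption based on the matching names) and checks that the fixed execution order never runs a stage before its upstream stages:

```python
# Stage -> upstream stages, transcribed from the flowchart edges.
DEPENDS_ON = {
    "log_ingest": [],
    "diff_analysis": [],
    "failure_classification": ["log_ingest"],
    "root_cause_ranker": ["diff_analysis", "failure_classification"],
    "fix_planner": ["root_cause_ranker"],
    "reporter": ["root_cause_ranker"],
    "pr_creation": ["fix_planner"],
}

# Fixed execution order as documented under Architecture Details.
EXECUTION_ORDER = [
    "log_ingest",
    "diff_analysis",
    "failure_classification",
    "root_cause_ranker",
    "fix_planner",
    "reporter",
    "pr_creation",
]


def respects_dependencies(order: list[str], depends_on: dict[str, list[str]]) -> bool:
    """True if every stage appears after all of its upstream stages."""
    position = {stage: index for index, stage in enumerate(order)}
    return all(
        position[upstream] < position[stage]
        for stage, upstreams in depends_on.items()
        for upstream in upstreams
    )


print(respects_dependencies(EXECUTION_ORDER, DEPENDS_ON))  # True
```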

Local Setup

Requirements:

Python 3.11+

Install tools:

python -m pip install --upgrade pip
pip install -r requirements.txt
pre-commit install

Run checks:

ruff check .
ruff format --check .
pytest

CLI Quickstart

Install dependencies:

pip install -r requirements.txt

Run the local pipeline once:

ci-rootcause \
  --log-path fixtures/ci-logs/github-actions-python-failure.log \
  --diff-path fixtures/diffs/refactor-only.diff \
  --output-dir artifacts \
  --timestamp 2026-02-21T00:00:00Z \
  --commit abc123 \
  --run-id gha_quickstart_1 \
  --base-commit abc122 \
  --head-commit abc123 \
  --repository owner/repo

Inspect generated artifacts:

artifacts/ci-rca.json
artifacts/ci-rca.md
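
For a quick look at the JSON artifact without assuming anything about its schema, the following sketch lists the top-level keys and computes a content digest that can be compared across repeated runs. SHA-256 is an assumption here; this README does not say which hash the benchmark uses, and `describe_artifact` is a hypothetical helper:

```python
import hashlib
import json
from pathlib import Path


def describe_artifact(path: str) -> dict:
    """Return a SHA-256 digest and the sorted top-level keys of a JSON artifact."""
    raw = Path(path).read_bytes()
    document = json.loads(raw)
    return {
        "sha256": hashlib.sha256(raw).hexdigest(),
        "top_level_keys": sorted(document) if isinstance(document, dict) else [],
    }


if __name__ == "__main__":
    artifact = Path("artifacts/ci-rca.json")
    if artifact.exists():
        print(describe_artifact(str(artifact)))
```

Running it after two identical invocations and comparing the digests is a simple way to spot-check reproducibility locally.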

Demo Script

Run three reproducible demo scenarios:

for case in \
  fixtures/demos/01-dependency-lockfile-drift \
  fixtures/demos/02-typecheck-ts2345 \
  fixtures/demos/03-infra-timeout
do
  name="$(basename "$case")"
  ci-rootcause \
    --log-path "$case/ci.log" \
    --diff-path "$case/change.diff" \
    --output-dir "artifacts/demo/$name" \
    --timestamp 2026-02-21T00:00:00Z \
    --commit abc123 \
    --run-id "demo_${name}" \
    --base-commit abc122 \
    --head-commit abc123 \
    --repository owner/repo
done

Demo fixture pack:

fixtures/demos/README.md
fixtures/demos/01-dependency-lockfile-drift
fixtures/demos/02-typecheck-ts2345
fixtures/demos/03-infra-timeout

Local CLI Execution

Run end-to-end deterministic analysis locally:

ci-rootcause \
  --log-path fixtures/ci-logs/github-actions-python-failure.log \
  --diff-path fixtures/diffs/refactor-only.diff \
  --historical-runs-path fixtures/classification/historical-runs.sample.json \
  --output-dir artifacts \
  --timestamp 2026-02-20T00:00:00Z \
  --commit abc123 \
  --run-id gha_local_1 \
  --base-commit abc122 \
  --head-commit abc123 \
  --repository owner/repo

CLI behavior:

Writes ci-rca.json and ci-rca.md into --output-dir
Prints a machine-readable JSON summary to stdout
Exits 0 for completed/partial analysis runs, 2 for runtime/input errors
Supports optional deterministic flaky-test detection via --historical-runs-path
Supports a local --config-path file (simple key: value format) and single-stream stdin input via -
Supports --offline-only to force no remote PR creation/network calls
Supports rollout profile --profile safe-github-rollout (enforces min PR confidence >= 0.90)
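
Because the CLI prints a JSON summary and uses distinct exit codes, it is straightforward to call from another script. A hedged sketch, assuming stdout contains only the JSON summary on success (the helper names are hypothetical, and no summary fields are assumed):

```python
import json
import subprocess


def interpret_result(returncode: int, stdout: str) -> dict:
    """Map the documented exit codes and stdout summary to a small result dict."""
    if returncode == 0:
        # Completed or partial analysis: stdout carries the JSON summary.
        return {"ok": True, "summary": json.loads(stdout)}
    if returncode == 2:
        return {"ok": False, "error": "runtime or input error"}
    return {"ok": False, "error": f"unexpected exit code {returncode}"}


def run_ci_rootcause(args: list[str]) -> dict:
    """Run the CLI with the given flags and interpret the outcome."""
    completed = subprocess.run(
        ["ci-rootcause", *args], capture_output=True, text=True, check=False
    )
    return interpret_result(completed.returncode, completed.stdout)
```

A caller would pass the same flags shown in the command above, for example `run_ci_rootcause(["--log-path", "ci.log", "--diff-path", "change.diff", "--output-dir", "artifacts"])`.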

Runtime mode:

Uses Google ADK runtime orchestration by default when google-adk is installed
Falls back to deterministic local orchestration if ADK runtime initialization fails
Uses deterministic local orchestration when --fail-fast is enabled

Architecture Details

Execution order is deterministic and fixed:

log_ingest
diff_analysis
failure_classification
root_cause_ranker
fix_planner
reporter
pr_creation

Runtime behavior:

ADK runtime is used by default when available.
Deterministic local fallback executes on ADK initialization/runtime failure.
fail_fast uses deterministic local orchestration to preserve exception behavior.
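
The runtime selection described above amounts to a try-then-fall-back pattern. The sketch below shows that pattern with injected callables standing in for the two orchestration paths; it is not the project's implementation, and `run_with_fallback` and its parameters are hypothetical:

```python
from typing import Callable, Optional

Orchestrator = Callable[[], str]


def run_with_fallback(
    adk_run: Optional[Orchestrator],
    local_run: Orchestrator,
    fail_fast: bool = False,
) -> str:
    """Prefer the ADK path; fall back to local orchestration on failure."""
    # fail_fast goes straight to the local path so its exceptions propagate unchanged.
    if fail_fast or adk_run is None:
        return local_run()
    try:
        return adk_run()
    except Exception:
        # ADK initialization or runtime failure: use the deterministic local path.
        return local_run()


def broken_adk() -> str:
    raise RuntimeError("ADK unavailable")


print(run_with_fallback(broken_adk, lambda: "local"))  # local
```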

Live GitHub Integration Test (Opt-in)

Live PR creation/idempotency validation is available as an opt-in integration test:

scripts/run_live_github_test.sh \
  --repo-path /path/to/disposable/repo \
  --repository owner/repo \
  --token ghp_xxx \
  --target-branch main

Notes:

Test is skipped unless CI_ROOTCAUSE_LIVE_GITHUB=1.
Use a disposable repository with push + PR permissions.
Script prints a cleanup checklist after the test run.
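
The opt-in behaviour can be pictured as an environment-variable gate on the test. The following is a sketch using the standard library's unittest; the project's own tests run under pytest and may implement the skip differently, and the class and helper names here are hypothetical:

```python
import os
import unittest
from collections.abc import Mapping


def live_tests_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """True only when the opt-in variable is set to exactly "1"."""
    return env.get("CI_ROOTCAUSE_LIVE_GITHUB") == "1"


@unittest.skipUnless(live_tests_enabled(), "set CI_ROOTCAUSE_LIVE_GITHUB=1 to run")
class LiveGitHubTest(unittest.TestCase):
    def test_placeholder(self) -> None:
        # A real test would exercise PR creation and idempotency here.
        self.assertTrue(live_tests_enabled())
```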

MVP Metrics And Release Artifacts

Benchmark report JSON: docs/reports/mvp-benchmark-report.json
Benchmark report summary: docs/reports/mvp-benchmark-report.md
Release checklist: docs/release-checklist-v0.1.1.md
Agentic release plan + thresholds: docs/agentic-release-plan.md
Benchmark metrics include classification/primary RCA accuracy, confidence reproducibility, artifact-hash reproducibility, timing distribution (mean/median/p95), and deterministic lift against basic-log-summarizer-v1 baseline classification accuracy.
Release notes: docs/release-notes-v0.1.0.md
Known limitations: docs/limitations.md
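
For readers unfamiliar with the timing distribution fields, here is a minimal sketch of how mean, median, and p95 can be derived from per-case durations. The nearest-rank percentile method and the function name `timing_summary` are assumptions; the benchmark script may compute p95 differently:

```python
import math
import statistics


def timing_summary(durations_ms: list[float]) -> dict:
    """Summarize a non-empty list of durations as mean, median, and p95."""
    ordered = sorted(durations_ms)
    # Nearest-rank p95: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(95 * len(ordered) / 100)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
    }


print(timing_summary(list(range(1, 21))))
# {'mean': 10.5, 'median': 10.5, 'p95': 19}
```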

Known Limitations And Non-Goals

Current curated benchmark corpus is intentionally small (MVP scope).
Classification coverage is deterministic-rule based and pattern limited.
Timing metrics are runtime-derived and marked as nondeterministic metadata.
Automated fix generation is guardrailed and intentionally conservative.
No automatic merge or branch-protection bypass is supported.
No CI rerun orchestration is included in MVP.

Contributing

Contribution standards are documented in CONTRIBUTING.md.
