Turn any Git repository into a local SWE-bench-style coding-agent benchmark.
python git testing benchmark evaluation developer-tools flagship ai-agents local-first hidden-tests llm coding-agents agent-evaluation reproducible-ai swe-bench coding-agent-benchmark release-track local-benchmark
Updated May 28, 2026 - Python