Live, open-source benchmark for comparing AI coding agents on real GitHub issues
Updated: Jun 16, 2026. Language: Python.
Turn any Git repository into a local SWE-bench-style coding-agent benchmark.
A curated list of benchmarks, harnesses, leaderboards, and tools for evaluating AI coding agents.