AARR-bench

AARR-bench (Act As a Real Researcher)

AARR-bench evaluates the ability of LLM agents to conduct research. The core question: what exactly are the gaps between AI agents and real human researchers?

Roadmap

  • AARRI-bench (Act As a Real Research Intern) (ongoing)
  • AARRA-bench (Act As a Real Research Assistant) (to be continued)
  • AARRS-bench (Act As a Real Research Scientist) (to be continued)
