Add AIME2025, GPQA, HealthBench evaluation_test suites; unify row-limiting via pytest flag; clean up examples #195
ci.yml
on: pull_request
Annotations
2 errors and 1 warning
- Core Tests (Python 3.12): Process completed with exit code 1.
- Core Tests (Python 3.11): Process completed with exit code 1.
- MCP End-to-End Tests: No files were found with the provided path: coverage.xml. No artifacts will be uploaded.
Artifacts
Produced during runtime
| Name | Size | Digest |
|---|---|---|
| coverage-batch-eval (Expired) | 30.7 KB | sha256:01a7b53b055a83d8d07a986e256c7c78fba6565adbbe26e429ce03a075e7a818 |
| coverage-core-3.10 (Expired) | 36.6 KB | sha256:ac08a00538707773367a4b814506628f3b26954950b4c0d5aedded3daf34d1ba |