Add AIME2025, GPQA, HealthBench evaluation_test suites; unify row-limiting via pytest flag; clean up examples #193
ci.yml
on: pull_request
Annotations
1 warning
MCP End-to-End Tests: No files were found with the provided path: coverage.xml. No artifacts will be uploaded.
Artifacts
Produced during runtime
| Name | Status | Size | Digest |
|---|---|---|---|
| coverage-batch-eval | Expired | 30.6 KB | sha256:610cf815998140497fbc452fc13fb47315c7f63b05f6c8f6b3d3f979016fa84e |
| coverage-core-3.10 | Expired | 36.3 KB | sha256:bfde16a1ff5de670fba298baeee75b1ec51a83588898dd7d50d552c4df9d282f |
| coverage-core-3.11 | Expired | 36.3 KB | sha256:a5befdc6ecda3d54aba9cae97f101ab34a6489445feebee58a46c75d3be7adad |
| coverage-core-3.12 | Expired | 36.3 KB | sha256:feea972f201d325b45252e0de687a0fa10762f0ca4b03eacea7ce0da78150457 |