Close Harbor integration gaps: verifier isolation, artifact collection #33
Open
mhrezaei1 wants to merge 1 commit into aisa-group:add_harbor_support from
- Embed SHA256 of evaluate.py into test.sh at task-generation time so the verifier can detect if the agent tampered with the eval script (reward hacking mitigation); score is set to 0 on mismatch
- Add artifact collection to test.sh: workspace files (minus large model weights) are copied to /logs/artifacts/workspace/ so Harbor auto-collects them after each trial
- Generate a job.yaml alongside each task with a commented-out artifacts block for optionally downloading the full workspace including model weights
- Document verifier isolation and artifact collection in README
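The artifact-collection step in the commit message can be illustrated with a minimal Python sketch. It is not the PR's implementation: the change itself lives in test.sh (a shell script), and the function name `collect_artifacts` and constant `WEIGHT_PATTERNS` are hypothetical. The pattern list is the one given in the PR description (*.safetensors, *.bin, *.pt, *.pth, *.ckpt).

```python
import shutil
from pathlib import Path

# Weight-file patterns listed in the PR description; the constant name is hypothetical.
WEIGHT_PATTERNS = ("*.safetensors", "*.bin", "*.pt", "*.pth", "*.ckpt")


def collect_artifacts(workspace: Path, dest: Path) -> Path:
    """Copy the workspace tree to dest, skipping large model weight files.

    Assumption: dest is outside workspace (e.g. /logs/artifacts/workspace/).
    """
    shutil.copytree(
        workspace,
        dest,
        ignore=shutil.ignore_patterns(*WEIGHT_PATTERNS),
        dirs_exist_ok=True,  # requires Python 3.8+
    )
    return dest
```

A call such as `collect_artifacts(Path("/workspace"), Path("/logs/artifacts/workspace"))` would mirror the described behavior; the `/workspace` source path is an assumption, since the PR text does not name it.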
Collaborator
Thanks for the PR! Were you able to verify that this works by running an actual task with Harbor (such as with the Modal backend)? We would need to verify whether the actual workflow works; task generation alone is not sufficient.
The sentinel-file approach doesn't work; that's why we wait for a pre-agent hook.
Closes the three open gaps noted in #8.
Changes
Verifier isolation (reward hacking mitigation)
adapter.py now computes the SHA256 of evaluate.py at task-generation time and injects it into test.sh. If the agent modifies evaluate.py to manipulate its score, the verifier detects the hash mismatch and outputs a reward of 0.
The tests/ directory is copied by Harbor separately from the agent's workspace, so the agent cannot alter test.sh itself.
Artifact collection
Two levels now work out of the box:
- test.sh copies the workspace to /logs/artifacts/workspace/ at the end of verification, excluding large model weight files (*.safetensors, *.bin, *.pt, *.pth, *.ckpt). Harbor auto-collects /logs/artifacts/ with no extra config.
- A job.yaml is generated alongside each task with a commented-out artifacts block. Uncomment it to also download model weights:
harbor run -c <task_dir>/job.yaml
Pre-agent hooks / timer
Already handled by the sentinel-file approach in timer.sh; no changes needed.
Testing
- [...] evaluate.py for each generated task
- [...] evaluate.py)
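For checking the tamper detection locally, the hash mechanism described under Verifier isolation might be sketched as follows. This is an illustration under stated assumptions, not the code in adapter.py: the placeholder string `__EVALUATE_SHA256__` and the function names are hypothetical, and the real check runs inside test.sh rather than in Python.

```python
import hashlib
from pathlib import Path

# Hypothetical placeholder; the PR does not show the actual test.sh template.
HASH_PLACEHOLDER = "__EVALUATE_SHA256__"


def sha256_of(path: Path) -> str:
    """Return the hex SHA256 digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def render_test_sh(template: str, evaluate_py: Path) -> str:
    """Embed the digest of evaluate.py into a test.sh template string."""
    return template.replace(HASH_PLACEHOLDER, sha256_of(evaluate_py))


def reward_after_check(evaluate_py: Path, expected: str, score: float) -> float:
    """Return the score if evaluate.py is unmodified, otherwise 0.0."""
    return score if sha256_of(evaluate_py) == expected else 0.0
```

Recording the digest at task-generation time and comparing it at verification time means that any edit to evaluate.py in between, even a single byte, changes the digest and forces the reward to 0.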