Skip to content

Close Harbor integration gaps: verifier isolation, artifact collection#33

Open
mhrezaei1 wants to merge 1 commit intoaisa-group:add_harbor_supportfrom
mhrezaei1:harbor-integration
Open

Close Harbor integration gaps: verifier isolation, artifact collection#33
mhrezaei1 wants to merge 1 commit intoaisa-group:add_harbor_supportfrom
mhrezaei1:harbor-integration

Conversation

@mhrezaei1
Copy link
Copy Markdown

Closes the three open gaps noted in #8.

Changes

Verifier isolation (reward hacking mitigation)

adapter.py now computes the SHA256 of evaluate.py at task-generation time and injects it into test.sh. If the agent modifies evaluate.py to manipulate its score, the verifier detects the hash mismatch and outputs a reward of 0.

The tests/ directory is copied by Harbor separately from the agent's workspace, so the agent cannot alter test.sh itself.

Artifact collection

Two levels now work out of the box:

  • Automatic: test.sh copies the workspace to /logs/artifacts/workspace/ at the end of verification, excluding large model weight files (*.safetensors, *.bin, *.pt, *.pth, *.ckpt). Harbor auto-collects /logs/artifacts/ with no extra config.
  • Full workspace (opt-in): Each generated task now includes a job.yaml with a commented-out artifacts block. Uncomment to also download model weights:
    artifacts:
      - source: /home/agent/workspace
        destination: full-workspace
    Tasks can then be run as: harbor run -c <task_dir>/job.yaml

Pre-agent hooks / timer

Already handled by the sentinel-file approach in timer.sh — no changes needed.

Testing

  • All 28 task combinations generate without errors
  • SHA256 hashes are verified to match evaluate.py for each generated task
  • Hashes differ correctly across benchmarks (each has its own evaluate.py)

- Embed SHA256 of evaluate.py into test.sh at task-generation time so
  the verifier can detect if the agent tampered with the eval script
  (reward hacking mitigation); score is set to 0 on mismatch

- Add artifact collection to test.sh: workspace files (minus large model
  weights) are copied to /logs/artifacts/workspace/ so Harbor auto-collects
  them after each trial

- Generate a job.yaml alongside each task with a commented-out artifacts
  block for optionally downloading the full workspace including model weights

- Document verifier isolation and artifact collection in README
@hrdkbhatnagar
Copy link
Copy Markdown
Collaborator

Thanks for the PR!

Were you able to verify if this worked, by running an actual task on with Harbor (such as with the Modal backend)? We would need to verify if the actual workflow works or not. The task generation is not sufficient

Already handled by the sentinel-file approach in timer.sh — no changes needed.

The sentinel-file approach doesn't work, that's why we way for a pre-agent hook.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants