fix(check): stop check agents from getting lost outside their sandbox #152

Open
RobbieMcKinstry wants to merge 1 commit into fix-multi-check-hang-on-exit from fix-check-agent-timeouts

@RobbieMcKinstry

A traced `multi check` run showed every "agent did not report" timeout
sharing one signature: the agent was never told the *path* of its sandbox
working directory, went hunting for the repository across the host
filesystem (navigating by the workspace map in the loaded user-level
CLAUDE.md), and died inside an unbounded recursive Glob over a huge host
tree. Retries at effort=low (temperature 0.0) replayed the same fatal
trajectory verbatim, three out of three attempts. Two other agents
"passed" by grading the live repository instead of the sandbox copy.

Three fixes, one per leg of that failure:

- Instructions now state the sandbox path explicitly, direct the agent to
  stay inside it, and note that omitting `path` defaults to it. Every
  traced agent that learned the path — by any means — reported within 40s.
- New `Jailed` tool decorator confines FileRead/Grep/Glob to the agent's
  working directory: paths are judged by their symlink-resolved location
  (via deepest-existing-ancestor, so /var <-> /private/var aliasing and
  not-yet-existing paths both work), and glob patterns get their own rule
  since an absolute pattern replaces the walk base and `..` climbs out of
  it. Escapes return a corrective tool error instead of running away.
- `AgentRunRequest.attempt` (1-based) threads the retry count to the
  executor: retries raise the sampling temperature by 0.5 (capped at 1.0)
  and append a note that a previous attempt went unreported, so attempt 2
  has a reason to sample a different trajectory than attempt 1.
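
A minimal sketch of the second and third fixes, using only the Rust standard library. It is not the code in this PR: the function names (`resolve_lenient`, `is_inside`, `glob_pattern_allowed`, `retry_temperature`) are hypothetical, the glob rule shown is a stricter simplification that rejects every rooted pattern, and the temperature helper assumes each retry adds another 0.5 to the base value.

```rust
use std::io;
use std::path::{Component, Path, PathBuf};

/// Resolve `path` to a symlink-free location even when its tail does not
/// exist yet: canonicalize the deepest existing ancestor, then re-append
/// the components that were missing.
fn resolve_lenient(path: &Path) -> io::Result<PathBuf> {
    let mut missing = Vec::new();
    let mut cursor = path;
    loop {
        match cursor.canonicalize() {
            Ok(mut resolved) => {
                // Components were collected leaf-first, so re-append in reverse.
                for part in missing.iter().rev() {
                    resolved.push(part);
                }
                return Ok(resolved);
            }
            // Sketch-level simplification: any error moves one level up.
            Err(err) => match (cursor.parent(), cursor.file_name()) {
                (Some(parent), Some(name)) => {
                    missing.push(name.to_os_string());
                    cursor = parent;
                }
                // No ancestor left, or the tail ends in `..`: refuse.
                _ => return Err(err),
            },
        }
    }
}

/// True when `requested`, judged by its resolved location, lies inside `root`.
fn is_inside(root: &Path, requested: &Path) -> io::Result<bool> {
    let root = root.canonicalize()?;
    // `join` keeps an absolute `requested` as-is and anchors a relative one.
    let resolved = resolve_lenient(&root.join(requested))?;
    Ok(resolved.starts_with(&root))
}

/// Assumed glob rule: reject rooted patterns (they replace the walk base)
/// and any pattern with a `..` component (it climbs out of the base).
fn glob_pattern_allowed(pattern: &str) -> bool {
    let pattern = Path::new(pattern);
    !pattern.has_root()
        && !pattern
            .components()
            .any(|component| matches!(component, Component::ParentDir))
}

/// Assumed retry policy: `attempt` is 1-based, each retry adds 0.5 to the
/// base temperature, and the result is capped at 1.0.
fn retry_temperature(base: f32, attempt: u32) -> f32 {
    let retries = attempt.saturating_sub(1) as f32;
    (base + 0.5 * retries).min(1.0)
}
```

Under these assumptions, `is_inside(root, Path::new("src/not_yet/new.rs"))` accepts a file that does not exist yet as long as `src` resolves inside `root`, while `glob_pattern_allowed("/usr/**")` and `glob_pattern_allowed("../**/*.rs")` both return `false`. A caller would turn a `false` result into the corrective tool error described above rather than running the tool.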

Also formats three files (trace.rs, checks/mod.rs, trace_archive.rs) that
were committed unformatted — `cargo make check-format` was failing at HEAD.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01VxKv1hhPZ4GocfmwHUk1G8

RobbieMcKinstry commented Jul 2, 2026
