fix(check): stop check agents from getting lost outside their sandbox #152
Open
RobbieMcKinstry wants to merge 1 commit into
A traced `multi check` run showed every "agent did not report" timeout sharing one signature: the agent was never told the *path* of its sandbox working directory, went hunting for the repository across the host filesystem (navigating by the workspace map in the loaded user-level CLAUDE.md), and died inside an unbounded recursive Glob over a huge host tree. Retries at effort=low (temperature 0.0) replayed the same fatal trajectory verbatim, three out of three attempts. Two other agents "passed" by grading the live repository instead of the sandbox copy.

Three fixes, one per leg of that failure:

- Instructions now state the sandbox path explicitly, direct the agent to stay inside it, and note that omitting `path` defaults to it. Every traced agent that learned the path, by any means, reported within 40s.
- New `Jailed` tool decorator confines FileRead/Grep/Glob to the agent's working directory: paths are judged by their symlink-resolved location (via deepest-existing-ancestor, so /var <-> /private/var aliasing and not-yet-existing paths both work), and glob patterns get their own rule since an absolute pattern replaces the walk base and `..` climbs out of it. Escapes return a corrective tool error instead of running away.
- `AgentRunRequest.attempt` (1-based) threads the retry count to the executor: retries raise the sampling temperature by 0.5 (capped at 1.0) and append a note that a previous attempt went unreported, so attempt 2 has a reason to sample a different trajectory than attempt 1.

Also formats three files (trace.rs, checks/mod.rs, trace_archive.rs) that were committed unformatted; `cargo make check-format` was failing at HEAD.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01VxKv1hhPZ4GocfmwHUk1G8
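For reviewers who want a concrete picture of the confinement rule, here is a minimal sketch of a deepest-existing-ancestor check and a separate glob-pattern rule. It is not the code in this PR: `resolve_via_existing_ancestor`, `check_path`, and `check_glob_pattern` are hypothetical names, the error strings are invented, and the real `Jailed` decorator may differ in structure and edge cases.

```rust
use std::io;
use std::path::{Component, Path, PathBuf};

/// Resolve `path` to its symlink-free location even when it does not exist yet:
/// canonicalize the deepest ancestor that exists, then re-append the remainder.
/// (Hypothetical helper; assumes `path` is absolute.)
fn resolve_via_existing_ancestor(path: &Path) -> io::Result<PathBuf> {
    let mut cursor = path.to_path_buf();
    let mut tail = Vec::new(); // missing components, collected leaf-first
    loop {
        match cursor.canonicalize() {
            Ok(mut resolved) => {
                for name in tail.iter().rev() {
                    resolved.push(name);
                }
                return Ok(resolved);
            }
            Err(err) => {
                // `file_name` is None when the path ends in `..` or is a root.
                // Such paths are refused rather than guessed at.
                let Some(name) = cursor.file_name().map(|n| n.to_os_string()) else {
                    return Err(err);
                };
                tail.push(name);
                if !cursor.pop() {
                    return Err(err);
                }
            }
        }
    }
}

/// Accept `requested` only if its resolved location lies inside `jail_root`.
/// Relative paths are interpreted against the jail root.
fn check_path(jail_root: &Path, requested: &Path) -> Result<PathBuf, String> {
    let root = jail_root
        .canonicalize()
        .map_err(|e| format!("cannot resolve sandbox root: {e}"))?;
    // `join` with an absolute `requested` replaces the base, which is intended:
    // the absolute path is then judged on its own resolved location.
    let resolved = resolve_via_existing_ancestor(&root.join(requested))
        .map_err(|e| format!("cannot resolve {}: {e}", requested.display()))?;
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(format!(
            "{} is outside the sandbox {}; stay inside the working directory",
            requested.display(),
            root.display()
        ))
    }
}

/// Glob patterns get their own rule: an absolute pattern would replace the
/// walk base, and a `..` component would climb out of it.
fn check_glob_pattern(pattern: &str) -> Result<(), String> {
    let p = Path::new(pattern);
    if p.is_absolute() || p.has_root() {
        return Err(format!("glob pattern {pattern:?} must be relative to the sandbox"));
    }
    if p.components().any(|c| matches!(c, Component::ParentDir)) {
        return Err(format!("glob pattern {pattern:?} must not contain `..`"));
    }
    Ok(())
}
```

In this sketch, a request such as `check_path(sandbox, Path::new("src/new_file.rs"))` succeeds even before the file exists, while `../elsewhere` or an absolute host path returns an error string that a tool wrapper could hand back to the agent as the corrective message.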
This was referenced Jul 2, 2026
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.

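The retry behaviour described above can be sketched as follows. This is an illustration only: the function names, constants, and note wording are hypothetical, and it assumes the 0.5 bump accumulates per retry. The description would equally fit a single flat bump applied to every retry.

```rust
// Hypothetical constants mirroring the numbers in the description.
const RETRY_TEMPERATURE_STEP: f32 = 0.5;
const MAX_TEMPERATURE: f32 = 1.0;

/// Sampling temperature for a 1-based attempt number.
/// Attempt 1 keeps the base temperature; each retry adds the step, capped.
/// (Assumption: the bump accumulates per retry.)
fn temperature_for_attempt(base: f32, attempt: u32) -> f32 {
    let retries = attempt.saturating_sub(1) as f32;
    (base + RETRY_TEMPERATURE_STEP * retries).min(MAX_TEMPERATURE)
}

/// Instructions for a 1-based attempt number.
/// Retries get a note that an earlier attempt went unreported, so the prompt
/// itself differs from the first attempt. The note text here is invented.
fn instructions_for_attempt(instructions: &str, attempt: u32) -> String {
    if attempt <= 1 {
        return instructions.to_string();
    }
    format!(
        "{instructions}\n\nNote: a previous attempt at this check ended without a report. \
         Stay inside the sandbox directory and report as soon as you have a verdict."
    )
}
```

With a base temperature of 0.0 (the effort=low case above), this sketch yields 0.0 for attempt 1, 0.5 for attempt 2, and 1.0 from attempt 3 onward.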