fix(#2053): prevent corrupted session history from breaking Anthropic API calls by aheritier · Pull Request #2058 · docker/docker-agent

aheritier · 2026-03-11T13:03:29Z

Summary

Fixes #2053 — After restoring a session that involved transfer_task (sub-agent delegation), all subsequent prompts fail with:

unexpected tool_use_id found in tool_result blocks: tooluse_PDwaDa9qmwIyg9WUMFXlez.
Each tool_result block must have a corresponding tool_use block in the previous message.

Root Cause

PersistentRuntime.handleEvent() persists all streaming events to the parent session, including events from sub-agent executions during task transfers. The existing sess.IsSubSession() guard never triggers because sess is always the parent session. This corrupts the parent session's message history with interleaved sub-agent messages containing orphan tool_use/tool_result blocks.

Changes

1. `fix(#2053): filter sub-agent streaming events from parent session persistence`

Root cause fix. Adds SessionID field to AgentChoiceEvent and AgentChoiceReasoningEvent. All streaming and message events in handleEvent are now filtered by comparing the event's SessionID against the parent session's ID. Sub-agent events are silently skipped (persisted separately via SubSessionCompletedEvent).

Files: pkg/runtime/event.go, pkg/runtime/streaming.go, pkg/runtime/persistent_runtime.go
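
The filter described above can be pictured with the following minimal sketch. It is not the code from this PR: the `Session` struct, the `Content` fields, and the `Persisted` slice are hypothetical stand-ins, and only the event names and the `SessionID` comparison come from the description.

```go
package main

// AgentChoiceEvent and AgentChoiceReasoningEvent mirror the event names in
// the PR. The Content field is an assumption made for this sketch.
type AgentChoiceEvent struct {
	SessionID string
	Content   string
}

type AgentChoiceReasoningEvent struct {
	SessionID string
	Content   string
}

// Session is a hypothetical, simplified parent session. Persisted stands in
// for the stored message history.
type Session struct {
	ID        string
	Persisted []string
}

// handleEvent persists an event only when it was emitted by sess itself.
// Events carrying another session's ID (for example a sub-agent's) are
// skipped here; the PR persists those separately.
func handleEvent(sess *Session, event any) {
	switch e := event.(type) {
	case *AgentChoiceEvent:
		if e.SessionID != sess.ID {
			return
		}
		sess.Persisted = append(sess.Persisted, e.Content)
	case *AgentChoiceReasoningEvent:
		if e.SessionID != sess.ID {
			return
		}
		sess.Persisted = append(sess.Persisted, e.Content)
	}
}
```

The point of comparing IDs on the event, rather than asking the session whether it is a sub-session, is that the handler only ever sees the parent session object.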

2. `fix(#2053): add pendingAssistantToolUse guard to beta message converter`

Defense layer. The non-beta convertMessages already had a pendingAssistantToolUse flag that drops orphan tool results. The beta convertBetaMessages lacked this guard — every tool role message was unconditionally sent to the API. Now both converters behave consistently.

File: pkg/model/provider/anthropic/beta_converter.go
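
As a rough illustration of the guard, the sketch below drops tool messages that do not follow an assistant message with tool calls. The `Message` type and the `dropOrphanToolResults` function are hypothetical; the real converters build Anthropic request blocks and may reset the flag at a different point. Only the `pendingAssistantToolUse` flag name is taken from the PR.

```go
package main

// Message is a simplified, hypothetical chat message used for this sketch.
type Message struct {
	Role       string   // "user", "assistant", or "tool"
	ToolUseIDs []string // set on assistant messages that call tools
	ToolCallID string   // set on tool messages
}

// dropOrphanToolResults keeps a tool message only while an assistant
// message with tool calls is pending. Any other message clears the flag,
// so tool messages that appear later without a new assistant tool call
// are treated as orphans and dropped.
func dropOrphanToolResults(msgs []Message) []Message {
	var out []Message
	pendingAssistantToolUse := false
	for _, m := range msgs {
		switch m.Role {
		case "assistant":
			pendingAssistantToolUse = len(m.ToolUseIDs) > 0
			out = append(out, m)
		case "tool":
			if !pendingAssistantToolUse {
				continue // orphan tool result: never sent to the API
			}
			out = append(out, m)
		default:
			pendingAssistantToolUse = false
			out = append(out, m)
		}
	}
	return out
}
```

This sketch checks only position, not whether the tool result's ID matches one of the pending tool calls; ID matching is the job of the validation step.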

3. `fix(#2053): add reverse validation and merge-based repair for tool sequencing`

Validation + repair hardening.

validateSequencing now checks both directions: forward (assistant tool_use → next user has tool_results) and reverse (user tool_results → previous assistant has tool_use).
repairSequencing now merges missing tool_results into the existing next user message instead of inserting a separate synthetic message (which would orphan the existing results).

File: pkg/model/provider/anthropic/client.go
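
The two directions of the check might look like the sketch below. The `Msg` type is a hypothetical simplification that tracks only tool IDs, and the function signature is an assumption; it is meant to show the invariant, not the actual `validateSequencing` in client.go.

```go
package main

import "fmt"

// Msg is a hypothetical message that records only the tool IDs it carries.
type Msg struct {
	Role        string   // "user" or "assistant"
	ToolUses    []string // tool_use IDs in an assistant message
	ToolResults []string // tool_result IDs in a user message
}

func toSet(ids []string) map[string]bool {
	set := make(map[string]bool, len(ids))
	for _, id := range ids {
		set[id] = true
	}
	return set
}

// validateSequencing checks both directions of the pairing invariant.
func validateSequencing(msgs []Msg) error {
	for i, m := range msgs {
		// Forward: each tool_use needs a tool_result in the next user message.
		if m.Role == "assistant" && len(m.ToolUses) > 0 {
			if i+1 >= len(msgs) || msgs[i+1].Role != "user" {
				return fmt.Errorf("message %d: tool_use without a following user message", i)
			}
			results := toSet(msgs[i+1].ToolResults)
			for _, id := range m.ToolUses {
				if !results[id] {
					return fmt.Errorf("message %d: tool_use %q has no tool_result", i, id)
				}
			}
		}
		// Reverse: each tool_result must reference a tool_use in the
		// immediately preceding assistant message.
		if m.Role == "user" && len(m.ToolResults) > 0 {
			if i == 0 || msgs[i-1].Role != "assistant" {
				return fmt.Errorf("message %d: tool_result without a preceding assistant message", i)
			}
			uses := toSet(msgs[i-1].ToolUses)
			for _, id := range m.ToolResults {
				if !uses[id] {
					return fmt.Errorf("message %d: orphan tool_result %q", i, id)
				}
			}
		}
	}
	return nil
}
```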

4. `test(#2053): add coverage for tool_use/tool_result sequencing fixes`

6 new tests covering reverse validation, repair merge behavior, beta converter orphan dropping, and the normal tool_use/tool_result flow. Runtime event tests updated for the new SessionID field.

Files: pkg/model/provider/anthropic/client_test.go, pkg/runtime/runtime_test.go

Testing

task test — all tests pass
go vet — clean
New tests verify each fix independently

…sistence

During task transfers, sub-agent streaming events (AgentChoiceEvent, AgentChoiceReasoningEvent, MessageAddedEvent) were forwarded through the parent session's event channel. The PersistentRuntime's handleEvent checked sess.IsSubSession(), but sess was always the parent session, so the guard never triggered. This caused sub-agent assistant messages to be persisted directly into the parent session's message history, corrupting it.

On session restore, the parent session contained interleaved sub-agent messages with tool_use blocks that had no corresponding tool_result messages in the parent context, causing Anthropic API errors: "unexpected tool_use_id found in tool_result blocks".

Add SessionID field to AgentChoiceEvent and AgentChoiceReasoningEvent, and filter all streaming/message events by comparing the event's SessionID against the parent session's ID. Events from sub-sessions are now silently skipped during persistence (they are persisted separately via SubSessionCompletedEvent).

Assisted-By: docker-agent

The non-beta convertMessages has a pendingAssistantToolUse flag that only includes tool_result user messages when they immediately follow an assistant message with tool_use blocks. Orphan tool results from corrupted session history are silently dropped.

The beta convertBetaMessages had no such guard: every tool role message was unconditionally converted to a tool_result block. When the session history contained orphan tool results (e.g. from sub-agent messages that leaked into the parent session), they passed straight through to the Anthropic API, causing: "unexpected tool_use_id found in tool_result blocks".

Add the same pendingAssistantToolUse tracking to convertBetaMessages to match the non-beta converter behavior.

Assisted-By: docker-agent

…quencing

validateSequencing only checked the forward direction: every assistant tool_use must have a matching tool_result in the next user message. It did not check the reverse: every tool_result must reference a tool_use in the immediately preceding assistant message. Orphan tool_results passed validation silently.

Additionally, repairSequencing inserted a synthetic user message with missing tool_results *before* an existing user message that already had partial tool_results. This split tool_results across two user messages, causing the existing results to become orphaned:

    assistant(tool_use: A, B)
    synthetic_user(tool_result: B)  <- repair inserted this
    user(tool_result: A)            <- A now orphaned (previous is synthetic_user)

Two fixes:

1. validateSequencing now also checks the reverse direction: for each user message containing tool_result blocks, verify the immediately preceding message is an assistant with corresponding tool_use IDs.
2. repairSequencing now merges missing tool_results into the existing next user message instead of inserting a separate synthetic message. This keeps all tool_results in a single user message adjacent to the assistant, satisfying both forward and reverse invariants.

Assisted-By: docker-agent
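
To make the merge-versus-insert distinction concrete, here is a small sketch of a merge-based repair. The `Msg` type tracks only tool IDs and is hypothetical, as is the function signature; the real repair presumably also has to supply placeholder content for each synthesized tool_result, which this sketch leaves out.

```go
package main

// Msg is a hypothetical message that records only the tool IDs it carries.
type Msg struct {
	Role        string   // "user" or "assistant"
	ToolUses    []string // tool_use IDs in an assistant message
	ToolResults []string // tool_result IDs in a user message
}

// repairSequencing returns a new slice in which every assistant tool_use
// has a tool_result in the immediately following user message. Missing
// results are merged into an existing next user message; a synthetic user
// message is inserted only when the next message is not a user message.
func repairSequencing(msgs []Msg) []Msg {
	var out []Msg
	for i := 0; i < len(msgs); i++ {
		m := msgs[i]
		out = append(out, m)
		if m.Role != "assistant" || len(m.ToolUses) == 0 {
			continue
		}
		hasNextUser := i+1 < len(msgs) && msgs[i+1].Role == "user"
		have := map[string]bool{}
		if hasNextUser {
			for _, id := range msgs[i+1].ToolResults {
				have[id] = true
			}
		}
		var missing []string
		for _, id := range m.ToolUses {
			if !have[id] {
				missing = append(missing, id)
			}
		}
		if !hasNextUser {
			// No user message to merge into: insert a synthetic one.
			out = append(out, Msg{Role: "user", ToolResults: missing})
			continue
		}
		// Merge into a copy of the existing user message so that all
		// results stay adjacent to the assistant message.
		merged := msgs[i+1]
		merged.ToolResults = append(append([]string{}, merged.ToolResults...), missing...)
		out = append(out, merged)
		i++ // the original next user message has been consumed
	}
	return out
}
```

With the history from the diagram above, assistant(tool_use: A, B) followed by user(tool_result: A), this sketch yields a single user message carrying results for both A and B.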

Add tests covering:

- Reverse validation: orphan tool_results referencing non-existent tool_use in the preceding assistant message are detected.
- Reverse validation: tool_results with no preceding assistant message at all are detected.
- Repair merge: when partial tool_results exist in the next user message, missing results are merged into it (not inserted as a separate synthetic message that would break sequencing).
- Repair insert: when there is no next user message at all, a synthetic user message is correctly inserted.
- Beta converter: orphan tool_results (no preceding assistant tool_use) are dropped by the pendingAssistantToolUse guard.
- Beta converter: normal tool_use/tool_result flow continues to work correctly with the guard in place.
- Runtime event tests updated for SessionID field added to AgentChoiceEvent and AgentChoiceReasoningEvent.

Assisted-By: docker-agent

aheritier · 2026-03-11T13:33:21Z

/review

docker-agent

Review Summary

Assessment: 🟢 APPROVE

This PR effectively fixes issue #2053 by preventing corrupted session history from breaking Anthropic API calls. The implementation is sound:

Root cause fix (SessionID filtering):

✅ Adds SessionID field to AgentChoiceEvent and AgentChoiceReasoningEvent
✅ Correctly filters sub-agent streaming events using if e.SessionID != sess.ID { return }
✅ Consistent pattern across all event types (AgentChoice, AgentChoiceReasoning, MessageAdded)

Defense layer (orphan tool result dropping):

✅ Adds pendingAssistantToolUse boolean flag to beta converter
✅ Correctly drops orphan tool results when flag is false
✅ Flag is properly reset after processing tool results

Validation + repair hardening:

✅ Reverse validation checks user tool_results reference existing assistant tool_use blocks
✅ Merge-based repair correctly checks next message role before merging
✅ Handles consecutive assistants by inserting synthetic user message (not merging)

Test coverage:

✅ 6 new tests cover reverse validation, merge behavior, orphan dropping, and normal flow
✅ Runtime event tests updated for SessionID field

The logic is correct and well-structured. All edge cases identified during review are properly handled.

Findings

No issues found. The changes are ready to merge.

aheritier added 2 commits March 11, 2026 13:57

aheritier requested a review from a team as a code owner March 11, 2026 13:03

aheritier force-pushed the fix/issue-2053 branch from ded3735 to 9f7f944 Compare March 11, 2026 13:12

aheritier added 2 commits March 11, 2026 14:18

aheritier force-pushed the fix/issue-2053 branch from 9f7f944 to 7220241 Compare March 11, 2026 13:18

docker-agent bot reviewed Mar 11, 2026

aheritier marked this pull request as draft March 12, 2026 12:40

aheritier wants to merge 4 commits into main from fix/issue-2053
