fix(#2053): prevent corrupted session history from breaking Anthropic API calls #2058

Draft
aheritier wants to merge 4 commits into main from fix/issue-2053
@aheritier
Contributor

Summary

Fixes #2053 — After restoring a session that involved transfer_task (sub-agent delegation), all subsequent prompts fail with:

unexpected tool_use_id found in tool_result blocks: tooluse_PDwaDa9qmwIyg9WUMFXlez.
Each tool_result block must have a corresponding tool_use block in the previous message.

Root Cause

PersistentRuntime.handleEvent() persists all streaming events to the parent session, including events from sub-agent executions during task transfers. The existing sess.IsSubSession() guard never triggers because sess is always the parent session. This corrupts the parent session's message history with interleaved sub-agent messages containing orphan tool_use/tool_result blocks.

Changes

1. fix(#2053): filter sub-agent streaming events from parent session persistence

Root cause fix. Adds SessionID field to AgentChoiceEvent and AgentChoiceReasoningEvent. All streaming and message events in handleEvent are now filtered by comparing the event's SessionID against the parent session's ID. Sub-agent events are silently skipped (persisted separately via SubSessionCompletedEvent).

Files: pkg/runtime/event.go, pkg/runtime/streaming.go, pkg/runtime/persistent_runtime.go
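
The filtering idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the Event and Session types and the persistEvent helper are hypothetical stand-ins, and only the SessionID comparison mirrors what the description states.

```go
package main

// Event is a hypothetical stand-in for the runtime's streaming/message events.
type Event struct {
	Kind      string // e.g. "agent_choice" or "message_added" (illustrative values)
	SessionID string // ID of the session that produced the event
	Content   string
}

// Session is a hypothetical, minimal session that stores persisted content.
type Session struct {
	ID       string
	Messages []string
}

// persistEvent appends the event content to sess only when sess produced it.
// It reports whether the event was persisted.
func persistEvent(sess *Session, e Event) bool {
	if e.SessionID != sess.ID {
		// Event from a sub-agent session: skip it here. It is assumed to be
		// persisted separately when the sub-session completes.
		return false
	}
	sess.Messages = append(sess.Messages, e.Content)
	return true
}
```

Comparing IDs on the event itself avoids relying on a property of sess, which in the scenario described above is always the parent session.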

2. fix(#2053): add pendingAssistantToolUse guard to beta message converter

Defense layer. The non-beta convertMessages already had a pendingAssistantToolUse flag that drops orphan tool results. The beta convertBetaMessages lacked this guard — every tool role message was unconditionally sent to the API. Now both converters behave consistently.

File: pkg/model/provider/anthropic/beta_converter.go
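
As a rough sketch of the guard, assuming a simplified message shape: the Message type and dropOrphanToolResults function below are hypothetical and do not reproduce convertBetaMessages. The sketch only checks adjacency (a tool message must follow an assistant message that requested tools); it does not match IDs.

```go
package main

// Message is a hypothetical, simplified chat message.
type Message struct {
	Role       string   // "user", "assistant", or "tool"
	ToolUseIDs []string // set on assistant messages that request tool calls
	ToolCallID string   // set on tool messages
}

// dropOrphanToolResults returns a copy of msgs without tool messages that do
// not directly follow an assistant message containing tool calls.
func dropOrphanToolResults(msgs []Message) []Message {
	var out []Message
	pendingAssistantToolUse := false
	for _, m := range msgs {
		switch m.Role {
		case "assistant":
			// Tool results are expected next only if this message used tools.
			pendingAssistantToolUse = len(m.ToolUseIDs) > 0
			out = append(out, m)
		case "tool":
			if !pendingAssistantToolUse {
				continue // orphan tool result: drop it
			}
			out = append(out, m)
		default:
			// Any other role ends the window in which tool results are valid.
			pendingAssistantToolUse = false
			out = append(out, m)
		}
	}
	return out
}
```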

3. fix(#2053): add reverse validation and merge-based repair for tool sequencing

Validation + repair hardening.

  • validateSequencing now checks both directions: forward (assistant tool_use → next user has tool_results) and reverse (user tool_results → previous assistant has tool_use).
  • repairSequencing now merges missing tool_results into the existing next user message instead of inserting a separate synthetic message (which would orphan the existing results).

File: pkg/model/provider/anthropic/client.go
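
The two validation directions can be illustrated with a small sketch. The seqMsg type, the toSet helper, and validateBothDirections are hypothetical names for illustration, not the validateSequencing implementation in client.go, and the message shape is an assumption.

```go
package main

import "fmt"

// seqMsg is a hypothetical message reduced to what sequencing checks need.
type seqMsg struct {
	Role        string   // "user" or "assistant"
	ToolUses    []string // tool_use IDs (assistant messages)
	ToolResults []string // tool_result IDs (user messages)
}

// toSet builds a lookup set from a list of IDs.
func toSet(ids []string) map[string]bool {
	set := make(map[string]bool, len(ids))
	for _, id := range ids {
		set[id] = true
	}
	return set
}

// validateBothDirections returns an error for the first sequencing violation.
func validateBothDirections(msgs []seqMsg) error {
	for i, m := range msgs {
		// Forward: every tool_use needs a tool_result in the next user message.
		if m.Role == "assistant" && len(m.ToolUses) > 0 {
			if i+1 >= len(msgs) || msgs[i+1].Role != "user" {
				return fmt.Errorf("message %d: tool_use not followed by a user message", i)
			}
			results := toSet(msgs[i+1].ToolResults)
			for _, id := range m.ToolUses {
				if !results[id] {
					return fmt.Errorf("message %d: tool_use %q has no tool_result", i, id)
				}
			}
		}
		// Reverse: every tool_result needs a tool_use in the previous assistant message.
		if m.Role == "user" && len(m.ToolResults) > 0 {
			if i == 0 || msgs[i-1].Role != "assistant" {
				return fmt.Errorf("message %d: tool_result without a preceding assistant message", i)
			}
			uses := toSet(msgs[i-1].ToolUses)
			for _, id := range m.ToolResults {
				if !uses[id] {
					return fmt.Errorf("message %d: tool_result %q has no tool_use", i, id)
				}
			}
		}
	}
	return nil
}
```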

4. test(#2053): add coverage for tool_use/tool_result sequencing fixes

6 new tests covering reverse validation, repair merge behavior, beta converter orphan dropping, and the normal tool_use/tool_result flow. Runtime event tests updated for the new SessionID field.

Files: pkg/model/provider/anthropic/client_test.go, pkg/runtime/runtime_test.go

Testing

  • task test — all tests pass
  • go vet — clean
  • New tests verify each fix independently

…sistence

During task transfers, sub-agent streaming events (AgentChoiceEvent,
AgentChoiceReasoningEvent, MessageAddedEvent) were forwarded through
the parent session's event channel. The PersistentRuntime's
handleEvent checked sess.IsSubSession(), but sess was always the
parent session, so the guard never triggered. This caused sub-agent
assistant messages to be persisted directly into the parent session's
message history, corrupting it.

On session restore, the parent session contained interleaved sub-agent
messages whose tool_use and tool_result blocks had no counterpart in
the parent context, causing Anthropic API errors:
"unexpected tool_use_id found in tool_result blocks".

Add SessionID field to AgentChoiceEvent and AgentChoiceReasoningEvent,
and filter all streaming/message events by comparing the event's
SessionID against the parent session's ID. Events from sub-sessions
are now silently skipped during persistence (they are persisted
separately via SubSessionCompletedEvent).

Assisted-By: docker-agent

The non-beta convertMessages has a pendingAssistantToolUse flag that
only includes tool_result user messages when they immediately follow
an assistant message with tool_use blocks. Orphan tool results from
corrupted session history are silently dropped.

The beta convertBetaMessages had no such guard — every tool role
message was unconditionally converted to a tool_result block. When
the session history contained orphan tool results (e.g. from
sub-agent messages that leaked into the parent session), they passed
straight through to the Anthropic API, causing:

  "unexpected tool_use_id found in tool_result blocks"

Add the same pendingAssistantToolUse tracking to convertBetaMessages
to match the non-beta converter behavior.

Assisted-By: docker-agent

@aheritier aheritier requested a review from a team as a code owner March 11, 2026 13:03
docker-agent[bot]

This comment was marked as outdated.

…quencing

validateSequencing only checked the forward direction: every assistant
tool_use must have a matching tool_result in the next user message. It
did not check the reverse: every tool_result must reference a tool_use
in the immediately preceding assistant message. Orphan tool_results
passed validation silently.

Additionally, repairSequencing inserted a synthetic user message with
missing tool_results *before* an existing user message that already
had partial tool_results. This split tool_results across two user
messages, causing the existing results to become orphaned:

  assistant(tool_use: A, B)
  synthetic_user(tool_result: B)   <- repair inserted this
  user(tool_result: A)             <- A now orphaned (previous is synthetic_user)

Two fixes:

1. validateSequencing now also checks the reverse direction: for each
   user message containing tool_result blocks, verify the immediately
   preceding message is an assistant with corresponding tool_use IDs.

2. repairSequencing now merges missing tool_results into the existing
   next user message instead of inserting a separate synthetic message.
   This keeps all tool_results in a single user message adjacent to the
   assistant, satisfying both forward and reverse invariants.
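
The merge-based repair can be sketched as below. This is an illustration under assumptions, not the repairSequencing code: repairMsg and repairMissingResults are hypothetical names, and a real repair would add placeholder tool_result blocks rather than bare IDs.

```go
package main

// repairMsg is a hypothetical message reduced to roles and tool IDs.
type repairMsg struct {
	Role        string   // "user" or "assistant"
	ToolUses    []string // tool_use IDs (assistant messages)
	ToolResults []string // tool_result IDs (user messages)
}

// repairMissingResults returns a new slice in which every assistant tool_use
// has a tool_result in the immediately following user message.
func repairMissingResults(msgs []repairMsg) []repairMsg {
	var out []repairMsg
	for i := 0; i < len(msgs); i++ {
		m := msgs[i]
		out = append(out, m)
		if m.Role != "assistant" || len(m.ToolUses) == 0 {
			continue
		}
		if i+1 < len(msgs) && msgs[i+1].Role == "user" {
			// Merge: add the missing results to the existing next user message
			// so all results stay adjacent to the assistant message.
			merged := msgs[i+1]
			merged.ToolResults = append([]string(nil), msgs[i+1].ToolResults...)
			have := make(map[string]bool, len(merged.ToolResults))
			for _, id := range merged.ToolResults {
				have[id] = true
			}
			for _, id := range m.ToolUses {
				if !have[id] {
					merged.ToolResults = append(merged.ToolResults, id)
				}
			}
			out = append(out, merged)
			i++ // the next user message has been consumed
			continue
		}
		// No next user message: insert a synthetic one carrying all results.
		out = append(out, repairMsg{
			Role:        "user",
			ToolResults: append([]string(nil), m.ToolUses...),
		})
	}
	return out
}
```

With assistant(tool_use: A, B) followed by user(tool_result: A), this yields a single user message holding results for A and B, instead of the split sequence shown above.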

Assisted-By: docker-agent

Add tests covering:

- Reverse validation: orphan tool_results referencing non-existent
  tool_use in the preceding assistant message are detected.
- Reverse validation: tool_results with no preceding assistant
  message at all are detected.
- Repair merge: when partial tool_results exist in the next user
  message, missing results are merged into it (not inserted as a
  separate synthetic message that would break sequencing).
- Repair insert: when there is no next user message at all, a
  synthetic user message is correctly inserted.
- Beta converter: orphan tool_results (no preceding assistant
  tool_use) are dropped by the pendingAssistantToolUse guard.
- Beta converter: normal tool_use/tool_result flow continues to work
  correctly with the guard in place.
- Runtime event tests updated for SessionID field added to
  AgentChoiceEvent and AgentChoiceReasoningEvent.

Assisted-By: docker-agent
@aheritier
Contributor Author

/review

@docker-agent docker-agent bot left a comment

Review Summary

Assessment: 🟢 APPROVE

This PR effectively fixes issue #2053 by preventing corrupted session history from breaking Anthropic API calls. The implementation is sound:

Root cause fix (SessionID filtering):

  • ✅ Adds SessionID field to AgentChoiceEvent and AgentChoiceReasoningEvent
  • ✅ Correctly filters sub-agent streaming events using if e.SessionID != sess.ID { return }
  • ✅ Consistent pattern across all event types (AgentChoice, AgentChoiceReasoning, MessageAdded)

Defense layer (orphan tool result dropping):

  • ✅ Adds pendingAssistantToolUse boolean flag to beta converter
  • ✅ Correctly drops orphan tool results when flag is false
  • ✅ Flag is properly reset after processing tool results

Validation + repair hardening:

  • ✅ Reverse validation checks user tool_results reference existing assistant tool_use blocks
  • ✅ Merge-based repair correctly checks next message role before merging
  • ✅ Handles consecutive assistants by inserting synthetic user message (not merging)

Test coverage:

  • ✅ 6 new tests cover reverse validation, merge behavior, orphan dropping, and normal flow
  • ✅ Runtime event tests updated for SessionID field

The logic is correct and well-structured. All edge cases identified during review are properly handled.

Findings

No issues found. The changes are ready to merge.

@aheritier aheritier marked this pull request as draft March 12, 2026 12:40
Development

Successfully merging this pull request may close these issues.

Restored sessions fail with "unexpected tool_use_id found in tool_result blocks" (Beta API path)