
fix(providers/openai): emit item_reference for reasoning when store=true #30

Open
ibetitsmike wants to merge 1 commit into coder_2_33 from mike/fix-reasoning-replay-item-reference

Disclaimer: opened by a Coder Agent on behalf of @ibetitsmike.

Fixes the follow-up turn failure seen with gpt-5.5 xhigh (and other Responses API reasoning models) when an assistant turn produces a reasoning item plus a provider-executed web_search_call (or any other item that requires a preceding reasoning item).

Problem

When Store: true is enabled and an assistant turn contains both a reasoning item and a following provider-executed item (e.g. web_search_call), the OpenAI Responses API requires the reasoning item to be replayed alongside the following item on the next turn. Otherwise it rejects the request with:

Item 'ws_xxx' of type 'web_search_call' was provided without its
required 'reasoning' item: 'rs_xxx'.

Previously, toResponsesPrompt unconditionally skipped reasoning items. The skip was introduced to fix charmbracelet#181 (inline OfReasoning replay was rejected when no following item was present), but it produces the mirror-image failure mode for any conversation that combines reasoning with a provider-executed tool call or function call: the following item is now replayed without its required reasoning item.
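
For illustration, here is a minimal sketch of what the next-turn input could look like once the reasoning item is replayed alongside the web search call. It assumes the Responses API accepts references of the form {"type": "item_reference", "id": ...}; the inputItem struct and nextTurnInput function are hypothetical names used only for this example and are not part of the provider code.

```go
package main

import "encoding/json"

// inputItem is a hypothetical, simplified stand-in for one entry of the
// Responses API "input" array. It is not the provider's real type.
type inputItem struct {
	Type    string `json:"type,omitempty"`
	ID      string `json:"id,omitempty"`
	Role    string `json:"role,omitempty"`
	Content string `json:"content,omitempty"`
}

// nextTurnInput builds the replayed portion of a follow-up request:
// the reasoning reference comes first, then the web search reference
// that depends on it, then the new user message.
func nextTurnInput(reasoningID, searchID, followUp string) ([]byte, error) {
	items := []inputItem{
		{Type: "item_reference", ID: reasoningID},
		{Type: "item_reference", ID: searchID},
		{Role: "user", Content: followUp},
	}
	return json.Marshal(items)
}
```

Dropping the first entry while keeping the second is the shape that triggers the error quoted above.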

Fix

  • store=true: replay the reasoning item via item_reference using the persisted ItemID from ResponsesReasoningMetadata. This mirrors how provider-executed tool calls are already replayed and matches OpenAI's documented contract for stored items.
  • store=false: server-side reasoning IDs are ephemeral and cannot be referenced. Provider-executed tool calls are likewise skipped, so there is nothing to pair with. Behavior is unchanged.
  • Defensive fallback: if ItemID is missing from metadata, the reasoning part is skipped (rest of the assistant message still replays).
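
The three rules above can be sketched roughly as follows. This is a simplified illustration, not the actual toResponsesPrompt code: partKind, part, inputItem, and replayAssistantParts are hypothetical names, and treating a provider-executed tool call with an empty ItemID the same way as a reasoning part is an assumption made to keep the example short.

```go
package main

// partKind, part, and inputItem are hypothetical stand-ins for the
// provider's real message and prompt types.
type partKind int

const (
	kindText partKind = iota
	kindReasoning
	kindProviderToolCall
)

type part struct {
	Kind   partKind
	ItemID string // persisted server-side item ID; may be empty
	Text   string
}

type inputItem struct {
	Type string // "item_reference" or "text" in this sketch
	ID   string
	Text string
}

// replayAssistantParts converts one assistant message into replay items.
func replayAssistantParts(parts []part, store bool) []inputItem {
	var out []inputItem
	for _, p := range parts {
		switch p.Kind {
		case kindReasoning, kindProviderToolCall:
			// store=false: server-side IDs are ephemeral, so skip.
			// Missing ItemID: defensive skip; other parts still replay.
			if !store || p.ItemID == "" {
				continue
			}
			out = append(out, inputItem{Type: "item_reference", ID: p.ItemID})
		case kindText:
			out = append(out, inputItem{Type: "text", Text: p.Text})
		}
	}
	return out
}
```

With store=true and both IDs present, a reasoning + web search + text message yields two references followed by the text; with store=false only the text survives.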

Inline OfReasoning replay is intentionally avoided so that the failure mode fixed by charmbracelet#181 is not reintroduced.

Tests

Updated and added in providers/openai/openai_test.go:

  • TestResponsesToPrompt_ReasoningWithStore — now asserts store=true emits item_reference for the reasoning item; covers the no-ItemID skip path; keeps the store=false skip assertion.
  • TestResponsesToPrompt_ReasoningWithWebSearchCombined — full reasoning + web_search_call replay with order assertions (user, item_reference(rs), item_reference(ws), text, user).
  • TestResponsesToPrompt_ReasoningWithFunctionCallCombined — reasoning + function_call + function_call_output round-trip with order assertions.
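
As a rough idea of what an order assertion like the ones above might check, here is a small sketch. The inputItem type and the describe and checkOrder helpers are hypothetical and do not come from openai_test.go; deriving the label from the ID prefix (rs, ws) is an assumption made for readability.

```go
package main

import (
	"fmt"
	"strings"
)

// inputItem is a hypothetical, simplified view of a replayed prompt item.
type inputItem struct {
	Kind string // e.g. "user", "text", "item_reference"
	ID   string // set only for item_reference
}

// describe renders an item as a short label such as "item_reference(rs)".
func describe(it inputItem) string {
	if it.Kind == "item_reference" {
		prefix, _, _ := strings.Cut(it.ID, "_")
		return "item_reference(" + prefix + ")"
	}
	return it.Kind
}

// checkOrder reports the first position where got deviates from want.
func checkOrder(got []inputItem, want []string) error {
	if len(got) != len(want) {
		return fmt.Errorf("got %d items, want %d", len(got), len(want))
	}
	for i, it := range got {
		if d := describe(it); d != want[i] {
			return fmt.Errorf("item %d: got %s, want %s", i, d, want[i])
		}
	}
	return nil
}
```

A test could then compare the converted prompt against []string{"user", "item_reference(rs)", "item_reference(ws)", "text", "user"} and fail on the returned error.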

Full suites pass:

  • go test ./providers/openai/...
  • go test ./...

Related

Implementation plan and decision log

Follow-up to closed PR #7. The plan kept the change minimal:

  1. Red: extended TestResponsesToPrompt_ReasoningWithStore and added two combined-replay tests; all failed against the previous unconditional skip.
  2. Green: replaced the unconditional continue with a store-gated item_reference emission using GetReasoningMetadata. Build + tests green.
  3. Refactor: not needed — pattern only appears once.

Decisions:

When an OpenAI Responses API reasoning model produces both a reasoning
item and a provider-executed item (e.g. web_search_call) in the same
response, the API requires the reasoning item to be replayed alongside
the following item on the next turn. Without this pairing the API
rejects the request with:

  Item 'ws_xxx' of type 'web_search_call' was provided without its
  required 'reasoning' item: 'rs_xxx'.

Previously toResponsesPrompt unconditionally skipped reasoning items
during replay, both for store=true and store=false. The skip was
introduced to fix charmbracelet#181 (inline OfReasoning replay
was rejected when no following item was present), but it now produces
the mirror-image failure mode for any conversation that combines
reasoning with a provider-executed tool call or function call.

Fix: when store=true, replay the reasoning item via item_reference
using the persisted ItemID stored in ResponsesReasoningMetadata. This
mirrors how provider-executed tool calls are already replayed and
matches the OpenAI documented contract for stored items.

When store=false, server-side reasoning IDs are ephemeral and cannot
be referenced; provider-executed tool calls are likewise skipped, so
there is nothing to pair with. Behavior in this mode is unchanged.

Tests:
- TestResponsesToPrompt_ReasoningWithStore: now asserts that store=true
  emits item_reference for the reasoning item; covers the
  no-ItemID skip path; keeps the store=false skip assertion.
- TestResponsesToPrompt_ReasoningWithWebSearchCombined: full reasoning
  + web_search_call replay with assertions on order
  (user, item_reference(rs), item_reference(ws), text, user).
- TestResponsesToPrompt_ReasoningWithFunctionCallCombined: reasoning +
  function_call + function_call_output round-trip with order assertions.

Generated by Coder Agents.