fix: persist user input items to state.messages across callModel invocations#46
fix: persist user input items to state.messages across callModel invocations#46Cybourgeoisie wants to merge 5 commits into
Conversation
…cations User input items (role: 'user') were never written to state.messages. The two existing saveStateSafely write sites only persist response.output and toolResults. Within a single callModel invocation this is masked because makeFollowupRequest reconstructs input from the in-memory this.resolvedRequest.input. But when a new callModel resumes from persisted state, the loaded state.messages contains zero user items — prior user turns are silently dropped. This causes two problems: 1. cache_control prompt caching is defeated at every user-message boundary (cachedTokens = 0 on first request of each new message) 2. Conversation fidelity loss — the model never sees prior user turns Fix: In initStream(), after request resolution, persist fresh user input items to state.messages via saveStateSafely before the API call. Both code paths (loaded history and no history) now capture freshItemsForState and write it to state. Co-Authored-By: Ben Heidorn <ben.heidorn@openrouter.ai>
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
Move the fresh-items save from before the API call (initStream) to saveResponseToState, which runs only after a successful response. This prevents duplicate user turns in state when a caller retries after an API failure. Add retry-after-failure idempotency test to verify the fix. Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
|
🤖 PR Patrol — partial run (blocked on repo access) Patrol reviewed the 3 unresolved threads and prepared fixes, but cannot push or resolve threads on this repo — none of Perry's GitHub Apps (reviewer / maintainer / intern-worker) are installed on Thread review:
Action needed (human): either install a Perry GitHub App on |
…Image to OutputImage Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Perry's Review
Defers user input items into pendingFreshItems and flushes them into state.messages only after a successful API response, fixing the root bug where user input was persisted before the call could fail.
Verdict: 🔁 Needs changes
Details
CI: All 5 checks pass (structural-gate, lint, unit-tests, typecheck, e2e-tests). Note: structural-gate enforces layer coupling but does NOT detect public API breakage from export removal.
Findings (see inline comments for full context):
🔴 blocker — packages/agent/src/index.ts:47 — OutputInputImage removed from public re-exports without a backward-compat alias or changeset, silently breaking consumers who import this type from the agent package.
Codex (HEAVY_SECONDARY_MODEL): Skipped — medium tier (420 LoC)
Research: Skipped — medium tier
Security: No secrets or credentials in the diff. The change is purely a state-management fix. No authentication, authorization, prompt injection, or tenant isolation surfaces are touched.
Test coverage: New test file (user-input-persistence.test.ts, 390 lines, 6 test cases) covers key scenarios well: user input persisted after no-tool call, string input normalization, user input alongside tool results, second callModel seeing prior user input on resume, both invocations' messages in final state, and no duplication on retry after API failure. One nit: the retry-after-failure test uses storedState?.messages (optional chain) instead of the non-optional form used consistently elsewhere in the file, which could mask a regression where state is not saved before the API call at all.
Unresolved threads: 3 threads exist. Thread 1 (outdated): Perry flagged user input persisted before the API call — correctly addressed by this PR's pendingFreshItems deferral. Thread 2 (unresolved): Perry flagged missing retry-after-failure test — addressed by commit b6ab887 but not resolved by the author. Thread 3 (nit, unresolved): Perry flagged storedState?.messages optional chain at test line 354 — still present in final code.
Scope: first review (full)
Tier: medium (420 LoC)
| // Output item types (StreamableOutputItem members) | ||
| OutputFileSearchCallItem, | ||
| OutputFunctionCallItem, | ||
| OutputImage, |
There was a problem hiding this comment.
[blocker] OutputInputImage removed from public re-exports without a backward-compat alias or changeset — this is a silent breaking change for consumers.
Details
Why: OutputInputImage was deliberately added as a public re-export in PR #20 so consumers could use it without a direct SDK dependency (CHANGELOG: "Re-export EasyInputMessageContentInputImage, OutputInputImage, and OpenAIResponsesToolChoiceUnion from the SDK models so consumers can use these types without a direct SDK dependency."). This PR removes it (replaced by OutputImage) with no backward-compat shim and no changeset. Any downstream package doing import type { OutputInputImage } from 'the agent package' now gets a compile error. The changeset initially added to this PR was removed in commit d1e4831 without justification.
Fix (preferred — zero version bump):
// packages/agent/src/index.ts
export type {
OutputImage,
OutputImage as OutputInputImage, // backward-compat alias
...
}Alternative: restore a changeset with a minor bump and document the removal as a breaking change.
Prompt for agents
In packages/agent/src/index.ts, add OutputImage as OutputInputImage to the re-export list alongside OutputImage to preserve the previously-announced public type alias, then verify bun run build and bun test pass.
Reviewed at c930772
|
🤖 PR Patrol — partial run, blocked on installation permissions I prepared fixes for the open review threads locally (verified passing: 294 tests, typecheck, lint), but cannot apply them or resolve threads on this repo because the GitHub Apps installed on
This is the same installation limitation noted in the earlier review ("APPROVE unavailable… the maintainer GitHub App does not have Threads & status:
Action needed from a human/maintainer: apply the two prepared changes (the Prepared diffs (for a maintainer to apply)# packages/agent/src/index.ts
OutputFunctionCallItem,
OutputImage,
+ // Backward-compat alias: the SDK renamed `OutputInputImage` to `OutputImage`.
+ // Re-export the old name so consumers importing `OutputInputImage` keep compiling.
+ OutputImage as OutputInputImage,
OutputImageGenerationCallItem,# packages/agent/tests/unit/user-input-persistence.test.ts (retry-after-failure test)
- // State should NOT contain the user message after a failed API call
- const messagesAfterFailure = storedState?.messages as
- | Array<{
- role?: string;
- }>
- | undefined;
- const userItemsAfterFailure = messagesAfterFailure?.filter((m) => m.role === 'user') ?? [];
- expect(userItemsAfterFailure.length).toBe(0);
+ // State should NOT contain the user message after a failed API call.
+ // Assert storedState is populated first so the user-message check below
+ // fails loudly on regression rather than passing vacuously via `?? []`.
+ expect(storedState).not.toBeNull();
+ const messagesAfterFailure = (storedState!.messages ?? []) as Array<{
+ role?: string;
+ }>;
+ const userItemsAfterFailure = messagesAfterFailure.filter((m) => m.role === 'user');
+ expect(userItemsAfterFailure.length).toBe(0); |
Summary
role: 'user') were never written tostate.messages— the two existingsaveStateSafelywrite sites only persist response outputs and tool resultscallModelresumes from persisted state, the loadedstate.messagescontained zero user items, silently dropping prior user turnscache_controlprompt caching at user-message boundaries and caused conversation fidelity loss (the model never sees prior user turns on resume)Changes
initStream()and persist them tostate.messagesviasaveStateSafelybefore the API requestTest plan
packages/agent/tests/unit/user-input-persistence.test.ts— 5 tests)🤖 Generated with Claude Code