fix(server): streamableHttp stores per-request events while stream disconnected (SEP-1699 poll-and-replay)#2342
Conversation
🦋 Changeset detectedLatest commit: 9c2aee4 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@modelcontextprotocol/client
@modelcontextprotocol/codemod
@modelcontextprotocol/server
@modelcontextprotocol/server-legacy
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
There was a problem hiding this comment.
I didn't find any bugs in this change, but it rewires the per-request send()/replay lifecycle of the Streamable HTTP transport (store-first semantics, new no-writer branches, identity-guarded cancels, replay dedup), which is concurrency-sensitive core transport behavior — worth a careful human look.
Extended reasoning...
Overview
The PR changes packages/server/src/server/streamableHttp.ts so that send() stores request-related events to the configured eventStore whenever the request is in flight, regardless of whether a live SSE writer exists, plus several supporting guards: identity-checked cancel() callbacks on all three ReadableStream sites, replayedEventIds dedup on resumed streams, a re-read of _streamMapping after the storeEvent() await, retired-request close on replay, and new branching in the all-responses-ready path (clean return with eventStore, onerror without one, unchanged throw in JSON mode). It also folds in a JSON-mode stream.cleanup() fix. Seven new vitest regression tests plus a changeset accompany the change.
Security risks
Low. Replay remains gated by the existing Mcp-Session-Id validation and Last-Event-ID GET path; streamId/eventId are server-minted and storage is not keyed on any client-supplied identifier. The main risk class is availability/correctness (event duplication, dropped responses, leaked stream-map entries) rather than injection or auth bypass.
Level of scrutiny
High. This is the core Streamable HTTP server transport used by every SSE-mode server with resumability, and the change deliberately reorders awaits and rewrites stream-lifecycle bookkeeping to handle reconnect races (TOCTOU around storeEvent(), stale cancels, retired-request replay). Behavior also changes observably: a final response with no live writer no longer throws when an eventStore is configured, and surfaces via onerror when one isn't. These are exactly the kinds of subtle async/lifecycle semantics the repo's recurring-catch list calls out, and they deserve maintainer judgment on the chosen semantics (e.g. silently retiring vs. throwing) even though I found no concrete defects.
Other factors
The bug-hunting pass found no issues, the new tests cover each of the claimed races (closeSSE drop, disconnect-in-flight, reconnect replay, retired-request close, store-await TOCTOU, stale cancel, no-eventStore onerror), and the changeset prose matches the implemented behavior. Still, the breadth of behavioral change in a single transport file and the reliance on timing-based tests (setTimeout settling) make this unsuitable for auto-approval.
… is disconnected The standalone-SSE path stores to eventStore first, then writes if connected. The request-related path only stored when the stream was live, so a notification sent after closeSSE() (SEP-1699 polling) was silently dropped instead of being persisted for replay on reconnect. Exposed by the ctx.mcpReq.log request-related change against the new sse-polling example story; the gap pre-exists on main for any request-related notification (progress, ctx.mcpReq.notify) emitted after closeSSE().
5e1f24a to
9c2aee4
Compare
There was a problem hiding this comment.
I didn't find any bugs — the new tests cover the disconnect/replay races well — but this changes the core send() delivery path of the sessionful Streamable HTTP transport (store-first semantics, post-await re-read, cancel identity guards) and shifts the undeliverable-final-response behavior from throwing to a clean return / onerror, so it warrants a human look.
Extended reasoning...
Overview
This PR changes WebStandardStreamableHTTPServerTransport in packages/server/src/server/streamableHttp.ts so that request-related events and the final response are stored to the configured eventStore whenever a request is in flight, regardless of whether a live SSE writer exists (SEP-1699 poll-and-replay). It also adds several concurrency guards: identity checks in ReadableStream cancel callbacks, a replayedEventIds dedup set, a re-read of _streamMapping after the storeEvent() await, and close/unregister of a resumed stream when the request was already retired. It includes 7 new regression tests, a changeset, and re-enables the sse-polling example in CI.
Security risks
No meaningful security exposure: no auth, session-ID generation, or input-validation logic is touched. The main risk class is correctness/lifecycle (lost or duplicated events, leaked stream mappings), not data exposure or injection.
Level of scrutiny
High. This is production-critical transport code with subtle async/race semantics. Beyond the storage fix itself, there is a behavior change worth a maintainer's judgment: when all responses are ready but no stream entry exists, send() previously threw "No connection established"; it now returns cleanly (eventStore configured) or surfaces via onerror (no eventStore) and retires the request id. That is likely the right call per the spec's "disconnection SHOULD NOT be interpreted as cancellation", but it changes what callers of send() observe and is the kind of decision a human maintainer should sign off on. The folded-in JSON-mode cleanup() after resolveJson() is a small, plausible leak fix.
Other factors
The PR description is thorough and the test coverage is unusually good for the race conditions it claims to handle (TOCTOU during storeEvent(), stale cancel, retire-then-reconnect). The bug hunting system found no issues, and I did not find any either. Still, given the complexity and the criticality of this code path for all sessionful HTTP deployments, this should not be auto-approved.
…sconnected (SEP-1699 poll-and-replay) (#2342)
WebStandardStreamableHTTPServerTransport.send()now stores request-related events toeventStorewhenever the request is in flight, regardless of whether a live SSE writer exists — mirroring the standalone-SSE store-first path.Scope: legacy sessionful Streamable HTTP transport only. The 2026-07-28 per-request path (
createMcpHandler→PerRequestHTTPServerTransport) does not usecloseSSE/eventStore; this change is not on that path. Relevant for direct sessionful-transport deployments and dual-era hosts serving 2025 clients.Motivation and Context
The request-related path only stored to
eventStorewhen the stream controller was live, so events emitted aftercloseSSE()(SEP-1699 poll-and-replay) or a transient client drop were silently lost instead of replayed onLast-Event-IDreconnect. Per 2025-11-25 transports.mdx, disconnection SHOULD NOT be interpreted as the client cancelling its request. Storage is now keyed on request-in-flight (_requestToStreamMappingresolves the server-mintedstreamId) rather than on_streamMappinghaving a live writer; concurrent-lifecycle guards added (per-streamId cancel identity check,replayedEventIdsdedup, re-read after thestoreEvent()await, close-on-replay-after-retire). A final response sent while disconnected with aneventStoreis stored and returns cleanly; without one it surfaces viaonerror. Folded in: JSON-mode now callscleanup()afterresolveJson().How Has This Been Tested?
streamableHttp.test.ts(closeSSE/disconnect storage, reconnect replay, retired-request close, post-await re-read, stale-cancel guard, no-eventStore onerror)Breaking Changes
None.
Types of changes
Checklist
Additional context
Single commit over
7c0833b4b; zero file overlap with #2335.