fix(observability): capsule.request.durationMs spans full request lifecycle#44
Merged
Merged
Conversation
…ecycle 48h of v1.6.0 production traffic surfaced a metric-fidelity bug: the `capsule.request.durationMs` event measured only TTFB, not body stream + JSON parse. Visible on `list_entries` (tenant-wide feed, large body): dashboard showed 542 ms while wall-clock was 2720 ms. Other endpoints were too small to expose the gap (<10 ms body read). Refactor: doFetch() now returns a RequestStart bookkeeping object and no longer emits. A new consumeBody() helper wraps body consumption in a try/finally that emits at scope exit, so durationMs spans the full request (issued → body fully read), and the emit fires on both success and body-read errors (4xx/5xx, malformed JSON, etc). Touches all 9 callers in src/capsule/client.ts. One new test pins the new contract by mocking a slow body-read stream and asserting durationMs reflects the delay. 470 tests (was 469); bundle +0.72 KB. Also: bumped batch_list_party_entries to top of the IDEAS.md "child-list fan-out" ranking — the same 48h data showed 40/62 tool calls (65 %) were list_party_entries with 1:1 Capsule fan-out, the canonical N+1 the entry was speculating about. Out of scope for the patch (new tool = minor bump). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
capsule.request.durationMsmeasured only TTFB. 48 h of production data made the gap visible:list_entriesreported 542 ms in the dashboard whiletool.callreported 2720 ms wall-clock — the 2178 ms gap was body stream + JSON parse, invisible to the metric.doFetch()now returns aRequestStartbookkeeping object instead of emitting; a newconsumeBody()helper wraps body consumption intry/finallyso the emit fires after the body has been read (on both success and error paths).src/capsule/client.tsupdated. One new test pins the contract via a slow-bodyReadableStream. 470 tests (was 469). Bundle +0.72 KB.Background — why this surfaced now
The 48 h v1.6.0 traffic sample (post-deploy verification window) had 62 tool calls across 7 distinct tools. Six of them showed
tool.call.durationMs ≈ capsule.request.durationMswith <10 ms overhead — looked fine.list_entrieswas the outlier: 2720 ms vs 542 ms. After tracing, the gap mapped entirely tores.json()time afterfetch()returned headers. Other endpoints had small bodies so the gap was hidden; the global-entries feed surfaces the worst case.What this is not
list_entriescall (that's just Capsule sending a big body — out of our control).durationMsis still a single number, just a more accurate one.list_party_entries, which was the OTHER finding from the 48 h sample (65 % of all calls). Addingbatch_list_party_entriesis minor-bump territory; filed in IDEAS.md with the new evidence.Test plan
npm run typecheckcleannpm run lint+npm run format:checkcleannpm run buildclean (148.15 KB stdio / 174.62 KB HTTP, +~0.7 KB)npm test470/470 pass (was 469; +1 new test)durationMscapsule.requesttest still passes (the duration was typed asnumberand the test didn't pin a specific value, so the refactor is contract-compatible)tool.call.durationMs - capsule.request.durationMsforlist_entries— should drop near zero (overhead → ~0 ms), confirming the metric now captures what was previously hidden.🤖 Generated with Claude Code