Skip to content

fix(tracing): key root runs by user message id#74

Merged
dialupdisaster merged 3 commits into
mainfrom
fix/run-scoped-traces
Jun 24, 2026
Merged

fix(tracing): key root runs by user message id#74
dialupdisaster merged 3 commits into
mainfrom
fix/run-scoped-traces

Conversation

@dialupdisaster

@dialupdisaster dialupdisaster commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR replaces the earlier session-keyed trace-parenting approach with a run model keyed by the originating user message ID.

OpenCode does not expose a first-class runID, but it does expose enough structure to derive one reliably:

  • chat.message can provide the user messageID
  • stored UserMessage objects have stable ids
  • every AssistantMessage has parentID, which points back to the originating user message
  • tool/part events carry the assistant messageID

This PR uses that model so root traces stay tied to the correct user turn even when a single sessionID is reused across multiple runs.

What changed

  • key root run spans by user message ID instead of session ID
  • track the current active run per session separately from retained ended run contexts
  • map assistant message IDs back to their parent user-message run via assistant.parentID
  • parent LLM/tool child spans through that assistant-to-run mapping
  • preserve session-scoped totals and logs independently from run-scoped trace parenting
  • harden the test tracer so each fake span has a unique span context
  • add regression coverage for:
    • late child events after idle
    • run1 -> run2 -> late child from run1
    • root fallback parenting when no parent run is available

Why

The original trace fragmentation bug came from using session.created / live session spans as the trace root for work that is actually turn-scoped.

A second user prompt in the same OpenCode session is a new run, not a continuation of the previous trace root.

Keying retained parent context by sessionID was still unsafe because a new run could overwrite the old run context before late events from the previous run arrived. Keying by user message ID removes that collision.

Notes

  • promptless root sessions still emit metrics/logs but intentionally do not emit a root trace span
  • primary sessions do not create trace roots on session.created; root traces start from user turns
  • subagent sessions still get session spans, parented to the active root run

Validation

  • bun run typecheck
  • bun test

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@dialupdisaster, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 34 minutes and 44 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 04b7ce95-3ad1-4d27-9e09-e9c9f68f2800

📥 Commits

Reviewing files that changed from the base of the PR and between 8e8e5ca and 0e43ecc.

📒 Files selected for processing (9)
  • .github/dependabot.yml
  • src/handlers/message.ts
  • src/handlers/session.ts
  • src/index.ts
  • src/types.ts
  • src/util.ts
  • tests/handlers/disabled-signals.test.ts
  • tests/handlers/spans.test.ts
  • tests/helpers.ts
📝 Walkthrough

Walkthrough

The PR adds run-level span tracking to session telemetry, centralizes trace-context resolution across session, tool, and LLM spans, updates plugin wiring to carry the new context maps, and extends tests and helpers for the revised parenting and lifecycle behavior.

Changes

Run-aware tracing context

Layer / File(s) Summary
Shared tracing state and resolver
src/types.ts, src/util.ts
HandlerContext gains run/session context maps, and resolveSessionTraceContext() resolves a parent context from session or run-root state.
Plugin wiring and run bootstrap
src/index.ts
The plugin initializes the new tracing maps, threads them through ctx, rewrites session totals updates, and starts run instrumentation from chat.message when no session span exists.
Session and run span lifecycle
src/handlers/session.ts
handleRunStarted stores run spans and contexts, handleSessionCreated resolves parent context through the shared helper, and idle/error handlers end the matching run span.
Tool and LLM span parenting
src/handlers/message.ts
Tool span updates and startMessageSpan now use resolveSessionTraceContext() for parent selection, and the LLM span comment reflects run/subagent parenting.
Test harness and span assertions
tests/helpers.ts, tests/handlers/spans.test.ts
The test tracer records parent span context, and the suite asserts run-span parenting, lifecycle endings, and trace-disable behavior against the new context maps.

Sequence Diagram(s)

sequenceDiagram
  participant chatMessage as chat.message
  participant handleRunStarted
  participant handleSessionCreated
  participant resolveSessionTraceContext
  participant ctxRunSpans as ctx.runSpans
  participant ctxSessionRunRoots as ctx.sessionRunRoots
  participant handleSessionIdle
  participant handleSessionError

  chatMessage->>handleRunStarted: start run span when sessionSpan is missing
  handleRunStarted->>ctxRunSpans: store run span and context
  handleRunStarted->>ctxSessionRunRoots: record session root mapping

  handleSessionCreated->>resolveSessionTraceContext: resolve parent context for the session span
  resolveSessionTraceContext->>ctxRunSpans: read run span or run span context
  resolveSessionTraceContext->>ctxSessionRunRoots: look up the run root
  resolveSessionTraceContext-->>handleSessionCreated: parent context

  handleSessionIdle->>ctxRunSpans: look up the run span and end it with OK
  handleSessionError->>ctxRunSpans: look up the run span and end it with ERROR
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly states the main tracing change: root traces are anchored to chat.message runs.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/run-scoped-traces

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
tests/handlers/spans.test.ts (1)

222-226: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Assert the fallback span is actually root-parented.

This test name promises root fallback, but it only verifies that one span is created. Add parent assertions so regressions in resolveSessionTraceContext() are caught.

🧪 Proposed assertion
     expect(() => handleSessionCreated(makeSessionCreated("ses_child", 1000, "ses_missing_parent"), ctx)).not.toThrow()
     expect(tracer.spans).toHaveLength(1)
+    expect(tracer.spans[0]!.parentSpan).toBeUndefined()
+    expect(tracer.spans[0]!.parentSpanContext).toBeUndefined()
   })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/handlers/spans.test.ts` around lines 222 - 226, The test around
handleSessionCreated currently only checks that a span is created, but it does
not verify that the fallback span is actually root-parented. Update this test to
assert the created span has no parent / is rooted when
resolveSessionTraceContext() cannot find the parent run, using the existing
makeCtx(), handleSessionCreated(), and tracer.spans helpers to catch regressions
in parent resolution.
tests/helpers.ts (1)

84-122: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Make spy span contexts stable and unique.

The new parentSpanContext assertions are weakened because every SpySpan.spanContext() returns the same hard-coded span ID. A resolver using the wrong stored ended-span context can still pass these tests; allocate a stable unique context per spy span instead.

🧪 Proposed test-helper refinement
 function makeSpan(
   name: string,
   startTime?: number,
   parentSpan?: SpySpan,
   parentSpanContext?: SpanContext,
+  ownSpanContext?: SpanContext,
 ): SpySpan {
+  const context = ownSpanContext ?? {
+    traceId: "00000000000000000000000000000001",
+    spanId: "0000000000000001",
+    traceFlags: 1,
+  }
   const span: SpySpan = {
     name,
@@
-    spanContext() { return { traceId: "00000000000000000000000000000001", spanId: "0000000000000001", traceFlags: 1 } },
+    spanContext() { return context },
@@
 export function makeTracer(): SpyTracer {
+  let nextSpanID = 1
   const tracer: SpyTracer = {
     spans: [],
     startSpan(name, options, ctx) {
       const parentFromCtx = ctx ? trace.getSpan(ctx) as SpySpan | undefined : undefined
       const parentSpanContext = ctx ? trace.getSpanContext(ctx) ?? undefined : undefined
+      const ownSpanContext: SpanContext = {
+        traceId: parentSpanContext?.traceId ?? "00000000000000000000000000000001",
+        spanId: (nextSpanID++).toString(16).padStart(16, "0"),
+        traceFlags: parentSpanContext?.traceFlags ?? 1,
+        ...(parentSpanContext?.traceState ? { traceState: parentSpanContext.traceState } : {}),
+      }
       const span = makeSpan(
         name,
         typeof options?.startTime === "number" ? options.startTime : undefined,
         parentFromCtx,
         parentSpanContext,
+        ownSpanContext,
       )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/helpers.ts` around lines 84 - 122, The SpySpan context handling in
makeSpan is too static because spanContext() always returns the same hard-coded
IDs, which weakens parentSpanContext assertions in tests. Update makeSpan and
the SpyTracer path so each created spy span gets a stable, unique span context
value that is preserved per span instance, then have spanContext() return that
stored context instead of a shared constant. Keep the existing
makeTracer/startSpan flow intact, but ensure parentSpanContext comparisons can
distinguish the correct ended span from an incorrect one.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/handlers/session.ts`:
- Around line 75-76: The bounded map writes in session handling are evicting
active spans without ending them, which can drop export/status and break parent
continuity. Update the session span storage flow around setBoundedMap in session
handling so active spans in ctx.runSpans and ctx.sessionSpans are not silently
deleted; either use a span-aware bounded helper that ends evicted spans before
removal, or limit bounded storage to retained span contexts like
ctx.runSpanContexts. Make the same fix wherever the same pattern appears in the
session lifecycle methods that handle idle/error completion.

In `@src/util.ts`:
- Around line 26-37: The trace context resolution in resolveSessionTraceContext
is vulnerable to re-parenting when a new run starts before late events from the
previous run are finished. Update the session/run context lifecycle in the
session handlers so handleSessionIdle also clears the stale runSpanContexts
entry for the session (or otherwise preserves the old run context until late
events are done), and add a test for the Run1 Start → Run2 Start → Late Child
(Run1) ordering to verify late events keep the original parent context instead
of attaching to the new run.

---

Nitpick comments:
In `@tests/handlers/spans.test.ts`:
- Around line 222-226: The test around handleSessionCreated currently only
checks that a span is created, but it does not verify that the fallback span is
actually root-parented. Update this test to assert the created span has no
parent / is rooted when resolveSessionTraceContext() cannot find the parent run,
using the existing makeCtx(), handleSessionCreated(), and tracer.spans helpers
to catch regressions in parent resolution.

In `@tests/helpers.ts`:
- Around line 84-122: The SpySpan context handling in makeSpan is too static
because spanContext() always returns the same hard-coded IDs, which weakens
parentSpanContext assertions in tests. Update makeSpan and the SpyTracer path so
each created spy span gets a stable, unique span context value that is preserved
per span instance, then have spanContext() return that stored context instead of
a shared constant. Keep the existing makeTracer/startSpan flow intact, but
ensure parentSpanContext comparisons can distinguish the correct ended span from
an incorrect one.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4d52e280-aa4d-4187-b4e8-670e9ebd066f

📥 Commits

Reviewing files that changed from the base of the PR and between c9fee7f and 8e8e5ca.

📒 Files selected for processing (7)
  • src/handlers/message.ts
  • src/handlers/session.ts
  • src/index.ts
  • src/types.ts
  • src/util.ts
  • tests/handlers/spans.test.ts
  • tests/helpers.ts

Comment thread src/handlers/session.ts Outdated
Comment thread src/util.ts Outdated
@dialupdisaster dialupdisaster changed the title fix(tracing): anchor root traces to chat.message runs fix(tracing): key root runs by user message id Jun 24, 2026
@dialupdisaster dialupdisaster merged commit 3aa91fd into main Jun 24, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant