
[Feature Request] GetSessionTrace API — structured per-session trace (model calls + tool args/results + timing) without cross-source joins #1426

@tycenjmccann

Ask

AgentCore Runtime should expose a first-class session trace API — e.g. GetSessionTrace(sessionId, runtimeArn) — that returns the ordered, structured trace of an agent session: every model call (prompt, completion, token counts, latency), every tool call (name, full args, full result, duration, status), and lifecycle events. As structured JSON. Without requiring opt-in to Memory, without parsing runtime stdout, and without merging multiple log groups.

Why

I'm building a per-session observability UI for a fleet of Strands-based AgentCore Runtime agents. To render a timeline showing what tools the agent called, what args it passed, and what each tool returned, I have to combine two CloudWatch data sources via substring matching on session ID. The data already exists in the platform; customers shouldn't have to glue it together.

What's available today and why none of it works alone

| Source | Has | Doesn't have |
|---|---|---|
| aws/spans (Transaction Search OTEL log group) | Span name, kind, duration, tool name, tool status, gen_ai.tool.call.id | Tool args, tool result, model prompt/response (Transaction Search truncates or drops large attribute payloads) |
| Runtime stdout /aws/bedrock-agentcore/runtimes/<runtime>-<id>-DEFAULT | Full conversation: tool args + results, model I/O, embedded as JSON-strings under body.input.messages[].content | No span/timing structure |
| AgentCore Memory list_events | Conversation turns if the agent calls memory_create_event | Tool args/results unless the agent explicitly logs them as events |
| Bedrock model invocation logs | Prompt + completion per call, includes tool_use blocks | tool_result blocks come in the next invocation's prompt; per-invocation, not per-session |

Concrete repro that the data isn't in spans

fields name, attributes.gen_ai.tool.name, attributes.gen_ai.tool.call.id,
       attributes.gen_ai.tool.arguments, attributes.gen_ai.tool.response
| filter name like "execute_tool"
| limit 5

gen_ai.tool.name and gen_ai.tool.call.id are populated. gen_ai.tool.arguments and gen_ai.tool.response come back empty: even though Strands emits them, Transaction Search drops them. So spans alone cannot answer "what did this tool call do."

What I have to do today

  1. Query aws/spans filtered on attributes.session.id for span structure (name, duration, toolCallId).
  2. Query the runtime log group for the same session, filter for tool_call, parse a JSON-string nested inside another JSON object (body.input.messages[].content decodes to [{role, parts:[{type:"tool_call"|"tool_call_response", id, arguments|response}]}]).
  3. Join the two on toolCallId.
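For reference, the three steps above can be sketched roughly as follows. This is a minimal illustration, not the code I run in production: it assumes the span rows and runtime log records have already been fetched and parsed into dicts, and the span field names (name, durationMs, toolCallId) and the helper names extract_tool_io and join_spans_with_io are hypothetical.

```python
import json
from typing import Any


def extract_tool_io(log_record: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Collect tool args/results keyed by tool call ID from one runtime log record.

    Assumes body.input.messages[].content is a JSON-encoded string that decodes
    to [{role, parts: [{type, id, arguments | response}]}].
    """
    io: dict[str, dict[str, Any]] = {}
    messages = log_record.get("body", {}).get("input", {}).get("messages", [])
    for message in messages:
        content = message.get("content")
        if not isinstance(content, str):
            continue
        try:
            decoded = json.loads(content)
        except json.JSONDecodeError:
            continue  # plain-text content, not a nested JSON payload
        if not isinstance(decoded, list):
            continue
        for inner in decoded:
            if not isinstance(inner, dict):
                continue
            for part in inner.get("parts", []):
                kind, call_id = part.get("type"), part.get("id")
                if call_id is None:
                    continue
                if kind == "tool_call":
                    io.setdefault(call_id, {})["args"] = part.get("arguments")
                elif kind == "tool_call_response":
                    io.setdefault(call_id, {})["result"] = part.get("response")
    return io


def join_spans_with_io(
    spans: list[dict[str, Any]], log_records: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Attach args/result from the runtime logs to each span, joined on toolCallId."""
    io: dict[str, dict[str, Any]] = {}
    for record in log_records:
        for call_id, fields in extract_tool_io(record).items():
            io.setdefault(call_id, {}).update(fields)
    timeline = []
    for span in spans:
        fields = io.get(span.get("toolCallId"), {})
        # Spans with no matching log entry keep args/result as None.
        timeline.append(
            {**span, "args": fields.get("args"), "result": fields.get("result")}
        )
    return timeline
```

Every unmatched span silently ends up with args and result set to None, which is exactly the failure mode I hit when the payload shape changed.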

This is fragile in specific, reproducible ways:

  • Undocumented payload shape that changes under you. Setting OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental silently changed runtime stdout shape: toolUse/toolResult blocks in body.output.messages[].content[] became tool_call/tool_call_response parts in body.input.messages[].content (now a JSON-encoded string instead of a structured array). My parser returned zero matches until rewritten. Neither shape is documented.
  • Undocumented log group naming. /aws/bedrock-agentcore/runtimes/<runtime>-<suffix>-DEFAULT is a contract I discover at runtime via DescribeLogGroups.
  • No foreign key. Session ID matching across log groups is substring-based — the only link is a string the agent has to remember to embed in both places.
  • Account-wide opt-in. Transaction Search must be enabled at the account level. Any customer without it gets an empty timeline.
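To illustrate the payload-shape problem, here is a rough sketch of a parser that tolerates both shapes I have seen. Since neither shape is documented, everything here is inferred from observed logs: the toolUseId, input, and content field names for the older shape are an assumption borrowed from Bedrock Converse-style blocks, and iter_tool_events is a hypothetical helper name.

```python
import json
from typing import Any, Iterator


def _as_list(content: Any) -> list[Any]:
    """Return content as a list, decoding it first if it is a JSON string."""
    if isinstance(content, str):
        try:
            content = json.loads(content)
        except json.JSONDecodeError:
            return []
    return content if isinstance(content, list) else []


def iter_tool_events(record: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield normalized {kind, id, payload} events from either log shape."""
    body = record.get("body", {})

    # Newer shape: body.input.messages[].content is a JSON string that
    # decodes to [{role, parts: [{type, id, arguments | response}]}].
    for message in body.get("input", {}).get("messages", []):
        for inner in _as_list(message.get("content")):
            if not isinstance(inner, dict):
                continue
            for part in inner.get("parts", []):
                kind = part.get("type")
                if kind == "tool_call":
                    yield {"kind": "call", "id": part.get("id"),
                           "payload": part.get("arguments")}
                elif kind == "tool_call_response":
                    yield {"kind": "result", "id": part.get("id"),
                           "payload": part.get("response")}

    # Older shape: body.output.messages[].content[] holds toolUse/toolResult
    # blocks (inner field names assumed, see lead-in).
    for message in body.get("output", {}).get("messages", []):
        for block in _as_list(message.get("content")):
            if not isinstance(block, dict):
                continue
            if "toolUse" in block:
                use = block["toolUse"]
                yield {"kind": "call", "id": use.get("toolUseId"),
                       "payload": use.get("input")}
            elif "toolResult" in block:
                res = block["toolResult"]
                yield {"kind": "result", "id": res.get("toolUseId"),
                       "payload": res.get("content")}
```

Even with both branches in place, a third undocumented shape would again yield zero events without raising, so a parser like this can only be defensive, never correct by contract.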

Why the obvious workarounds don't solve it

  • "Use AgentCore Memory list_events." Memory is a separate product the agent has to opt into and explicitly write events to. Strands tool calls aren't auto-persisted. Forcing every team that wants observability to adopt Memory is a strange coupling.
  • "Use Bedrock model invocation logging." Captures the model's outgoing tool_use blocks. The tool_result block from the actual tool only appears in the next invocation's prompt, embedded in messages history. You'd have to scan every invocation to reassemble one tool round-trip — tool name in one entry, result in the next. Not a session-shaped API.
  • "Export OTEL to your own backend with larger attribute limits." Sidesteps the indexer truncation but pushes the merge-and-parse problem onto every customer's infra. Defeats the purpose of Transaction Search.
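To make the model-invocation-log workaround concrete, this is roughly the reassembly it would require. It is a simplified sketch: the prompt_messages and completion keys are a hypothetical flattened envelope, not the real invocation log format, and the tool_use / tool_result block fields (id, name, input, tool_use_id, content) are assumed to follow the Anthropic Messages convention.

```python
from typing import Any


def pair_tool_round_trips(invocations: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Pair each tool_use block with the tool_result found in a later prompt.

    `invocations` must be ordered by time. Each entry is assumed to look like
    {"prompt_messages": [{"role", "content": [...]}], "completion": [...]}.
    """
    calls: dict[str, dict[str, Any]] = {}
    order: list[str] = []
    for invocation in invocations:
        # Results for earlier calls arrive inside this invocation's prompt history.
        for message in invocation.get("prompt_messages") or []:
            for block in message.get("content") or []:
                if isinstance(block, dict) and block.get("type") == "tool_result":
                    call = calls.get(block.get("tool_use_id"))
                    if call is not None:
                        call["result"] = block.get("content")
        # New tool calls are emitted in this invocation's completion.
        for block in invocation.get("completion") or []:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                call_id = block.get("id")
                if call_id not in calls:
                    calls[call_id] = {
                        "toolCallId": call_id,
                        "toolName": block.get("name"),
                        "args": block.get("input"),
                        "result": None,
                    }
                    order.append(call_id)
    return [calls[call_id] for call_id in order]
```

Note that this still yields no timing or status per tool call, and it requires knowing which invocations belong to the session in the first place.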

What "right" looks like

{
  "sessionId": "...",
  "runtimeArn": "...",
  "events": [
    {
      "type": "model_call",
      "timestamp": "...",
      "durationMs": 2800,
      "modelId": "...",
      "promptTokens": 1234,
      "completionTokens": 567,
      "messages": [...]
    },
    {
      "type": "tool_call",
      "timestamp": "...",
      "durationMs": 58,
      "toolName": "load_blueprint",
      "toolCallId": "tooluse_...",
      "args": { "blueprint_name": "requirements-analyst" },
      "result": "...",
      "status": "success"
    },
    { "type": "model_call", "...": "..." }
  ]
}
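With a response shaped like the one above, the client side of my timeline UI would shrink to something like the following sketch. GetSessionTrace does not exist today, so the input here is just a dict in the proposed shape; render_timeline and its output format are purely illustrative.

```python
from typing import Any


def render_timeline(trace: dict[str, Any]) -> list[str]:
    """Turn a proposed GetSessionTrace response into one display line per event."""
    lines = []
    for event in trace.get("events", []):
        kind = event.get("type")
        if kind == "tool_call":
            label = f"tool {event.get('toolName')} [{event.get('status')}]"
        elif kind == "model_call":
            label = (
                f"model {event.get('modelId')} "
                f"({event.get('promptTokens')} in / {event.get('completionTokens')} out)"
            )
        else:
            # Lifecycle or unknown event types are shown by type name only.
            label = str(kind)
        lines.append(f"{event.get('durationMs')} ms  {label}")
    return lines
```

No joins, no nested JSON decoding, no log group discovery: the events arrive ordered and already carry args, result, and timing together.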

The platform already has this data — model logs have prompt/completion, runtime stdout has tool I/O, OTEL has timing. Customers shouldn't be gluing it together with substring matches against undocumented log shapes.

Environment

  • Region: us-east-1
  • Runtime: AgentCore Runtime (Strands-based agents)
  • Telemetry: OTEL → CloudWatch Transaction Search
  • Affected agent flag: OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental
