[Feature Request] GetSessionTrace API — structured per-session trace (model calls + tool args/results + timing) without cross-source joins

## Ask

AgentCore Runtime should expose a first-class **session trace API**, e.g. `GetSessionTrace(sessionId, runtimeArn)`, that returns the ordered, structured trace of an agent session as JSON: every model call (prompt, completion, token counts, latency), every tool call (name, full args, full result, duration, status), and lifecycle events. It should not require opting in to Memory, parsing runtime stdout, or merging multiple log groups.

## Why

I'm building a per-session observability UI for a fleet of Strands-based AgentCore Runtime agents. To render a timeline showing what tools the agent called, what args it passed, and what each tool returned, I have to combine two CloudWatch data sources via substring matching on session ID. The data already exists in the platform; customers shouldn't have to glue it together.

## What's available today and why none of it works alone

| Source | Has | Doesn't have |
|---|---|---|
| `aws/spans` (Transaction Search OTEL log group) | Span name, kind, duration, tool name, tool status, `gen_ai.tool.call.id` | Tool args, tool result, model prompt/response — Transaction Search truncates/drops large attribute payloads |
| Runtime stdout `/aws/bedrock-agentcore/runtimes/<runtime>-<id>-DEFAULT` | Full conversation: tool args + results, model I/O, embedded as JSON-strings under `body.input.messages[].content` | No span/timing structure |
| AgentCore Memory `list_events` | Conversation turns *if* the agent calls `memory_create_event` | Tool args/results unless the agent explicitly logs them as events |
| Bedrock model invocation logs | Prompt + completion per call, includes `tool_use` blocks | `tool_result` blocks come in the *next* invocation's prompt; per-invocation, not per-session |

## Concrete repro that the data isn't in spans

```
fields name, attributes.gen_ai.tool.name, attributes.gen_ai.tool.call.id,
       attributes.gen_ai.tool.arguments, attributes.gen_ai.tool.response
| filter name like "execute_tool"
| limit 5
```

`gen_ai.tool.name` and `gen_ai.tool.call.id` are populated. `gen_ai.tool.arguments` and `gen_ai.tool.response` come back empty: Strands emits them, but Transaction Search drops them. So spans alone cannot answer "what did this tool call do."

## What I have to do today

1. Query `aws/spans` filtered on `attributes.session.id` for span structure (name, duration, `gen_ai.tool.call.id`).
2. Query the runtime log group for the same session, filter for `tool_call`, parse a JSON-string nested inside another JSON object (`body.input.messages[].content` decodes to `[{role, parts:[{type:"tool_call"|"tool_call_response", id, arguments|response}]}]`).
3. Join the two on the tool call ID (`gen_ai.tool.call.id` on the span, `id` on the decoded part).
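For reference, a minimal Python sketch of steps 2 and 3, operating on records that have already been fetched. The span dict layout (`name`, `duration_ms`, `tool_call_id`) and the function names are hypothetical; only the nested `body.input.messages[].content` shape comes from what I observed above.

```python
import json


def extract_tool_io(log_record):
    """Collect tool args/results keyed by tool call ID from one runtime log record."""
    tool_io = {}
    messages = log_record.get("body", {}).get("input", {}).get("messages", [])
    for message in messages:
        content = message.get("content")
        if not isinstance(content, str):
            continue  # only the JSON-encoded string shape is handled here
        try:
            decoded = json.loads(content)
        except json.JSONDecodeError:
            continue  # plain-text content, not an encoded message list
        if not isinstance(decoded, list):
            continue
        for inner in decoded:
            parts = inner.get("parts", []) if isinstance(inner, dict) else []
            for part in parts:
                if not isinstance(part, dict) or "id" not in part:
                    continue
                if part.get("type") == "tool_call":
                    tool_io.setdefault(part["id"], {})["args"] = part.get("arguments")
                elif part.get("type") == "tool_call_response":
                    tool_io.setdefault(part["id"], {})["result"] = part.get("response")
    return tool_io


def join_on_tool_call_id(spans, log_records):
    """Attach args/result to each span; spans use a hypothetical flattened layout."""
    tool_io = {}
    for record in log_records:
        for call_id, io in extract_tool_io(record).items():
            tool_io.setdefault(call_id, {}).update(io)
    joined = []
    for span in spans:
        io = tool_io.get(span["tool_call_id"], {})
        joined.append({**span, "args": io.get("args"), "result": io.get("result")})
    return joined
```

A span whose ID never appears in the runtime logs comes back with `args` and `result` set to `None`, which is exactly the silent gap the UI has to paper over.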

This is fragile in specific, reproducible ways:

- **Undocumented payload shape that changes under you.** Setting `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` silently changed runtime stdout shape: `toolUse`/`toolResult` blocks in `body.output.messages[].content[]` became `tool_call`/`tool_call_response` parts in `body.input.messages[].content` (now a JSON-encoded string instead of a structured array). My parser returned zero matches until rewritten. Neither shape is documented.
- **Undocumented log group naming.** `/aws/bedrock-agentcore/runtimes/<runtime>-<suffix>-DEFAULT` is a contract I discover at runtime via `DescribeLogGroups`.
- **No foreign key.** Session ID matching across log groups is substring-based — the only link is a string the agent has to remember to embed in both places.
- **Account-wide opt-in.** Transaction Search must be enabled at the account level. Any customer without it gets an empty timeline.
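To keep a shape change from surfacing as "zero matches", a parser can at least detect which shape it is looking at and fail loudly when neither is present. The following is a rough Python sketch, not a documented contract: the shape labels and function names are made up, and the check for the older shape only looks for the `toolUse`/`toolResult` keys without assuming anything about their inner fields.

```python
import json
from collections import Counter

TOOL_PART_TYPES = {"tool_call", "tool_call_response"}


def _has_tool_parts(decoded):
    """True if a decoded content string holds tool_call / tool_call_response parts."""
    if not isinstance(decoded, list):
        return False
    for inner in decoded:
        parts = inner.get("parts", []) if isinstance(inner, dict) else []
        if any(isinstance(p, dict) and p.get("type") in TOOL_PART_TYPES for p in parts):
            return True
    return False


def detect_payload_shape(record):
    """Classify one runtime log record by where its tool payload lives."""
    body = record.get("body", {})
    # Older shape: structured toolUse/toolResult blocks under body.output.
    for message in body.get("output", {}).get("messages", []):
        content = message.get("content")
        if isinstance(content, list) and any(
            isinstance(block, dict) and ("toolUse" in block or "toolResult" in block)
            for block in content
        ):
            return "structured_blocks"
    # Newer shape: JSON-encoded string under body.input.
    for message in body.get("input", {}).get("messages", []):
        content = message.get("content")
        if isinstance(content, str):
            try:
                decoded = json.loads(content)
            except json.JSONDecodeError:
                continue
            if _has_tool_parts(decoded):
                return "json_string_parts"
    return "unknown"


def require_known_shape(records):
    """Count shapes across a list of records; raise if none is recognizable."""
    counts = Counter(detect_payload_shape(r) for r in records)
    if records and set(counts) == {"unknown"}:
        raise ValueError("no record matched a known tool payload shape")
    return counts
```

This only turns a silent failure into a loud one. It does nothing about the underlying problem that neither shape is documented.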

## Why the obvious workarounds don't solve it

- **"Use AgentCore Memory `list_events`."** Memory is a separate product the agent has to opt into and explicitly write events to. Strands tool calls aren't auto-persisted. Forcing every team that wants observability to adopt Memory is a strange coupling.
- **"Use Bedrock model invocation logging."** Captures the model's outgoing `tool_use` blocks. The `tool_result` block from the actual tool only appears in the *next* invocation's prompt, embedded in messages history. You'd have to scan every invocation to reassemble one tool round-trip — tool name in one entry, result in the next. Not a session-shaped API.
- **"Export OTEL to your own backend with larger attribute limits."** Sidesteps the indexer truncation but pushes the merge-and-parse problem onto every customer's infra. Defeats the purpose of Transaction Search.
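To make the invocation-log workaround concrete, here is a rough Python sketch of the reassembly it requires. The record layout (`prompt_messages`, `completion`) is a simplified, hypothetical stand-in for real invocation log entries, and the block fields (`id`, `name`, `input`, `tool_use_id`, `content`) are assumed to follow the Anthropic Messages API naming for `tool_use` and `tool_result`.

```python
def pair_tool_round_trips(invocations):
    """Pair each tool_use with the tool_result that shows up in a later prompt.

    invocations: ordered list of {"prompt_messages": [...], "completion": [...]}
    (hypothetical simplified layout). Returns (round_trips, unanswered_calls).
    """
    pending = {}      # tool_use id -> call details awaiting a result
    round_trips = []
    for index, invocation in enumerate(invocations):
        # Results for earlier calls only appear inside this invocation's prompt.
        for message in invocation.get("prompt_messages", []):
            content = message.get("content")
            if not isinstance(content, list):
                continue  # plain string content carries no blocks
            for block in content:
                if isinstance(block, dict) and block.get("type") == "tool_result":
                    # pop() so history replayed in later prompts is matched once
                    call = pending.pop(block.get("tool_use_id"), None)
                    if call is not None:
                        round_trips.append(
                            {**call, "result": block.get("content"), "result_seen_in": index}
                        )
        # New calls come from the model's completion.
        for block in invocation.get("completion", []):
            if isinstance(block, dict) and block.get("type") == "tool_use":
                pending[block["id"]] = {
                    "tool_use_id": block["id"],
                    "name": block.get("name"),
                    "input": block.get("input"),
                    "requested_in": index,
                }
    return round_trips, list(pending.values())
```

Every round-trip spans at least two log entries (`requested_in` and `result_seen_in`), and the last tool call of a session has no result at all if the session ends before another model invocation. Timing for the tool itself is still missing.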

## What "right" looks like

```json
{
  "sessionId": "...",
  "runtimeArn": "...",
  "events": [
    {
      "type": "model_call",
      "timestamp": "...",
      "durationMs": 2800,
      "modelId": "...",
      "promptTokens": 1234,
      "completionTokens": 567,
      "messages": [...]
    },
    {
      "type": "tool_call",
      "timestamp": "...",
      "durationMs": 58,
      "toolName": "load_blueprint",
      "toolCallId": "tooluse_...",
      "args": { "blueprint_name": "requirements-analyst" },
      "result": "...",
      "status": "success"
    },
    { "type": "model_call", "...": "..." }
  ]
}
```

The platform already has this data — model logs have prompt/completion, runtime stdout has tool I/O, OTEL has timing. Customers shouldn't be gluing it together with substring matches against undocumented log shapes.
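For contrast, here is roughly all the client code the timeline would need if the response looked like the JSON above. This is a sketch against the proposed, not yet existing, response shape; `render_timeline` is a hypothetical helper and fetching the trace is out of scope.

```python
def render_timeline(trace):
    """Turn a proposed GetSessionTrace response into one text line per event."""
    lines = []
    for event in trace["events"]:
        kind = event["type"]
        if kind == "tool_call":
            label = f'{event["toolName"]} [{event["status"]}]'
        elif kind == "model_call":
            tokens = f'{event["promptTokens"]}+{event["completionTokens"]} tokens'
            label = f'{event["modelId"]} ({tokens})'
        else:
            label = kind  # lifecycle or unrecognized events: show the type only
        duration = event.get("durationMs", 0)  # lifecycle events may have no duration
        lines.append(f"{duration:>6} ms  {kind:<10}  {label}")
    return lines
```

No joins, no nested JSON-string decoding, no log group discovery: the ordering and the tool I/O arrive together.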

## Environment

- Region: us-east-1
- Runtime: AgentCore Runtime (Strands-based agents)
- Telemetry: OTEL → CloudWatch Transaction Search
- Affected agent flag: `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental`

