OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental breaks all builtin online evaluators (AgentSpanMappingException) #1427

## Summary

Setting the environment variable `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` on AgentCore runtimes that use the Strands Agents SDK causes **all** builtin online evaluators to fail with:

```
error.type: AgentSpanMappingException
error.message: Failed to parse user_query from agent-span with spanId: <id> and scope: strands.telemetry.tracer
```

## Environment

- **Region:** us-east-1
- **Runtimes affected:** 16 runtimes using Strands Agents SDK
- **ADOT auto-instrumentor:** `telemetry.auto.version: 0.17.1-aws`
- **Evaluators affected:** All builtin evaluators (Helpfulness, Correctness, GoalSuccessRate, Coherence, Faithfulness, InstructionFollowing, ToolSelectionAccuracy, ToolParameterAccuracy)

## Timeline

| Time | Event |
|---|---|
| 2026-05-29 12:40 | Last successful eval scores (ToolSelectionAccuracy = 1.0) |
| 2026-05-29 ~12:45–13:00 | `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` applied via `update_agent_runtime` |
| 2026-05-29 13:29 | First `AgentSpanMappingException`; 100% failure rate on all evaluators |

## Root Cause

With the env var set, message content in the OTEL log records emitted under scope `strands.telemetry.tracer` is serialized differently. The evaluator parses `user_query` from these log records and accepts only the original format.

**Before (working)** — `body.input.messages[].content` is a **dict**:
```json
{
  "role": "user",
  "content": {
    "content": "[{\"text\": \"Your actual prompt here...\"}]"
  }
}
```

**After (broken)** — `body.input.messages[].content` is a **raw string** in GenAI semconv format:
```json
{
  "role": "user",
  "content": "[{\"role\": \"user\", \"parts\": [{\"type\": \"text\", \"content\": \"Your actual prompt here...\"}]}]"
}
```

The evaluator's span parser expects the dict-with-nested-`content`-key structure and throws `AgentSpanMappingException` when it encounters the flattened string format.
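
To confirm which shape a given runtime is emitting, the two formats above can be told apart mechanically. The following is a minimal sketch, assuming the log record body has already been loaded as a Python dict with the `input.messages[].content` layout shown above. The function names and the labels `dict_wrapper`, `semconv_string`, and `unknown` are made up for illustration and are not part of any AgentCore or Strands API.

```python
import json
from typing import Any


def classify_content(content: Any) -> str:
    """Label one message's `content` value by its observed shape."""
    # Working shape: a dict that wraps the payload under a nested "content" key.
    if isinstance(content, dict) and "content" in content:
        return "dict_wrapper"
    # Broken shape: a JSON string holding a list of {"role", "parts"} entries.
    if isinstance(content, str):
        try:
            decoded = json.loads(content)
        except json.JSONDecodeError:
            return "unknown"
        if isinstance(decoded, list) and any(
            isinstance(item, dict) and "parts" in item for item in decoded
        ):
            return "semconv_string"
    return "unknown"


def classify_record(body: dict) -> list:
    """Return one label per message found under body["input"]["messages"]."""
    messages = body.get("input", {}).get("messages", [])
    return [
        classify_content(m.get("content")) if isinstance(m, dict) else "unknown"
        for m in messages
    ]
```

Running `classify_record` over records captured before and after the env var change should show the labels switch from `dict_wrapper` to `semconv_string` if the diagnosis above holds.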

## Important Notes

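1. **Strands attribute names do not change with this env var**; the SDK appears to hardcode them. What does change is how message content is serialized within the Strands-emitted log records. Which component actually reads the env var (the Strands tracer or the ADOT layer) has not been confirmed here.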
2. **The `gen_ai.user.message` events from the ADOT botocore instrumentor are unaffected** — those remain identical in both modes. Only the Strands-emitted log records change.
3. **All evaluators fail identically** — the issue is in span parsing, not evaluation logic.

## Reproduction Steps

1. Create an AgentCore runtime with Strands Agents SDK
2. Configure online evaluations with any builtin evaluator
3. Invoke the agent — observe successful evaluation scores
4. Set `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` via `update_agent_runtime`
5. Invoke the agent again — observe `AgentSpanMappingException` on all evaluators

## Expected Behavior

One of the following should hold:
- AgentCore's builtin evaluators support both message content formats (dict wrapper and GenAI semconv string), OR
- The documentation states that `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` is incompatible with online evaluations, OR
- The ADOT layer does not alter the Strands log record serialization format based on this env var
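
For the first option, a parser that tolerates both shapes could look roughly like the sketch below. It is written only against the two examples in Root Cause and is not the evaluator's actual implementation. `extract_user_text` and `_loads_list` are hypothetical names, and returning `None` for unrecognized input (rather than raising) is a choice made for the illustration.

```python
import json
from typing import Any, Optional


def _loads_list(raw: Any) -> list:
    """Decode a JSON string and return it only if it is a list; else []."""
    if not isinstance(raw, str):
        return []
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return []
    return value if isinstance(value, list) else []


def extract_user_text(message: dict) -> Optional[str]:
    """Pull the prompt text out of a message in either observed format."""
    content = message.get("content")
    texts = []
    if isinstance(content, dict):
        # Dict wrapper: {"content": "<JSON list of {'text': ...} blocks>"}
        for block in _loads_list(content.get("content")):
            if isinstance(block, dict) and isinstance(block.get("text"), str):
                texts.append(block["text"])
    elif isinstance(content, str):
        # GenAI semconv string: JSON list of {"role", "parts": [...]} entries
        for entry in _loads_list(content):
            parts = entry.get("parts") if isinstance(entry, dict) else None
            if not isinstance(parts, list):
                continue
            for part in parts:
                if (
                    isinstance(part, dict)
                    and part.get("type") == "text"
                    and isinstance(part.get("content"), str)
                ):
                    texts.append(part["content"])
    # Join multiple text parts; None signals that nothing could be parsed.
    return "\n".join(texts) if texts else None
```

Dispatching on the Python type of `content` keeps the existing dict path untouched, so runtimes without the env var would behave as before.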

## Workaround

Remove `OTEL_SEMCONV_STABILITY_OPT_IN` from runtime environment variables entirely. Evaluations resume working immediately on the next agent invocation.
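
If the runtime's environment variables are managed in code, the removal can be scripted. The sketch below is a variant of the workaround, not the exact step that was tested: because `OTEL_SEMCONV_STABILITY_OPT_IN` takes a comma-separated list, it drops only the `gen_ai_latest_experimental` entry and removes the variable when nothing else remains. Whether other opt-in values are safe for the evaluators is an assumption that has not been verified. `strip_gen_ai_opt_in` is a hypothetical helper, and pushing the resulting dict back to the runtime (for example via `update_agent_runtime`) is deliberately not shown.

```python
OPT_IN_VAR = "OTEL_SEMCONV_STABILITY_OPT_IN"
BROKEN_VALUE = "gen_ai_latest_experimental"


def strip_gen_ai_opt_in(env: dict) -> dict:
    """Return a copy of `env` without the gen_ai opt-in value."""
    cleaned = dict(env)  # copy so the caller's dict is not mutated
    raw = cleaned.pop(OPT_IN_VAR, None)
    if raw is None:
        return cleaned
    # Keep any other comma-separated opt-in values that were set.
    kept = [
        value.strip()
        for value in raw.split(",")
        if value.strip() and value.strip() != BROKEN_VALUE
    ]
    if kept:
        cleaned[OPT_IN_VAR] = ",".join(kept)
    return cleaned
```

When `gen_ai_latest_experimental` is the only value set, the result matches the workaround exactly: the variable is gone.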
