🔴 Required Information
Describe the Bug:
ADK's OpenTelemetry tracing does not export thoughts_token_count to Cloud Trace span attributes. When using Gemini models with ThinkingConfig, the usage_metadata in LlmResponse correctly contains thoughts_token_count (verified via Event.usage_metadata), but this field is never written to the OpenTelemetry span. Only gen_ai.usage.input_tokens (from prompt_token_count) and gen_ai.usage.output_tokens (from candidates_token_count) are exported.
This makes it impossible to monitor or analyze thinking token consumption via Cloud Trace, Cloud Monitoring, or any observability pipeline that relies on span attributes.
The gap is in two locations in google/adk/telemetry/tracing.py:
- trace_call_llm() (lines ~329-339) — exports gen_ai.usage.input_tokens and gen_ai.usage.output_tokens but not thinking tokens
- trace_generate_content_result() (lines ~591-599) — same: only GEN_AI_USAGE_INPUT_TOKENS and GEN_AI_USAGE_OUTPUT_TOKENS are set
Steps to Reproduce:
- Install google-adk==1.26.0 and google-genai>=1.65.0
- Create an agent with ThinkingConfig(thinking_level="high") and enable Cloud Trace export
- Run a query and inspect the resulting Cloud Trace spans
- Observe that thoughts_token_count / thinking_token_count is absent from span attributes
- For comparison, inspect Event.usage_metadata.thoughts_token_count directly — it is non-zero
Expected Behavior:
When usage_metadata.thoughts_token_count is non-None and non-zero in the model response, ADK should export it as a span attribute (e.g., gen_ai.usage.thinking_tokens or similar) alongside the existing gen_ai.usage.input_tokens and gen_ai.usage.output_tokens.
Observed Behavior:
thoughts_token_count is present in Event.usage_metadata (verified programmatically) but is never exported to OpenTelemetry spans. Cloud Trace shows gen_ai.usage.input_tokens and gen_ai.usage.output_tokens but no thinking token attribute.
Environment Details:
- ADK Library Version: google-adk==1.26.0
- Desktop OS: macOS
- Python Version: 3.11.7
Model Information:
- Are you using LiteLLM: No
- Which model is being used: gemini-2.5-flash (also reproducible with other Gemini models that support thinking)
🟡 Optional Information
Regression:
Unknown — this appears to have never been implemented rather than being a regression.
Logs:
# Event.usage_metadata shows thinking tokens correctly:
Event author=weather_agent: prompt=53 candidates=6 thoughts=50 total=109
Event author=weather_agent: prompt=121 candidates=17 thoughts=None total=138
# But Cloud Trace span only has:
# gen_ai.usage.input_tokens = 53
# gen_ai.usage.output_tokens = 6
# No thinking_token attribute exists on the span.
Additional Context:
The root cause is visible in google/adk/telemetry/tracing.py. Both trace_call_llm() and trace_generate_content_result() extract and export only two token metrics from usage_metadata:
# In trace_call_llm() (~line 329):
if llm_response.usage_metadata is not None:
  if llm_response.usage_metadata.prompt_token_count is not None:
    span.set_attribute('gen_ai.usage.input_tokens', ...)
  if llm_response.usage_metadata.candidates_token_count is not None:
    span.set_attribute('gen_ai.usage.output_tokens', ...)
  # thoughts_token_count is NOT exported here
A fix would add:
if llm_response.usage_metadata.thoughts_token_count is not None:
  span.set_attribute(
      'gen_ai.usage.thinking_tokens',
      llm_response.usage_metadata.thoughts_token_count,
  )
(The attribute name gen_ai.usage.thinking_tokens is a suggestion — the OpenTelemetry GenAI semantic conventions may not yet define a standard name for this, but a vendor-prefixed alternative like gcp.vertex.agent.usage.thinking_tokens would also work.)
The same addition is needed in trace_generate_content_result() (~line 591).
Minimal Reproduction Code:
# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "google-adk>=1.26.0",
#   "google-genai>=1.65.0",
# ]
# ///
"""Minimal reproduction: thoughts_token_count present in Event but missing from traces.

Run with:
    export GOOGLE_GENAI_USE_VERTEXAI=true
    export GOOGLE_CLOUD_PROJECT=<your-project>
    uv run repro_thinking_trace.py
"""
import asyncio

from google.adk import Runner
from google.adk.agents import Agent
from google.adk.memory import InMemoryMemoryService
from google.adk.sessions import InMemorySessionService
from google.genai import types


def get_weather(city: str) -> dict:
    """Get current weather for a city."""
    return {"city": city, "temp_f": 72, "condition": "sunny"}


async def main():
    agent = Agent(
        model="gemini-2.5-flash",
        name="weather_agent",
        instruction="You are a weather assistant. Always use the get_weather tool.",
        tools=[get_weather],
        generate_content_config=types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(thinking_budget=2048),
        ),
        disallow_transfer_to_parent=True,
        disallow_transfer_to_peers=True,
    )
    session_service = InMemorySessionService()
    runner = Runner(
        agent=agent,
        app_name="repro",
        session_service=session_service,
        memory_service=InMemoryMemoryService(),
    )
    session = await session_service.create_session(
        user_id="test", app_name="repro"
    )
    message = types.Content(
        parts=[types.Part(text="What's the weather in San Francisco?")]
    )
    async for event in runner.run_async(
        new_message=message, user_id="test", session_id=session.id
    ):
        usage = getattr(event, "usage_metadata", None)
        if usage is not None:
            thoughts = getattr(usage, "thoughts_token_count", None)
            print(
                f"Event author={event.author}: "
                f"prompt={usage.prompt_token_count} "
                f"candidates={usage.candidates_token_count} "
                f"thoughts={thoughts} "
                f"total={usage.total_token_count}"
            )
            if thoughts is not None and thoughts > 0:
                print(
                    "  ^^^ thoughts_token_count is non-zero in Event, "
                    "but will NOT appear in Cloud Trace span attributes"
                )


if __name__ == "__main__":
    asyncio.run(main())
How often has this issue occurred?:
- Always (100%) — thoughts_token_count is never exported to spans.