Describe the bug
The Gemini chat client only surfaces input, output, and total token counts in usage_details. Gemini's GenerateContentResponseUsageMetadata also reports cached_content_token_count (tokens served from context cache) and thoughts_token_count (tokens spent on thinking by reasoning models), but _parse_usage drops both. So for cached prompts and thinking models, cache and reasoning usage silently read as zero, which throws off cost and token accounting.
UsageDetails already has canonical fields for these (cache_read_input_token_count, reasoning_output_token_count), and the OpenAI and Anthropic connectors already populate them. Gemini is the only connector of the three that does not.
Where
python/packages/gemini/agent_framework_gemini/_chat_client.py, RawGeminiChatClient._parse_usage.
Expected behavior
When the API returns cached_content_token_count / thoughts_token_count, map them to cache_read_input_token_count / reasoning_output_token_count in usage_details, matching the OpenAI and Anthropic connectors.
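A rough sketch of the mapping I have in mind, not the actual `_parse_usage` code. `FakeUsageMetadata` and `parse_usage` are hypothetical stand-ins so the snippet runs on its own, and the `input_token_count` / `output_token_count` / `total_token_count` keys are my assumption about how the three existing counts are named in usage_details. Only the last two entries of the map are the proposed change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeUsageMetadata:
    """Hypothetical stand-in for GenerateContentResponseUsageMetadata."""

    prompt_token_count: Optional[int] = None
    candidates_token_count: Optional[int] = None
    total_token_count: Optional[int] = None
    cached_content_token_count: Optional[int] = None
    thoughts_token_count: Optional[int] = None


# Gemini attribute name -> usage_details key.
# The first three key names are assumed; the last two are the fields this issue asks for.
USAGE_FIELD_MAP = {
    "prompt_token_count": "input_token_count",
    "candidates_token_count": "output_token_count",
    "total_token_count": "total_token_count",
    "cached_content_token_count": "cache_read_input_token_count",
    "thoughts_token_count": "reasoning_output_token_count",
}


def parse_usage(metadata: Optional[FakeUsageMetadata]) -> dict:
    """Copy every count the API actually returned; skip fields that are None."""
    details: dict = {}
    if metadata is None:
        return details
    for source_name, target_key in USAGE_FIELD_MAP.items():
        value = getattr(metadata, source_name, None)
        if value is not None:
            details[target_key] = value
    return details


usage = parse_usage(
    FakeUsageMetadata(
        prompt_token_count=100,
        candidates_token_count=20,
        total_token_count=150,
        cached_content_token_count=60,
        thoughts_token_count=30,
    )
)
print(usage)
```

Fields the API leaves unset are skipped rather than written as 0, so a response without caching or thinking produces the same usage_details as today.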