Description
Is your feature request related to a specific problem?
It's clear that Vertex AI-related features have specific pricing, but there are hidden costs when data passes from these services to an LLM agent.
I am unable to accurately track the total token usage and associated costs for complex agentic workflows. While usage_metadata captures the context for the root agent, it misses the granular interactions between sub-agents (implemented as agent-tools).
Furthermore, when attempting to track Vertex AI-specific features, the current implementation returns a boolean dictionary indicating success or failure rather than a detailed breakdown of token consumption. This makes "real cost" calculation impossible for production environments.
Describe the Solution You'd Like
I would like an enhancement to the Vertex AI integration within the ADK that exposes detailed token usage metrics (prompt tokens, completion tokens, and total tokens) even for sub-agent interactions and specific Vertex AI features.
Specifically:
- Ensure that the response object for Vertex AI features includes a usage_metadata or token_count object instead of a simple boolean state.
- Provide a standardized way to bubble up these metrics from tool-level agents to a centralized tracker.
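For illustration, a centralized tracker could look something like the sketch below. None of these names come from the ADK; they are hypothetical and only show the shape of the "bubble up to a centralized tracker" idea.

```python
from dataclasses import dataclass, field


@dataclass
class UsageTracker:
    """Accumulates token usage reported by the root agent and its tools."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    by_source: dict = field(default_factory=dict)

    def record(self, source: str, prompt_tokens: int, completion_tokens: int) -> None:
        # Fold the reported counts into both the global totals and a
        # per-source breakdown, so costs can be attributed to each tool.
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens
        entry = self.by_source.setdefault(source, {"prompt": 0, "completion": 0})
        entry["prompt"] += prompt_tokens
        entry["completion"] += completion_tokens

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


tracker = UsageTracker()
tracker.record("root_agent", prompt_tokens=900, completion_tokens=300)
tracker.record("search_tool", prompt_tokens=250, completion_tokens=86)
print(tracker.total_tokens)  # 1536
```

The same `record` call could be made from every tool wrapper, giving one place to query totals per session.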
Impact on your work
This is critical for budget monitoring and ROI analysis of our AI agents. Without this, we are essentially "flying blind" regarding the operational costs of our sub-agent architecture. I have already developed a local workaround for external APIs, but the Vertex AI "boolean response" remains a blocker.
Willingness to contribute
Yes, I have already developed a feature to track tokens for standard agent-tools and would be interested in submitting a PR to extend this to Vertex AI features.
Describe Alternatives You've Considered
- Manual Estimation: Calculating costs based on character counts, which is inaccurate due to varying tokenization logic.
- Vertex AI Quota Logs: Checking the Google Cloud Console logs, which is not real-time and cannot be easily mapped back to specific agent sessions in code.
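To make the first alternative's weakness concrete, here is the kind of heuristic involved. The "~4 characters per token" rule is a common rough guess, not anything tokenizer-specific:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    # Heuristic only; real tokenizers split on subwords, so the true count
    # depends on the vocabulary, not on raw character length.
    return max(1, round(len(text) / chars_per_token))


print(estimate_tokens("Hello, Vertex AI!"))     # 4 by the heuristic
print(estimate_tokens("internationalization"))  # 5 by the heuristic, though a
# subword tokenizer may emit far fewer tokens for one common long word
```

Because the error compounds across many sub-agent calls, character counting is not a workable substitute for real usage metadata.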
Proposed API / Implementation
Ideally, the response from a Vertex AI tool call should look like this:
```python
# Current state:
response = {"success": True}

# Proposed state:
response = {
    "success": True,
    "usage": {
        "total_tokens": 1536
    },
    "data": "..."
}
```
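With that shape, a caller could bill each tool call directly. The helper and the per-1K rate below are placeholders for illustration, not real ADK APIs or real Vertex AI pricing:

```python
PRICE_PER_1K_TOKENS = 0.002  # placeholder rate, not an actual Vertex AI price


def cost_of_call(response: dict) -> float:
    # Read the proposed usage object; missing usage costs nothing,
    # which mirrors today's boolean-only responses.
    tokens = response.get("usage", {}).get("total_tokens", 0)
    return tokens / 1000 * PRICE_PER_1K_TOKENS


response = {"success": True, "usage": {"total_tokens": 1536}, "data": "..."}
print(f"{cost_of_call(response):.6f}")  # 0.003072
```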
Additional Context
The issue is most prominent when sub-agents act as agent-tools. Because these are treated as external calls, the root agent's metadata doesn't naturally aggregate their internal costs, particularly when those tools interface with Vertex AI-specific endpoints.
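The aggregation gap can be sketched as a thin wrapper that folds a tool's reported usage into the root agent's totals. Everything here is hypothetical: the wrapper, the stand-in sub-agent, and the usage dict shape are illustrations of the requested behavior, not existing ADK code.

```python
def call_with_usage(tool_fn, root_usage: dict, *args, **kwargs):
    """Invoke an agent-tool and fold its reported usage into the root's totals."""
    result = tool_fn(*args, **kwargs)
    for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
        root_usage[key] = root_usage.get(key, 0) + result.get("usage", {}).get(key, 0)
    return result


def fake_sub_agent(query: str) -> dict:
    # Stand-in for a sub-agent tool that reports its own usage alongside data.
    return {
        "data": query.upper(),
        "usage": {"prompt_tokens": 250, "completion_tokens": 86, "total_tokens": 336},
    }


root_usage = {"prompt_tokens": 900, "completion_tokens": 300, "total_tokens": 1200}
call_with_usage(fake_sub_agent, root_usage, "hello")
print(root_usage["total_tokens"])  # 1536
```

If the Vertex AI tool responses carried a usage object as proposed, this kind of wrapper would let the root agent's metadata reflect the true end-to-end cost.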