🚨 Production Impact: Repeated 429 Resource exhausted Errors Despite Being Within Quota (Paid Tier 1) #2001

@wo-yashsinghvi

Hi Team,

We are facing persistent 429 Resource exhausted errors while calling the Gemini model via Vertex AI, and this is impacting a production application.

❗ Error Observed

429 Resource exhausted. Please try again later.
Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429
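
Since the error text asks callers to try again later, one possible stopgap is to retry failed calls with exponential backoff and jitter. The following is a minimal sketch under stated assumptions, not a description of our current setup: `ResourceExhaustedError` is a hypothetical stand-in for whatever exception the client library raises on a 429, and `call_with_backoff`, its parameters, and the default delays are illustrative names and values only.

```python
import random
import time


class ResourceExhaustedError(Exception):
    """Hypothetical stand-in for the 429 exception raised by the client library."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=32.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on ResourceExhaustedError with capped exponential backoff.

    The delay before retry k (0-based) is a random fraction of
    min(max_delay, base_delay * 2**k), often called "full jitter".
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except ResourceExhaustedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the 429 to the caller
            # Cap the exponential delay, then scale it by a random factor in [0, 1).
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(cap * rng())
```

In practice `fn` would wrap the model call, and the `except` clause would catch the library's real 429 exception instead of the stand-in. Retrying only hides intermittent failures; it does not explain why they occur.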

🔍 Context

  • Model: Gemini 2.0 Flash
  • Environment: Production
  • Billing: Paid Tier 1 (active billing account)
  • Usage: Well within the configured quota limits
  • Integration: Using langchain-google package with Gemini (Vertex AI)

To be very clear:
➡️ This does not appear to be a LangChain issue.
➡️ The error is coming directly from Vertex AI / Gemini.

🚨 Business Impact

  • This is a live production system
  • End customers are being affected
  • Requests are failing intermittently without any quota breach
  • There is no clear signal why the resource is considered “exhausted”
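
To back up the "no quota breach" observation with numbers, the request rate at the moment of each failure could be measured on the client side. The sketch below is one way to do that, assuming a per-minute request quota; `RequestRateTracker` is a hypothetical helper and not part of any Google or LangChain library. It counts requests only, so it would not reveal a token-based limit.

```python
import time
from collections import deque


class RequestRateTracker:
    """Counts outgoing requests in a trailing time window (hypothetical helper)."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock
        self._timestamps = deque()

    def record(self):
        """Record one outgoing request; return the count inside the window."""
        now = self.clock()
        self._timestamps.append(now)
        return self._evict(now)

    def current_rate(self):
        """Return how many recorded requests fall inside the trailing window."""
        return self._evict(self.clock())

    def _evict(self, now):
        # Drop timestamps that are at least window_seconds old.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)
```

Calling `record()` before each model call and logging `current_rate()` whenever a 429 is returned gives a figure that can be compared with the quota shown in the Cloud console.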

❓ What We Need Help With

  • Why are 429 Resource exhausted errors occurring even when usage is within quota?
  • Are there any hidden limits, regional constraints, or backend throttling mechanisms for Gemini 2.0 Flash?
  • Is there an ongoing issue or degradation with the service?

If you need additional logs, project details, request IDs, or configuration information, please let us know; we can provide them immediately.

This issue is blocking business operations, so an urgent review and fix would be greatly appreciated.

Labels

  • status: awaiting user response
  • type: bug (error or flaw in code with unintended results or allowing sub-optimal usage patterns)
