fix(live): yield tool_call events immediately to prevent Gemini 3.1 d… #5359

Open
imazizongit wants to merge 1 commit into google:main from imazizongit:fix-3.1-tool-call-deadlock
…eadlock

GeminiLlmConnection.receive() accumulated tool_call messages in a tool_call_parts buffer and only yielded them when server_content.turn_complete arrived. This pattern deadlocks run_live() on gemini-3.1-flash-live-preview, which does NOT send turn_complete until AFTER it receives the tool response:

ADK waits for turn_complete → 3.1 waits for tool_response →
tool never dispatched → WebSocket times out (1000 close) ~15s later

Before 1.28, tool_call messages were yielded as they arrived; the accumulation pattern was introduced recently and broke 3.1 Live compatibility.
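
The circular wait described above can be reproduced in miniature. The following is only a toy model built on `asyncio` queues, not ADK or SDK code: `fake_model`, `buffering_client`, and the string messages are hypothetical stand-ins, and the short timeout plays the role of the WebSocket timeout.

```python
import asyncio


async def fake_model(to_client: asyncio.Queue, from_client: asyncio.Queue) -> None:
    # Hypothetical stand-in for a 3.1-style server: it emits a tool_call and
    # holds back turn_complete until a tool response arrives.
    await to_client.put("tool_call")
    await from_client.get()  # blocks until the client sends a tool response
    await to_client.put("turn_complete")


async def buffering_client(to_client: asyncio.Queue, from_client: asyncio.Queue) -> str:
    # Mirrors the accumulation pattern: buffer tool calls and only act on
    # them once turn_complete is seen.
    buffered = []
    while True:
        msg = await to_client.get()
        if msg == "tool_call":
            buffered.append(msg)
        elif msg == "turn_complete":
            # Never reached: the server is waiting for this response
            # before it will send turn_complete.
            await from_client.put("tool_response")
            return "dispatched"


async def run(timeout: float = 0.2) -> str:
    to_client: asyncio.Queue = asyncio.Queue()
    from_client: asyncio.Queue = asyncio.Queue()
    server = asyncio.create_task(fake_model(to_client, from_client))
    try:
        return await asyncio.wait_for(
            buffering_client(to_client, from_client), timeout
        )
    except asyncio.TimeoutError:
        return "timed out"
    finally:
        server.cancel()


print(asyncio.run(run()))  # prints "timed out"
```

Each side is blocked on a message that only the other side can send, so the only way out in this model is the timeout.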

Fix: yield tool_call messages as LlmResponse events immediately, one per incoming message. Gemini 2.5 and other models are unaffected, because the flow layer handles yielded function_call content identically regardless of whether it arrives before or after turn_complete.
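
As a rough illustration of the "one response per incoming message" shape, here is a minimal synchronous sketch. `Message`, `Response`, and `receive_immediately` are simplified, hypothetical stand-ins and do not reflect the real `GeminiLlmConnection.receive()` signature or the real `LlmResponse` fields.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class Message:
    # Hypothetical simplified server message, not the real SDK type.
    tool_calls: List[str] = field(default_factory=list)
    turn_complete: bool = False


@dataclass
class Response:
    # Hypothetical stand-in for LlmResponse.
    function_calls: List[str] = field(default_factory=list)
    turn_complete: bool = False


def receive_immediately(messages: Iterable[Message]) -> Iterator[Response]:
    for msg in messages:
        if msg.tool_calls:
            # Yield one response per incoming tool_call message, without
            # waiting for turn_complete.
            yield Response(function_calls=list(msg.tool_calls))
        if msg.turn_complete:
            yield Response(turn_complete=True)


# Two tool_call messages and no turn_complete at all.
stream = [Message(tool_calls=["get_weather"]), Message(tool_calls=["get_time"])]
responses = list(receive_immediately(stream))
print([r.function_calls for r in responses])  # [['get_weather'], ['get_time']]
```

Because nothing is held in a buffer, the caller can dispatch each tool as soon as its message arrives, whether or not the model ever sends turn_complete first.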

Confirmed in production: after this fix, audio returns ~1150ms after tool response on 3.1 (matching the direct SDK timing). Without the fix, every tool call hangs until the WebSocket times out.

Tests updated:

  • test_receive_multiple_tool_call_messages_yielded_immediately (renamed from ...buffered_until_turn_complete) — asserts each tool_call yields its own LlmResponse
  • test_receive_tool_call_and_grounding_metadata_with_native_audio — updated expected response ordering (tool_call now yields before the subsequent audio/grounding message instead of after it)
  • test_receive_tool_call_yielded_without_turn_complete (new) — regression test for the 3.1 deadlock: asserts tool_call yields even when no turn_complete is ever sent
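
For readers unfamiliar with the regression scenario, the check amounts to something like the sketch below. It is not the test from this PR: `fake_session`, `receive`, and the dict-shaped messages are hypothetical, and the real test exercises `GeminiLlmConnection` with mocks rather than this toy loop.

```python
import asyncio
from typing import AsyncIterator, Dict, List, Optional


async def fake_session() -> AsyncIterator[Dict]:
    # Hypothetical message stream: one tool_call, then silence.
    yield {"tool_call": ["lookup_order"]}
    # The server now waits for a tool response and never sends turn_complete.
    await asyncio.Event().wait()


async def receive(session: AsyncIterator[Dict]) -> AsyncIterator[Dict]:
    # Simplified stand-in for the fixed receive loop: yield on arrival.
    async for msg in session:
        if "tool_call" in msg:
            yield {"function_calls": msg["tool_call"]}


async def first_item(stream: AsyncIterator[Dict]) -> Optional[Dict]:
    # Return the first yielded item, or None if the stream ends first.
    async for item in stream:
        return item
    return None


async def check_tool_call_yielded_without_turn_complete() -> List[str]:
    stream = receive(fake_session())
    # If tool calls were buffered until turn_complete, this would time out.
    first = await asyncio.wait_for(first_item(stream), timeout=1.0)
    await stream.aclose()
    assert first is not None
    return first["function_calls"]


result = asyncio.run(check_tool_call_yielded_without_turn_complete())
assert result == ["lookup_order"]
```

The timeout is what turns a hang into a test failure: a buffering implementation would never produce the first item in this scenario.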

adk-bot added the live label ([Component] This issue is related to live, voice and video chat) on Apr 16, 2026