fix(mcp): refresh tools on list_changed notification#5604
Open
shizhigu wants to merge 4 commits intolivekit:mainfrom
Open
fix(mcp): refresh tools on list_changed notification#5604shizhigu wants to merge 4 commits intolivekit:mainfrom
shizhigu wants to merge 4 commits intolivekit:mainfrom
Conversation
Robustness pass on the tool list_changed handling added in 93315ac: - MCPToolset reload now retries with exponential backoff (1s/2s/4s, 4 attempts total) before giving up, instead of failing on the first transient error and stranding the agent on a stale tool list. - On terminal failure, MCPServer fires a new _reload_failed callback that AgentActivity bridges to AgentSession's "error" event so operators can react. Previous behavior only logged. - _handle_message now passes Exception-typed messages through anyio.lowlevel.checkpoint() (matching the MCP SDK default handler) so cooperative cancellation propagates through bursts of notifications. - Extract _publish_tools_change helper shared between the public update_tools API and the background refresh task. Eliminates ~80% duplication and standardizes the fail-safe ordering (rt_session first, chat_ctx insert last). - Add a dedicated _refresh_lock so concurrent calls to update_tools and the toolset refresh task serialize their writes to _chat_ctx and rt_session.update_tools. Cannot reuse the main _lock because it is held across long operations like _close_session and would deadlock the cancel_and_wait of the refresh task. - Drop the per-toolset closure in _sync_toolset_change_subscriptions; the bound method self._request_toolset_refresh has the right signature. Tests: 5 new cases in tests/test_mcp.py cover retry success, retry exhaustion + callback fire, AgentSession error emission, refresh-lock serialization, and the cancellation pass-through. Existing test for "keeps existing tools when reload fails" updated for the new 4-attempt retry budget. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
10e0236 to
233c214
Compare
Second-pass review hardening on top of e4024b1: - ErrorEvent emitted for a terminal MCP reload now carries source=<failing MCPServer>, not source=self._agent. Operators with multiple MCP servers per agent can now identify which server went stale. Bridge uses functools.partial to bind the server identity at registration time, so MCPServer's callback signature is unchanged. - _publish_tools_change re-checks _closed after acquiring _refresh_lock. Without this, a publish queued behind the lock could be awoken by a cancelled refresh task and write to an rt_session that _close_session is about to tear down. - _handle_message hoists anyio.lowlevel.checkpoint() to fire for every message (matches the MCP SDK default handler). Previously only the Exception branch yielded, starving the receive loop of cancellation points during notification bursts. - _request_tools_reload bails early if the toolset has already unsubscribed from the server's callback set. Closes a snapshot- iteration race in MCPServer._handle_tool_list_changed where a callback fires after aclose() removed it. Tests: 4 new cases (source-is-server, skip-emit-after-close, skip- publish-after-close, checkpoint-on-notification, request-reload- after-unsubscribe) plus a snapshot-equality assertion on the existing lock-serialization test. 23/23 in tests/test_mcp.py pass; no regressions in tests/test_tools.py (37) or tests/test_agent_session.py (20). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #5378.
Why
MCP servers can advertise new or removed tools at runtime via
notifications/tools/list_changed. TodayMCPServerignores that notification, so an agent keeps using the stale tool list for the rest of the session — calls to removed tools fail and newly added tools are invisible until restart.What changed
MCPServerlistens fornotifications/tools/list_changed, invalidates its cachedlist_tools()result, and fans out to subscribed toolsets.MCPServer.list_tools()keeps the cache dirty if a notification arrives while a fetch is in flight, so direct callers refetch instead of treating an older snapshot as clean.MCPToolsetreloads tools in a background task: coalesces duplicate notifications, retries up to 4 attempts with 1s/2s/4s exponential backoff, preserves userfilter_toolspredicates across reloads, and unregisters on close.AgentActivitybridges toAgentSession.emit("error", ...), so operators can detect that the agent's tool view has gone stale.AgentActivitysubscribes toToolsetchanges, refreshes the realtime session and chat context under_refresh_lock, and inserts anAgentConfigUpdatedescribing the diff. The publicupdate_tools()path still records a no-opAgentConfigUpdatewhen the requested tool set is unchanged._close_sessioncancels refresh tasks and takes_refresh_lockbefore closing realtime resources, so an in-flight tool publish cannot race with realtime close._publish_tools_change, used by bothupdate_tools()and background refresh._handle_messagepassesException-typed messages throughanyio.lowlevel.checkpoint(), matching the MCP SDK default handler so cooperative cancellation propagates.No public API changes; new callbacks are internal plumbing.
Test plan
Automated
uv run --extra mcp pytest tests/test_mcp.py19 passeduv run ruff check livekit-agents/livekit/agents/voice/agent_activity.py livekit-agents/livekit/agents/llm/mcp.py tests/test_mcp.pymake checkuv run --extra mcp pytest tests/test_tools.py37 passedmake unit-tests626 passed, 2 skipped;tests/test_room.pysetup failed locally because this machine does not have thelivekit-serverbinary on PATH.New coverage in
tests/test_mcp.pyincludes:list_tools()does not get missedMCPServer.list_tools()keeps the cache dirtyupdate_tools()still records an auditAgentConfigUpdateupdate_tools()is a no-opException-typed MCP messages preserve cancellation behaviorEnd-to-end smoke against a real subprocess MCP server
Wrote a minimal MCP stdio server (using
mcp.server.lowlevel.Server+ServerSession.send_tool_list_changed) that mutates its tool list 1s after start, then droveMCPServerStdio+MCPToolsetagainst it in three modes (_RELOAD_RETRY_DELAYSpatched to 0/0.1/0.1 for speed):Notes
tests/test_mcp.pyis added to theunit-testsmake target so it runs in CI.AgentSession's error event._refresh_lockis separate fromself._lockbecause_close_sessionholdsself._lockwhile cancelling refresh tasks; using one lock for both would deadlock.ErrorEventis tagged withsource=<MCPServer>(not the agent) so listeners with multiple MCP servers can identify which one went stale._publish_tools_changere-checks_closedafter acquiring_refresh_lockso a publish queued behind the lock cannot wake into a closing realtime session._request_tools_reloadbails if the toolset has already unsubscribed, closing a snapshot-iteration race inMCPServer._handle_tool_list_changed.anyio.lowlevel.checkpoint()runs at the top of_handle_messagefor every message (not justException-typed), matching the MCP SDK default handler so notification bursts don't starve the receive loop of cancellation points.