relevance_trigger_agent_sync: MCP transport drops mid-poll, surfaces as misleading "user rejected the tool use" error in Claude Code #10

@charliebirch

Summary

When calling relevance_trigger_agent_sync over the MCP bridge from Claude Code, the call consistently fails with Claude Code reporting "The user doesn't want to proceed with this tool use. The tool use was rejected..." — but the user never rejected anything. The platform task actually runs and completes successfully; the MCP response just never makes it back to the caller.

Hit this 3 times in a single session during legitimate agent smoke-tests.

Environment

  • Claude Code (desktop app, macOS, Darwin 25.4.0)
  • cc-plugin MCP server name: relevance-ai-scotpac (OAuth, https://mcp.relevanceai.com)
  • Region: f1db6c
  • Model: claude-opus-4-6 on the Claude Code side; agent runs claude-sonnet-4-6 on Relevance
  • Tool: mcp__relevance-ai-scotpac__relevance_trigger_agent_sync

Repro

  1. Authenticate the relevance-ai-scotpac MCP (/mcp)
  2. From Claude Code, call mcp__relevance-ai-scotpac__relevance_trigger_agent_sync with any agent_id + message that will take 30-90s to complete (agent with ~5 tools + thinking + knowledge search is plenty)
  3. Observe: Claude Code immediately returns a tool-use-rejected error
  4. Observe: mcp__relevance-ai-scotpac__relevance_list_agent_tasks shortly afterwards shows the task as completed on the platform, with a valid agent response

Expected behaviour

trigger_agent_sync either:

  • Returns the agent's final response when the task completes within the 120s window, OR
  • Returns timed_out status cleanly if the task exceeds 120s, OR
  • Returns a clear transport/auth error if the MCP connection drops mid-call

Actual behaviour

Claude Code surfaces "The user doesn't want to proceed with this tool use. The tool use was rejected (eg. if it was a file edit, the new_string was NOT written to the file). STOP what you are doing and wait for the user to tell you how to proceed." — a generic permission-rejection error, not reflective of what actually happened.

Running /mcp afterwards shows the relevance-ai-scotpac MCP server as disconnected, confirming the transport dropped during the long poll.

Suspected root cause

trigger_agent_sync holds the MCP connection open for up to 120 seconds while polling for task completion. During that window:

  • The OAuth token may expire (observed once after a /login flow earlier in the session; the token had clearly rotated)
  • The MCP HTTP bridge may hit an idle timeout (likely)
  • A WebSocket / SSE connection drop may not be surfaced gracefully as a tool result

Claude Code sees the transport close mid-call and falls back to its generic rejection error, which makes diagnosis needlessly hard. I initially attributed this to the user dismissing a permission prompt, which was wrong.
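The token-expiry hypothesis is the one suspect that could be ruled in or out before making the call. A minimal sketch, assuming the client can read the token's expiry time (not confirmed anywhere in this issue); `token_outlives_poll` and the 30s margin are hypothetical, and only the 120s window comes from the tool's documented behaviour:

```python
from datetime import datetime, timedelta

# From this issue: trigger_agent_sync polls for up to 120 seconds.
POLL_WINDOW = timedelta(seconds=120)


def token_outlives_poll(
    expires_at: datetime,
    now: datetime,
    margin: timedelta = timedelta(seconds=30),  # hypothetical safety margin
) -> bool:
    """Return True if the token stays valid for the whole poll window plus margin."""
    return expires_at - now >= POLL_WINDOW + margin
```

If a check like this returned False, refreshing the token before calling the sync tool would take expiry out of the picture and leave the idle-timeout and connection-drop hypotheses to investigate.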

Workaround (what I'm doing instead)

Using the async pattern everywhere:

relevance_trigger_agent           // returns conversation_id in <5s
relevance_list_agent_tasks        // poll separately to find task_id
relevance_get_agent_task_summary  // retrieve final response once completed

This keeps each MCP call short (seconds, not minutes), so the transport drop is never in play. Every task I've run via the async pattern has returned cleanly.
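For anyone scripting the same workaround outside Claude Code, the three-step pattern might look roughly like the sketch below. The tool names come from this issue; everything else is an assumption made for illustration: the `call_tool` callable, the argument names, and the `tasks`, `conversation_id`, `task_id` and `status` response fields.

```python
import time
from typing import Any, Callable, Dict, Optional

# Hypothetical: call_tool(tool_name, arguments) -> decoded tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_agent_async(
    call_tool: ToolCaller,
    agent_id: str,
    message: str,
    poll_interval_s: float = 5.0,
    deadline_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[Dict[str, Any]]:
    """Trigger an agent, then poll with short calls until its task completes.

    Returns the task summary, or None if the deadline passes first.
    """
    # Step 1: short call that returns a conversation_id.
    started = call_tool(
        "relevance_trigger_agent", {"agent_id": agent_id, "message": message}
    )
    conversation_id = started["conversation_id"]

    give_up_at = clock() + deadline_s
    while clock() < give_up_at:
        # Step 2: list tasks and look for the one tied to our conversation.
        # The response field names used here are assumptions.
        listing = call_tool("relevance_list_agent_tasks", {"agent_id": agent_id})
        for task in listing.get("tasks", []):
            if (
                task.get("conversation_id") == conversation_id
                and task.get("status") == "completed"
            ):
                # Step 3: fetch the final response for the completed task.
                return call_tool(
                    "relevance_get_agent_task_summary",
                    {"agent_id": agent_id, "task_id": task["task_id"]},
                )
        sleep(poll_interval_s)
    return None
```

Each individual call stays short, so a dropped connection costs one retryable poll rather than the whole result.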

Requests

Prioritised:

  1. Document the limitation in plugins/relevance-ai/skills/managing-relevance-agents/running.md — add a note near the existing 120s timeout row in the Troubleshooting table warning that trigger_agent_sync over MCP can surface as a generic Claude Code rejection if the transport drops, and recommend the async pattern for tasks expected to run >30s. This is the cheapest win.

  2. Surface a clearer error from the MCP layer when the transport drops during _sync — return a structured error like transport_dropped_during_poll (or similar) instead of letting Claude Code fall back to its generic rejection message. Even a string prefix like "MCP transport dropped:" in the error message would save a lot of debugging.

  3. Consider deprecating trigger_agent_sync for the MCP surface and pushing async-then-poll as the canonical pattern in docs. The sync tool's value (one call, one response) is real but the failure mode as it stands is bad UX — the first time this happens, it looks like a user-permission problem, not a transport problem.
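To make request 2 concrete, here is a rough sketch of a poll loop that always ends in one of the three outcomes listed under Expected behaviour. It is not the actual MCP server code: `SyncResult`, `poll_until_done` and `fetch_response` are hypothetical names, and whether such a wrapper belongs in the server, the bridge, or the client is an open question.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SyncResult:
    # One of: "completed", "timed_out", "transport_dropped_during_poll".
    status: str
    response: Optional[str] = None
    error: Optional[str] = None


def poll_until_done(
    fetch_response: Callable[[], Optional[str]],  # hypothetical: None while running
    window_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> SyncResult:
    """Poll until the task finishes, the window elapses, or the transport fails."""
    deadline = clock() + window_s
    while True:
        try:
            response = fetch_response()
        except (ConnectionError, TimeoutError) as exc:
            # Report the drop as a structured result with a recognisable prefix
            # instead of letting it bubble up as an unrelated failure.
            return SyncResult(
                status="transport_dropped_during_poll",
                error=f"MCP transport dropped: {exc}",
            )
        if response is not None:
            return SyncResult(status="completed", response=response)
        if clock() >= deadline:
            return SyncResult(status="timed_out")
        sleep(interval_s)
```

With a result shaped like this, a caller can branch on `status` and never has to guess whether a failure was a permission rejection or a dropped connection.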

Additional context

  • Reported from a ScotPac SE build session where this blocked 3 smoke-test attempts over ~20 minutes before I figured out what was actually happening.
  • Happy to attach trace evidence from list_agent_tasks showing the tasks completed while Claude Code was reporting rejection, if useful.

🤖 Generated with Claude Code
