relevance_trigger_agent_sync: MCP transport drops mid-poll, surfaces as misleading "user rejected the tool use" error in Claude Code

## Summary

When calling `relevance_trigger_agent_sync` over the MCP bridge from Claude Code, the call consistently fails and Claude Code reports `"The user doesn't want to proceed with this tool use. The tool use was rejected..."`, even though the user never rejected anything. The platform task actually runs and completes successfully; the MCP response just never makes it back to the caller.

Hit this 3 times in a single session during legitimate agent smoke-tests.

## Environment

- Claude Code (desktop app, macOS, Darwin 25.4.0)
- cc-plugin MCP server name: `relevance-ai-scotpac` (OAuth, `https://mcp.relevanceai.com`)
- Region: `f1db6c`
- Model: `claude-opus-4-6` on the Claude Code side; agent runs `claude-sonnet-4-6` on Relevance
- Tool: `mcp__relevance-ai-scotpac__relevance_trigger_agent_sync`

## Repro

1. Authenticate the `relevance-ai-scotpac` MCP (`/mcp`)
2. From Claude Code, call `mcp__relevance-ai-scotpac__relevance_trigger_agent_sync` with any `agent_id` and `message` that will take 30-90s to complete (an agent with ~5 tools, thinking, and knowledge search is plenty)
3. Observe: Claude Code immediately returns a tool-use-rejected error
4. Observe: `mcp__relevance-ai-scotpac__relevance_list_agent_tasks` shortly afterwards shows the task as `completed` on the platform, with a valid agent response

## Expected behaviour

`trigger_agent_sync` should do one of the following:
- Returns the agent's final response when the task completes within the 120s window, OR
- Returns `timed_out` status cleanly if the task exceeds 120s, OR
- Returns a clear transport/auth error if the MCP connection drops mid-call
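The three outcomes above can be sketched as a single result type. This is a minimal illustration, not the actual MCP server code: `SyncResult`, `trigger_sync`, and the `poll_once` callable are hypothetical names, and a dropped transport is assumed to surface as Python's built-in `ConnectionError`.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SyncResult:
    # One of "completed", "timed_out", "transport_error" (hypothetical labels).
    status: str
    response: Optional[str] = None
    error: Optional[str] = None


def trigger_sync(
    poll_once: Callable[[], Optional[str]],  # hypothetical: final response, or None if still running
    window_s: float = 120.0,                 # the 120s window described in this issue
    interval_s: float = 2.0,                 # assumed poll interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> SyncResult:
    """Poll until the task finishes, the window elapses, or the transport fails."""
    deadline = clock() + window_s
    while True:
        try:
            response = poll_once()
        except ConnectionError as exc:
            # Assumption: a dropped transport raises ConnectionError.
            # It is reported as a distinct outcome, never as a user rejection.
            return SyncResult("transport_error", error=f"MCP transport dropped: {exc}")
        if response is not None:
            return SyncResult("completed", response=response)
        if clock() >= deadline:
            return SyncResult("timed_out")
        sleep(interval_s)
```

The point of the sketch is that each failure mode maps to its own status, so a caller can tell a slow task from a broken connection.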

## Actual behaviour

Claude Code surfaces `"The user doesn't want to proceed with this tool use. The tool use was rejected (eg. if it was a file edit, the new_string was NOT written to the file). STOP what you are doing and wait for the user to tell you how to proceed."` — a generic permission-rejection error, not reflective of what actually happened.

Running `/mcp` afterwards shows the `relevance-ai-scotpac` MCP server as disconnected, confirming the transport dropped during the long poll.

## Suspected root cause

`trigger_agent_sync` holds the MCP connection open for up to 120 seconds while polling for task completion. During that window, any of the following could break the call:

- The OAuth token may expire (observed once after a `/login` flow earlier in the session, when the token had clearly rotated)
- The MCP HTTP bridge may hit an idle timeout (likely)
- A WebSocket / SSE connection drop may not be surfaced gracefully as a tool result

Claude Code sees the transport close mid-call and defaults to its generic rejection error, which makes diagnosis needlessly hard. I initially diagnosed this as the user dismissing a permission prompt, which was wrong.

## Workaround (what I'm doing instead)

Using the async pattern everywhere:

```
relevance_trigger_agent         // returns conversation_id in <5s
relevance_list_agent_tasks      // poll separately to find task_id
relevance_get_agent_task_summary  // retrieve final response once completed
```

This keeps each MCP call short (seconds, not minutes), so the transport drop is never in play. Every task I've run via the async pattern has returned cleanly.
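For illustration, the trigger-then-poll flow above could be wrapped roughly as follows. This is a sketch under assumptions: the three callables are hypothetical stand-ins for the `relevance_trigger_agent`, `relevance_list_agent_tasks`, and `relevance_get_agent_task_summary` tools, and the task dict shape (`task_id` and `status` keys) is assumed rather than taken from the real tool output.

```python
import time
from typing import Callable, Optional


def run_agent_async(
    trigger: Callable[[str, str], str],          # stand-in for relevance_trigger_agent; returns conversation_id
    find_task: Callable[[str], Optional[dict]],  # stand-in for a relevance_list_agent_tasks lookup
    get_summary: Callable[[str], str],           # stand-in for relevance_get_agent_task_summary
    agent_id: str,
    message: str,
    max_wait_s: float = 300.0,                   # assumed overall budget
    interval_s: float = 5.0,                     # assumed poll interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Trigger the agent, then poll with short calls until the task completes."""
    conversation_id = trigger(agent_id, message)
    deadline = clock() + max_wait_s
    while clock() < deadline:
        # Each poll is its own short call, so no connection is held open for minutes.
        task = find_task(conversation_id)
        if task is not None and task.get("status") == "completed":
            return get_summary(task["task_id"])
        sleep(interval_s)
    raise TimeoutError(f"task for conversation {conversation_id} did not complete in {max_wait_s}s")
```

Because the waiting happens between calls rather than inside one, a transport drop at worst costs a single poll, which can simply be retried.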

## Requests

Prioritised:

1. **Document the limitation** in `plugins/relevance-ai/skills/managing-relevance-agents/running.md` — add a note near the existing 120s timeout row in the Troubleshooting table warning that `trigger_agent_sync` over MCP can surface as a generic Claude Code rejection if the transport drops, and recommend the async pattern for tasks expected to run >30s. This is the cheapest win.

2. **Surface a clearer error from the MCP layer** when the transport drops during `_sync` — return a structured error like `transport_dropped_during_poll` (or similar) instead of letting Claude Code fall back to its generic rejection message. Even a string prefix like `"MCP transport dropped:"` in the error message would save a lot of debugging.

3. **Consider deprecating `trigger_agent_sync` for the MCP surface** and pushing async-then-poll as the canonical pattern in docs. The sync tool's value (one call, one response) is real but the failure mode as it stands is bad UX — the first time this happens, it looks like a user-permission problem, not a transport problem.

## Additional context

- Reported from a ScotPac SE build session where this blocked 3 smoke-test attempts over ~20 minutes before I figured out what was actually happening.
- Happy to attach trace evidence from `list_agent_tasks` showing the tasks completed while Claude Code was reporting rejection, if useful.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
