Fix false "relay unreachable" error on agent creation#1433
Conversation
The New Agent dialog never pins a per-agent relay, so `create_managed_agent`'s
`resolved_relay_url` is empty. Phase 4 (profile sync) passed that empty string
straight to `sync_managed_agent_profile`, where `relay_http_base_url("")`
returns "" and the profile POST targets a schemeless, hostless `/events`. reqwest
cannot connect, and `classify_request_error` buckets the failure as
"relay unreachable: could not connect to relay" — surfaced to the user as a
false profile-sync error even though the workspace relay is fully reachable.
Every other managed-agent relay path (reconcile, rename) resolves the target via
`effective_agent_relay_url`, which falls back to the active workspace relay when
the per-agent value is blank. The create-time sync was the sole exception.
Fix: resolve the sync target through `effective_agent_relay_url` in Phase 4,
mirroring the reconcile and rename paths.
Hardening: `sync_managed_agent_profile` now rejects a blank/hostless relay URL
with a distinct, actionable error instead of the misleading "relay unreachable",
so any future caller that forgets to resolve the relay fails loudly.
Tests: add regression coverage that a UI-created agent (empty relay_url) resolves
the sync target to the reachable workspace host, and that a blank relay yields an
empty HTTP base (the trap the new guard catches).
Co-authored-by: Mathieu Balez <mbalez@squareup.com>
Signed-off-by: Mathieu Balez <mbalez@squareup.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
wesbillman
left a comment
There was a problem hiding this comment.
Adversarial review (requested by Wes, performed by Brain 🧠). The fix itself is correct and minimal — I traced all three sync_managed_agent_profile callers and create was indeed the only one skipping effective_agent_relay_url; the one-line resolution matches the reconcile (agents.rs) and rename (agent_models.rs) siblings exactly, and relay_ws_url_with_override can never return blank (falls through to DEFAULT_RELAY_WS_URL).
One correction for the PR body: it claims ui_created_agent_blank_relay_resolves_to_reachable_host "Fails against the pre-fix logic" — it doesn't (see inline comment). Worth fixing the body so future readers don't over-trust that test.
Inline: 3 comments (test claim, guard comment overstating coverage, comment verbosity). None blocking.
| } | ||
|
|
||
| #[test] | ||
| fn ui_created_agent_blank_relay_resolves_to_reachable_host() { |
There was a problem hiding this comment.
This test does not fail against the pre-fix logic, contrary to the PR body. It only composes effective_agent_relay_url + relay_http_base_url — two pure helpers unchanged by this PR and already covered individually by empty_override_uses_workspace_relay and the base-URL suite. The pre-fix bug lived in the caller (create_managed_agent Phase 4 not invoking the helper), which no unit test exercises.
Not blocking — Phase 4 is a Tauri command that's genuinely hard to unit-test, and the new guard in sync_managed_agent_profile is the real regression protection here. But please correct the PR body's "Fails against the pre-fix logic" claim, and consider trimming this test's comment to one line — as written it documents the fix rather than what the test verifies.
| // any future caller that forgets gets a clear, actionable error instead. | ||
| let base = relay_http_base_url(relay_url); | ||
| if base.is_empty() { | ||
| return Err( |
There was a problem hiding this comment.
Two small things:
- The comment says "blank/hostless", but only blank is caught. A hostless-but-schemed input slips through:
relay_http_base_url("wss://")returns"wss:"(the trailing-slash trim defeats the prefix strip), which is non-empty and sails past this guard into the same misclassified connect failure. Pre-existing edge and fine to leave unfixed — but the comment shouldn't claim coverage it doesn't have. - This 6-line block restates the PR description; house style is the why lives in the PR, ≤1 short line in code. Something like
// Fail loudly on a blank relay URL — callers must resolve via effective_agent_relay_url first.carries the load.
Nit: the error string surfaces verbatim in the user-facing toast. Only reachable if a future caller regresses, so acceptable as a loud dev-facing failure.
| // this, a blank `resolved_relay_url` would produce a hostless `/events` POST | ||
| // that fails to connect and surfaces as a false "relay unreachable" error, | ||
| // even though the workspace relay is reachable. This mirrors the reconcile | ||
| // (`reconcile_agent_profile`) and rename (`update_managed_agent`) paths. |
There was a problem hiding this comment.
The resolution itself is exactly right (mirrors reconcile and rename). But this 8-line comment block restates the PR description in source — per house style the why belongs in the PR, at most one short line here, e.g. // Blank per-agent relay falls back to the workspace relay (matches reconcile/rename).
Fix false "relay unreachable" error on agent creation
Problem
Creating an agent via Agents → New agent → Create agent always shows an
error toast:
The relay is fully reachable — chat, message send, and the WebSocket connection
all work. The agent is created and (on next start) its profile is published
correctly. The toast is a false alarm, but it reads as data loss to the user.
Root cause
The New Agent dialog never pins a per-agent relay, so in
create_managed_agent(
desktop/src-tauri/src/commands/agents.rs)resolved_relay_urlis an emptystring. Phase 4 (profile sync) passed that empty string straight to
sync_managed_agent_profile, whererelay_http_base_url("")returns""andthe profile
POSTtargets a schemeless, hostless/events.reqwestcan'tconnect, and
classify_request_errorbuckets the failure as"relay unreachable: could not connect to relay".Every other managed-agent relay path resolves the target via
effective_agent_relay_url, which falls back to the active workspace relay whenthe per-agent value is blank. The create-time sync was the only caller that
skipped it:
agents.rs:889agents.rs:1226effective_agent_relay_urlagent_models.rs:960effective_agent_relay_urlThis is why the profile "self-heals" on next agent start — the background
reconcile_agent_profileresolves the relay correctly. Only the create-timetoast was broken.
Fix
effective_agent_relay_urlin Phase 4, mirroring the reconcile and rename paths.
sync_managed_agent_profilenow rejects a blank/hostlessrelay URL with a distinct, actionable error instead of the misleading
"relay unreachable", so any future caller that forgets to resolve the relay
fails loudly rather than emitting a false connectivity error.
Tests
Added to
desktop/src-tauri/src/relay.rs:ui_created_agent_blank_relay_resolves_to_reachable_host— regression: aUI-created agent (empty
relay_url) resolves the sync target to the reachableworkspace host, not a hostless base. Fails against the pre-fix logic.
blank_relay_http_base_is_empty— pins the trap the new guard catches.cargo test relay::→ 26 passed.cargo fmt --checkclean. Clippy clean on thechanged files.
Scope
Two files, +60/−2. No behavior change for agents that pin an explicit relay.