Skip to content

fix(shell): DEV baseUrl must target the sidecar on :4141, not :4142#119

Open
stuffbucket wants to merge 5 commits into
mainfrom
fix/shell-dev-baseurl-4141
Open

fix(shell): DEV baseUrl must target the sidecar on :4141, not :4142#119
stuffbucket wants to merge 5 commits into
mainfrom
fix/shell-dev-baseurl-4141

Conversation

@stuffbucket

Copy link
Copy Markdown
Owner

Regression from #114. The sidecar binds :4141 (SIDECAR_PORT in shell/src-tauri/src/lib.rs) in both app:dev and production. #114 changed the DEV baseUrl to :4142 — trusting CLAUDE.md, which WRONGLY claimed the sidecar runs on :4142 in four places. In app:dev the webview is the Vite dev server (import.meta.env.DEV), so every settings/auth/gh/accounts call went to a dead :4142: sign-in failed, the gh-reuse list and account roster were empty, and the console filled with connection-refused errors. (Production was unaffected — it uses a relative path.)

Default the DEV branch back to :4141 (the canonical port app:dev serves on), keeping VITE_API_BASE as the override for an app:ui proxy elsewhere. Also corrects the four :4142 lies in CLAUDE.md to :4141, citing SIDECAR_PORT.

Regression from #114. The sidecar binds :4141 (SIDECAR_PORT in
shell/src-tauri/src/lib.rs) in both app:dev and production. #114 changed the
DEV baseUrl to :4142 — trusting CLAUDE.md, which WRONGLY claimed the sidecar
runs on :4142 in four places. In `app:dev` the webview is the Vite dev server
(import.meta.env.DEV), so every settings/auth/gh/accounts call went to a dead
:4142: sign-in failed, the gh-reuse list and account roster were empty, and
the console filled with connection-refused errors. (Production was unaffected
— it uses a relative path.)

Default the DEV branch back to :4141 (the canonical port app:dev serves on),
keeping VITE_API_BASE as the override for an app:ui proxy elsewhere. Also
corrects the four :4142 lies in CLAUDE.md to :4141, citing SIDECAR_PORT.
…dings)

Follow-on to the :4141 baseUrl fix, from live app:dev testing where sign-in
appeared to do nothing:
- safeInvoke now returns whether the IPC fired. rebootAndAwaitAuth fails fast
  and VISIBLY when restart_sidecar does not fire (invoke rejected/unavailable)
  instead of polling a never-changing status for 20s then surfacing an
  off-screen message. This is what made the writes look like silent failures:
  POST /gh/use + /accounts/switch returned 200 (token written), but the reboot
  never happened and the error landed below the fold.
- Error messages (gh-reuse + roster) scrollIntoView so they are not stranded
  off-screen at the bottom of a tall account section.
- The gh-reuse list now excludes accounts already remembered in the registry,
  so an account no longer appears in BOTH "Switch to a remembered account" and
  "Reuse a GitHub CLI account" (the confusing double-listing).

Does NOT yet explain WHY restart_sidecar failed to fire in that app:dev run —
the swallowed reason goes to the webview console; the immediate visible error
now exposes it on the next attempt.
…he webview

THE root cause of "sign-in does nothing". Tauri 2 does not auto-permission an
app's own commands; the Settings/Dashboard webviews load from loopback origins
(:4141 in production, :1420 in dev) which Tauri classifies as "remote", so a
command is rejected unless it's named in permissions/default.toml. Every other
webview-invoked command (open_settings_at, reveal_*, get_shell_api_key,
subscribe_token_usage) is listed there — but restart_sidecar was not.

So invoke("restart_sidecar") was denied by the ACL for the entire reboot-based
flow — gh-reuse sign-in, account switch, AND sign-out. The token got written
(POST /gh/use 200) but the proxy never rebooted into it, so the UI sat
unauthenticated. This was never caught because the reboot path was "trusted,
not GUI-smoked" (WKWebView has no WebDriver). It affected production, not just
app:dev.

Add allow-restart-sidecar to the default permission set + its [[permission]]
block. Verified the ACL regenerates with it included.
Once the ACL fix let restart_sidecar actually fire, a second bug surfaced:
restarting the proxy quit the whole app. kill_sidecar SIGTERMs the sidecar,
which exits CLEANLY (code 0); the Terminated handler reads any clean exit as
"user wants the whole app down" (true for Quit / external /_internal/shutdown)
and calls handle.exit(0) — so the deliberate account-switch/sign-in/sign-out
reboot took the tray + windows down with it before the respawn could land.

Add a SidecarRestarting flag: respawn_sidecar sets it before SIGTERM, and the
Terminated handler consumes it and keeps the app alive (the replacement is
already being spawned) instead of exiting. A genuine Quit / external shutdown
leaves the flag unset and still exits as before. cargo check passes.
THE reason the proxy stayed dead after a restart (UI shows "Load failed"
everywhere because nothing is listening on :4141, while the app window is
still up). kill_sidecar SIGTERMs the old sidecar, puts it BACK in the shared
Sidecar slot, and spawns a 3s thread that takes whatever is in the slot and
SIGKILLs it. But respawn_sidecar calls spawn_sidecar immediately after, which
set()s the REPLACEMENT child into that same slot — so ~3s after a restart the
stale escalation thread SIGKILLs the freshly-respawned sidecar. New proxy
comes up, dies 3s later, every Settings call then fails to load.

Fix: move the old child INTO the escalation thread instead of the shared slot,
so it SIGKILLs only that specific child and leaves the slot free for the
respawn. Dropping a CommandChild does not kill its process, so the slot can
sit empty until spawn_sidecar fills it.

Completes the reboot chain (with the :4141 baseUrl, the restart_sidecar ACL
permission, and the survive-the-restart flag): switch / gh-reuse sign-in /
sign-out now actually reboot into the account and stay up. cargo check passes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant