v0.22.0.0 feat(agent): agent layer v1 — codex parity, swap on the fly, second-opinion popup#69
Open
avinashjoshi wants to merge 6 commits into
…urrentAgent state)

- new Classifier interface in internal/agent (IsRendering, IsTrustDialog) with real implementations for codex (table-driven against testdata fixtures), stub implementations for opencode/aider/claude
- state.Workspace gains CurrentAgent + AgentLaunches[type]int: a per-agent launch counter, so a first-time swap to a new agent gets the full fresh briefing even when another agent ran first. Migrated lazily from the legacy AgentLaunchCount onto CurrentAgent on first read; omitempty so pre-v0.22 state files are forward-compatible
- agent.BuildBriefing(ws, cfg, hints, agentType) keys fresh-vs-resume on the per-agent counter (with legacy AgentLaunchCount fallback when the agent matches ws.CurrentAgent)
- codex launcher uses positional [PROMPT] + 'resume --last' (CLI 0.142.2 dropped --instructions); strip-on-empty in PlanLaunch only pops the preceding arg when it starts with '-' so the 'on-request' value survives
- workspace.SwapAgent orchestrates the swap: capture window-layout, verify launcher installed BEFORE tearing down the pane, kill old agent pane, persist new CurrentAgent, split new pane off the IDE, restore byte-precise geometry, bump per-agent counter
- canopy.json learns agents: [...] allowlist; legacy agent.type still honored as a single-agent fallback. Unknown keys preserved via raw JSON-map shim. AddAgentToCanopyJSON for auto-add on pick
- tmux.CaptureWindowLayout + SelectLayout helpers for byte-precise geometry restore across kill-pane + split-window

Tests: classifier_test (table-driven over launchers), briefing tests pinning swap-to-new-agent and resume gates, launchers tests pinning codex argv survival across the empty-briefing strip, agent_swap_test covering the lifecycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
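The strip-on-empty rule described for PlanLaunch can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `planArgs` and the `agent --prompt` template are hypothetical, and the `{{briefing}}` placeholder and the codex argv are taken from the PR description. Only the rule itself (pop the preceding arg only when it starts with '-') comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// planArgs expands a launcher argv template (hypothetical helper).
// When the briefing is empty, the placeholder is dropped, and the arg
// right before it is dropped too, but only if that arg is a flag.
func planArgs(template []string, briefing string) []string {
	const placeholder = "{{briefing}}"
	out := make([]string, 0, len(template))
	for _, arg := range template {
		if arg != placeholder {
			out = append(out, arg)
			continue
		}
		if briefing != "" {
			out = append(out, briefing)
			continue
		}
		// Empty briefing: a preceding flag such as "--prompt" would be
		// left without a value, so pop it. A plain value such as
		// "on-request" belongs to an earlier flag and must stay.
		if n := len(out); n > 0 && strings.HasPrefix(out[n-1], "-") {
			out = out[:n-1]
		}
	}
	return out
}

func main() {
	codex := []string{"codex", "--ask-for-approval", "on-request", "{{briefing}}"}
	fmt.Println(planArgs(codex, "")) // [codex --ask-for-approval on-request]

	flagged := []string{"agent", "--prompt", "{{briefing}}"} // hypothetical launcher
	fmt.Println(planArgs(flagged, "")) // [agent]
}
```

Without the leading-dash guard, the empty-briefing branch would pop `on-request` and leave `--ask-for-approval` without its value.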
CLI surface:

- 'canopy agent swap <type>' resolves the workspace from cwd (walks up through dot-prefixed subdirs like .github/.config/.gstack via the isInsideWorkspace helper, which uses !strings.HasPrefix(rel, '..') instead of the old first-char-'.' test) and dispatches to workspace.SwapAgent
- 'canopy ask <agent> [--file path | prompt]' one-shot non-interactive invocation against the launcher's Exec mode. ResolveExec gate runs BEFORE AddAgentToCanopyJSON so opencode (no Exec) errors cleanly without leaving a config side-effect
- 'canopy new --agent <type>' picks the launcher at workspace creation; remote dispatch forwards --agent to the remote canopy
- startup sweep clears stale ~/.canopy/tmp/ask-* files

TUI surface:

- 'A' opens agentSwapPickerMode (lists project's agents: allowlist plus any installed-but-unlisted launcher; picking the latter silently auto-adds it to canopy.json)
- 'Q' opens askPickerMode → askInputMode (textarea) → Ctrl+S writes the question to ~/.canopy/tmp/ask-<hex>.md (atomic tmpfile + rename) and spawns 'canopy ask' inside a tmux display-popup
- popup body wraps with '; read -r _' so fast answers don't vanish before the user can read them; tmpPath passed as a positional shell argument (bash -c '...' _ '<path>') with POSIX single-quoting so a $HOME with spaces / $ / ' survives intact

Tests: cmd/canopy/agent_test (isInsideWorkspace table over 10 dot/dotdot cases), ask_test (happy path + error paths), update_ask_test (popup command shape, posixShellQuote round-trip, positional-arg pattern), update_agent_swap_test (picker nav + Enter dispatch + cancel).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
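As an illustration of the isInsideWorkspace check named above, here is a minimal sketch. The function signature and the use of `filepath.Rel` are assumptions; only the `!strings.HasPrefix(rel, "..")` test comes from the commit message.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isInsideWorkspace reports whether dir is root itself or lies below it.
// Signature is assumed; the prefix test is the one the commit describes.
func isInsideWorkspace(root, dir string) bool {
	rel, err := filepath.Rel(root, dir)
	if err != nil {
		return false
	}
	// A relative path that climbs out of root starts with "..".
	// Dot-prefixed children (".github", ".config") start with a single
	// "." followed by a name, so they pass. Caveat: a child whose name
	// itself begins with ".." (e.g. "..cache") is also rejected here.
	return !strings.HasPrefix(rel, "..")
}

func main() {
	root := "/home/me/proj"
	for _, dir := range []string{
		"/home/me/proj",
		"/home/me/proj/.github/workflows",
		"/home/me/other",
	} {
		fmt.Printf("%-35s %v\n", dir, isInsideWorkspace(root, dir))
	}
}
```

The three sample directories report true, true and false. A first-character-'.' test would misclassify the first two, since their relative paths are `.` and `.github/workflows`.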
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The codex review P1 #3 fix added launcher.VerifyInstalled (=exec.LookPath) as Step 2 of SwapAgent. It runs BEFORE any tmux/state mutation so an allowed-but-not-installed launcher refuses cleanly. CI runners don't have @openai/codex installed, so the swap tests now fail at the VerifyInstalled gate with 'codex not found on PATH'.

Fix: stubAgentBinaries(t, "claude", "codex") creates no-op executables in a t.TempDir() and prepends to PATH via t.Setenv. The stubs are read-only sentinels: VerifyInstalled doesn't care what the binary does, only that LookPath finds it. fixtureWithAgents calls stubAgentBinaries first so every existing test under it picks up the stubs automatically.

No production-code change. Verified locally: tests pass both WITH real codex on PATH (no stub interference) and with PATH stripped to /usr/bin:/bin (CI simulation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
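The PATH-stubbing technique described in this commit might look roughly like the sketch below. It is written as a plain program rather than a test so it runs on its own: `stubBinaries` and `lookupWithStub` are hypothetical names, and `os.MkdirTemp` / `os.Setenv` stand in for the `t.TempDir()` / `t.Setenv` calls the commit mentions. It assumes a Unix-like system, where `exec.LookPath` checks the executable bit.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// stubBinaries writes a no-op executable per name into dir and prepends
// dir to PATH, so exec.LookPath resolves each name to its stub.
func stubBinaries(dir string, names ...string) error {
	script := []byte("#!/bin/sh\nexit 0\n")
	for _, name := range names {
		if err := os.WriteFile(filepath.Join(dir, name), script, 0o755); err != nil {
			return err
		}
	}
	path := dir + string(os.PathListSeparator) + os.Getenv("PATH")
	return os.Setenv("PATH", path)
}

// lookupWithStub stubs one binary in a fresh temp dir and reports whether
// LookPath resolves the name to that stub. PATH is restored on return.
func lookupWithStub(name string) (bool, error) {
	dir, err := os.MkdirTemp("", "agent-stubs")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	oldPath := os.Getenv("PATH")
	defer os.Setenv("PATH", oldPath)

	if err := stubBinaries(dir, name); err != nil {
		return false, err
	}
	found, err := exec.LookPath(name)
	if err != nil {
		return false, err
	}
	return found == filepath.Join(dir, name), nil
}

func main() {
	ok, err := lookupWithStub("codex")
	fmt.Println(ok, err)
}
```

Because the stub directory is prepended, the stub wins even when a real binary of the same name is installed. Inside a test, `t.Setenv` would also take care of restoring PATH afterwards.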
# Conflicts:
#	CHANGELOG.md
#	VERSION
… IsRendering, codex chevron, swap CLI message

bug_001 (NORMAL): SwapAgent gave new agent EMPTY briefing on first swap. Step 5 mutated CurrentAgent to newType before Step 6 called BuildBriefing, so launchCountFor's `agentType == CurrentAgent` fallback returned the legacy AgentLaunchCount (3 from claude) for codex. BuildBriefing saw count > 0, hints == nil, returned "". codex spawned via PlanLaunch with no positional [PROMPT], i.e. zero context. The CHANGELOG's headline P1 #1 fix from v0.22.0.0 was silently broken end-to-end.

Two-layer fix:

- agent_swap.go Step 5 now initializes AgentLaunches[newType]=0 when absent, so launchCountFor finds the explicit zero.
- briefing.go launchCountFor's legacy fallback only fires when AgentLaunches is nil entirely (genuine pre-v0.22 row, not just a missing key on a migrated map). Defense in depth: any caller forgetting to seed the per-agent key still gets the right briefing.
- Test fixture updated to mirror production: CurrentAgent="codex" (the NEW agent), AgentLaunches with explicit "codex":0. Added a second test (BareSwapStillFresh) that documents the briefing-side robustness without the SwapAgent init.

bug_008 (NORMAL): codexClassifier.IsRendering failed in production. codex's idle markers live at the TOP of the visible pane (banner rows 1-6, footer row ~15), but the implementation matched against bottomLines(content, 12). tmux capture-pane -p preserves trailing blank rows that codex hasn't drawn into, so a typical 50-row pane yielded 12 blank lines and the markers never matched. canopy new --agent codex --prompt always timed out at Phase 1/2 with "Phase 1 timeout: neither trust dialog nor agent ready marker appeared in 5s". The unit test masked the bug because readFixture stripped trailing blanks before classification; production saw the raw shape.

Fix: trim trailing blank rows and match against the full visible content. Added readFixtureRaw + TestCodexClassifier_IsRendering_HandlesRawPaneContent which uses the un-trimmed fixture, plus a non-codex-pane rejection test to guard against an over-eager match.

bug_006 (NORMAL): normalize() didn't strip codex's › chevron. The inputLine regex was hardcoded to claude's ❯ (U+276F), so every keystroke into an idle codex pane flipped the normalized-content hash; Detector.Observe ran its motion check before idle marker matching and returned StateThinking at confidence 9. The TUI badge column said ⚡ Thinking while the user was composing.

Fix: extend regex to char class [❯›], covering claude's ❯ + codex's › (U+203A). Added paired regression tests for both chevrons so an over-eager rewrite can't drop either.

bug_002 (NIT): `canopy agent swap <type>` printed `Swapped <new> → <new>` instead of `<old> → <new>`. The success Fprintf used ws.CurrentAgent (already mutated to newType by SwapAgent Step 5) as the FROM.

Fix: capture the row's CurrentAgent BEFORE SwapAgent via a new currentAgentForWorkspace helper that walks mgr.List(ctx) the same way findWorkspaceFromCwd already does.

A test expectation in TestCodexClassifier_AgainstRealFixtures (codex_awaiting_input case) flipped from wantRender:false to wantRender:true. The previous value was bug-compatible with the old bottomLines(12) shape, not the semantic-correct answer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
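The bug_001 fallback rule can be sketched as below. This is an illustrative reduction, not the code in briefing.go: the `Workspace` struct carries only the three fields named in the commit messages, and the function signature is an assumption.

```go
package main

import "fmt"

// Workspace holds only the fields this sketch needs; the real
// state.Workspace has more.
type Workspace struct {
	CurrentAgent     string
	AgentLaunchCount int            // legacy global counter (pre-v0.22)
	AgentLaunches    map[string]int // per-agent counters
}

// launchCountFor returns how many times agentType has been launched.
// The legacy counter is consulted only when the per-agent map is nil,
// i.e. the row was never migrated. A missing key on a migrated map
// means "never launched" and yields 0, which selects a fresh briefing.
func launchCountFor(ws Workspace, agentType string) int {
	if ws.AgentLaunches == nil {
		if agentType == ws.CurrentAgent {
			return ws.AgentLaunchCount
		}
		return 0
	}
	return ws.AgentLaunches[agentType]
}

func main() {
	// Mid-swap state from the bug report: CurrentAgent already points at
	// codex, while the legacy counter still holds claude's 3 launches.
	migrated := Workspace{
		CurrentAgent:     "codex",
		AgentLaunchCount: 3,
		AgentLaunches:    map[string]int{"claude": 3},
	}
	fmt.Println(launchCountFor(migrated, "codex")) // 0

	// Genuine pre-v0.22 row: no per-agent map at all.
	legacy := Workspace{CurrentAgent: "claude", AgentLaunchCount: 3}
	fmt.Println(launchCountFor(legacy, "claude")) // 3
}
```

With the earlier behavior described in the commit, the first call would have returned 3 and codex would have received an empty briefing.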
Summary
v0.22.0.0 — agent layer v1. Turns "the agent" into a first-class workspace concept.
Before this PR, canopy assumed one agent per project (claude, hard-coded). Codex existed only as a known string. This PR adds three new operating modes:
- `canopy new --agent codex`. `canopy.json` learns an `agents: [...]` allowlist; legacy `agent.type` still honored.
- `canopy agent swap codex` (CLI, from cwd) and `A` (TUI, from any workspace row) replace the running pane with a fresh pane of the new launcher, preserving byte-precise tmux geometry. Per-agent launch counter (`AgentLaunches[type]int`) decides fresh vs resume independently per launcher, so swapping claude → codex → claude restores the original conversation.
- `canopy ask <agent> [--file path | prompt]` (CLI) and `Q` (TUI) dispatch a one-shot question against a different launcher. The TUI version writes the question to `~/.canopy/tmp/ask-*.md`, spawns `canopy ask` in a `tmux display-popup`, and keeps it open after the answer renders (a `read -r _` prompt so fast answers don't vanish).

Foundation pieces:
- `state.Workspace` grows `CurrentAgent` + `AgentLaunches`. Both `omitempty`; migrated lazily from the legacy global `AgentLaunchCount` so pre-v0.22 state files are forward-compatible.
- `agent.Classifier` interface (`IsRendering`, `IsTrustDialog`) with a real implementation for codex (table-driven against pane fixtures), stubs for opencode/aider.
- … `--instructions` was dropped; uses positional `[PROMPT]` + `resume --last`).

Codex review
Independent code review caught 4 findings (2 P1, 2 P2), all fixed in the diff and pinned with regression tests:

- `BuildBriefing` now takes `agentType`; reads per-agent counter so swap-to-codex gets the FULL fresh briefing
- `PlanLaunch` strip-on-empty mirrors `BuildArgv`'s guard: only pops preceding arg when it starts with `-`. Codex's `... --ask-for-approval on-request {{briefing}}` survives the empty-briefing branch
- `isInsideWorkspace` uses `!strings.HasPrefix(rel, "..")` instead of first-char-`.` test, so `canopy agent swap` works from `.github`/`.config`/`.gstack`
- `tmpPath` as positional `$1`, single-quoted at outer-shell layer, so a `$HOME` with spaces / `$` / `'` survives

Test plan
- `go test ./...` (20 packages, all green)
- `go test -race` on changed packages (clean)
- `go test -tags=e2e ./...` (e2e green)
- `canopy new --agent codex` against a fresh project
- `canopy agent swap codex` and back to claude; verify conversation restored
- `canopy ask codex "..."` from CLI
- `A` and `Q` keybinds in TUI

Deferred
Two TODOs added to TODOS.md:
- … Could not resolve npm bin for opencode-ai)
- … `resume --last` is global-most-recent; same caveat claude's `--continue` has)

🤖 Generated with Claude Code
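For the popup quoting fix listed under the review findings (tmpPath passed as positional `$1`, single-quoted at the outer shell layer), a minimal sketch follows. The name `posixShellQuote` appears in the PR's test list, but this body is an assumed implementation of standard POSIX single-quoting; `popupCommand` and its script text are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// posixShellQuote wraps s in single quotes. Inside single quotes the
// shell treats every character literally, so the only character that
// needs care is ' itself: close the quote, emit an escaped quote, and
// reopen ('\'').
func posixShellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// popupCommand builds the shell line for the popup (hypothetical helper).
// The path never appears inside the script text; it travels as the
// positional argument $1, with "_" filling the $0 slot.
func popupCommand(tmpPath string) string {
	script := `canopy ask codex --file "$1"; read -r _`
	return "bash -c " + posixShellQuote(script) + " _ " + posixShellQuote(tmpPath)
}

func main() {
	fmt.Println(posixShellQuote("it's"))
	fmt.Println(popupCommand("/home/o'neil/my $dir/.canopy/tmp/ask-1a2b.md"))
}
```

Keeping the path out of the script string means spaces, `$` and `'` in it are never re-parsed by the inner shell; only the outer quoting layer has to be correct.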