
v0.22.0.0 feat(agent): agent layer v1 — codex parity, swap on the fly, second-opinion popup#69

Open
avinashjoshi wants to merge 6 commits into main from add-codex-support

@avinashjoshi

Owner

Summary

v0.22.0.0 — agent layer v1. Turns "the agent" into a first-class workspace concept.

Before this PR, canopy assumed one agent per project (claude, hard-coded). Codex existed only as a known string. This PR adds three new operating modes:

  1. Pick a non-default agent at workspace creation. canopy new --agent codex. canopy.json learns an agents: [...] allowlist; legacy agent.type still honored.
  2. Swap the running agent without losing history. canopy agent swap codex (CLI, from cwd) and A (TUI, from any workspace row) replace the running pane with a fresh pane of the new launcher, preserving byte-precise tmux geometry. Per-agent launch counter (AgentLaunches[type]int) decides fresh vs resume independently per launcher — so swap-claude-→-codex-→-claude restores the original conversation.
  3. Pop open a second-opinion popup. canopy ask <agent> [--file path | prompt] (CLI) and Q (TUI) dispatch a one-shot question against a different launcher. The TUI version writes the question to ~/.canopy/tmp/ask-*.md, spawns canopy ask in a tmux display-popup, and keeps it open after the answer renders (a read -r _ prompt so fast answers don't vanish).

Foundation pieces:

  • state.Workspace grows CurrentAgent + AgentLaunches. Both omitempty; migrated lazily from the legacy global AgentLaunchCount so pre-v0.22 state files are forward-compatible.
  • agent.Classifier interface (IsRendering, IsTrustDialog) with a real implementation for codex (table-driven against pane fixtures), stubs for opencode/aider.
  • codex launcher updated for codex-cli 0.142.2 (--instructions was dropped; uses positional [PROMPT] + resume --last).

Codex review

Independent code review caught 4 findings (2 P1, 2 P2) — all fixed in the diff, pinned with regression tests:

| # | Sev | Fix |
|---|-----|-----|
| 1 | P1 | BuildBriefing now takes agentType; reads per-agent counter so swap-to-codex gets the FULL fresh briefing |
| 2 | P1 | PlanLaunch strip-on-empty mirrors BuildArgv's guard: only pops preceding arg when it starts with -. Codex's ... --ask-for-approval on-request {{briefing}} survives the empty-briefing branch |
| 3 | P2 | isInsideWorkspace uses !strings.HasPrefix(rel, "..") instead of first-char-. test, so canopy agent swap works from .github/.config/.gstack |
| 4 | P2 | popup command passes tmpPath as positional $1, single-quoted at outer-shell layer, so $HOME with spaces / $ / ' survives |

Test plan

  • go test ./... (20 packages, all green)
  • go test -race on changed packages (clean)
  • go test -tags=e2e ./... (e2e green)
  • Dogfood canopy new --agent codex against a fresh project
  • Dogfood canopy agent swap codex and back to claude; verify conversation restored
  • Dogfood canopy ask codex "..." from CLI
  • Dogfood A and Q keybinds in TUI

Deferred

Two TODOs added to TODOS.md:

  • P2: opencode resume + Exec wiring (binary broken on dogfood machine 2026-06-25 — Could not resolve npm bin for opencode-ai)
  • P3: codex per-session-ID resume (current resume --last is global-most-recent; same caveat claude's --continue has)

🤖 Generated with Claude Code

avinashjoshi and others added 6 commits June 25, 2026 22:24
…urrentAgent state)

- new Classifier interface in internal/agent (IsRendering, IsTrustDialog)
  with real implementations for codex (table-driven against testdata
  fixtures), stub implementations for opencode/aider/claude
- state.Workspace gains CurrentAgent + AgentLaunches[type]int — per-agent
  launch counter so first-time swap to a new agent gets the full fresh
  briefing even when another agent ran first. Migrated lazily from the
  legacy AgentLaunchCount onto CurrentAgent on first read; omitempty so
  pre-v0.22 state files are forward-compatible
- agent.BuildBriefing(ws, cfg, hints, agentType) keys fresh-vs-resume on
  the per-agent counter (with legacy AgentLaunchCount fallback when the
  agent matches ws.CurrentAgent)
- codex launcher uses positional [PROMPT] + 'resume --last' (CLI 0.142.2
  dropped --instructions); strip-on-empty in PlanLaunch only pops the
  preceding arg when it starts with '-' so 'on-request' value survives
- workspace.SwapAgent orchestrates the swap: capture window-layout,
  verify launcher installed BEFORE tearing down the pane, kill old
  agent pane, persist new CurrentAgent, split new pane off the IDE,
  restore byte-precise geometry, bump per-agent counter
- canopy.json learns agents: [...] allowlist; legacy agent.type still
  honored as a single-agent fallback. Unknown keys preserved via raw
  JSON-map shim. AddAgentToCanopyJSON for auto-add on pick
- tmux.CaptureWindowLayout + SelectLayout helpers for byte-precise
  geometry restore across kill-pane + split-window

Tests: classifier_test (table-driven over launchers), briefing tests
pinning swap-to-new-agent and resume gates, launchers tests pinning
codex argv survival across the empty-briefing strip, agent_swap_test
covering the lifecycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CLI surface:
- 'canopy agent swap <type>' resolves the workspace from cwd (walks up
  through dot-prefixed subdirs like .github/.config/.gstack via the
  isInsideWorkspace helper — !strings.HasPrefix(rel, '..') instead of
  the old first-char-'.' test) and dispatches to workspace.SwapAgent
- 'canopy ask <agent> [--file path | prompt]' one-shot non-interactive
  invocation against the launcher's Exec mode. ResolveExec gate runs
  BEFORE AddAgentToCanopyJSON so opencode (no Exec) errors cleanly
  without leaving a config side-effect
- 'canopy new --agent <type>' picks the launcher at workspace creation;
  remote dispatch forwards --agent to the remote canopy
- startup sweep clears stale ~/.canopy/tmp/ask-* files
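
A minimal sketch of the containment check described for 'canopy agent swap', assuming both paths are already absolute and cleaned. The real helper's signature is not shown in this PR, so the parameters here are assumptions.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isInsideWorkspace reports whether dir is root itself or lies beneath it.
// It rejects only relative paths that climb out of root (".." prefix), so
// dot-prefixed subdirectories such as .github are accepted.
func isInsideWorkspace(root, dir string) bool {
	rel, err := filepath.Rel(root, dir)
	if err != nil {
		return false
	}
	return !strings.HasPrefix(rel, "..")
}

func main() {
	root := "/work/proj"
	for _, dir := range []string{
		"/work/proj",
		"/work/proj/.github/workflows",
		"/work/other",
	} {
		fmt.Println(dir, isInsideWorkspace(root, dir))
	}
}
```

The old first-character test rejected both "." (the root itself) and ".github". One caveat of the prefix form: a directory literally named "..foo" would be treated as outside, so a stricter variant would compare against ".." and "../" explicitly.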

TUI surface:
- 'A' opens agentSwapPickerMode (lists project's agents: allowlist plus
  any installed-but-unlisted launcher; picking the latter silently
  auto-adds it to canopy.json)
- 'Q' opens askPickerMode → askInputMode (textarea) → Ctrl+S writes the
  question to ~/.canopy/tmp/ask-<hex>.md (atomic tmpfile + rename) and
  spawns 'canopy ask' inside a tmux display-popup
- popup body wraps with '; read -r _' so fast answers don't vanish
  before the user can read them; tmpPath passed as a positional shell
  argument (bash -c '...' _ '<path>') with POSIX single-quoting so a
  $HOME with spaces / $ / ' survives intact

Tests: cmd/canopy/agent_test (isInsideWorkspace table over 10 dot/dotdot
cases), ask_test (happy path + error paths), update_ask_test (popup
command shape, posixShellQuote round-trip, positional-arg pattern),
update_agent_swap_test (picker nav + Enter dispatch + cancel).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The codex review P1 #3 fix added launcher.VerifyInstalled (=exec.LookPath)
as Step 2 of SwapAgent — runs BEFORE any tmux/state mutation so an
allowed-but-not-installed launcher refuses cleanly. CI runners don't
have @openai/codex installed, so the swap tests now fail at the
VerifyInstalled gate with 'codex not found on PATH'.

Fix: stubAgentBinaries(t, "claude", "codex") creates no-op executables
in a t.TempDir() and prepends to PATH via t.Setenv. The stubs are
read-only sentinels — VerifyInstalled doesn't care what the binary
does, only that LookPath finds it.

fixtureWithAgents calls stubAgentBinaries first so every existing test
under it picks up the stubs automatically. No production-code change.

Verified locally: tests pass both WITH real codex on PATH (no stub
interference) and with PATH stripped to /usr/bin:/bin (CI simulation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… IsRendering, codex chevron, swap CLI message

bug_001 (NORMAL) — SwapAgent gave new agent EMPTY briefing on first
swap. Step 5 mutated CurrentAgent to newType before Step 6 called
BuildBriefing, so launchCountFor's `agentType == CurrentAgent`
fallback returned the legacy AgentLaunchCount (3 from claude) for
codex. BuildBriefing saw count > 0, hints == nil, returned "". codex
spawned via PlanLaunch with no positional [PROMPT] — zero context.
The CHANGELOG's headline P1 #1 fix from v0.22.0.0 was silently
broken end-to-end. Two-layer fix, plus a test-fixture update:
  - agent_swap.go Step 5 now initializes AgentLaunches[newType]=0
    when absent, so launchCountFor finds the explicit zero.
  - briefing.go launchCountFor's legacy fallback only fires when
    AgentLaunches is nil entirely (genuine pre-v0.22 row, not just a
    missing key on a migrated map). Defense in depth — any caller
    forgetting to seed the per-agent key still gets the right briefing.
  - Test fixture updated to mirror production: CurrentAgent="codex"
    (the NEW agent), AgentLaunches with explicit "codex":0. Added a
    second test (BareSwapStillFresh) that documents the briefing-side
    robustness without the SwapAgent init.
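
The briefing-side rule can be sketched as follows. This is an illustration of the behaviour described above, not the actual briefing.go: the Workspace struct is trimmed to the three fields involved, and the exact signature of launchCountFor is an assumption.

```go
package main

import "fmt"

// Workspace is a hypothetical, trimmed-down stand-in for state.Workspace.
type Workspace struct {
	CurrentAgent     string
	AgentLaunchCount int            // legacy global counter
	AgentLaunches    map[string]int // per-agent counters (v0.22+)
}

// launchCountFor returns how many times agentType has been launched.
// The legacy counter is consulted only when the per-agent map is nil,
// i.e. the row was never migrated. A missing key on a migrated map
// reads as zero, which means "fresh briefing".
func launchCountFor(ws Workspace, agentType string) int {
	if ws.AgentLaunches == nil {
		if agentType == ws.CurrentAgent {
			return ws.AgentLaunchCount
		}
		return 0
	}
	return ws.AgentLaunches[agentType]
}

func main() {
	// The bug_001 shape: CurrentAgent already points at codex, the legacy
	// counter still holds claude's 3 launches.
	swapped := Workspace{
		CurrentAgent:     "codex",
		AgentLaunchCount: 3,
		AgentLaunches:    map[string]int{"claude": 3},
	}
	fmt.Println(launchCountFor(swapped, "codex"))  // 0
	fmt.Println(launchCountFor(swapped, "claude")) // 3
}
```

With the old fallback (agentType == CurrentAgent, regardless of the map), the first call would have returned 3 and the briefing would have been skipped.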

bug_008 (NORMAL) — codexClassifier.IsRendering failed in production.
codex's idle markers live at the TOP of the visible pane (banner
rows 1-6, footer row ~15), but the implementation matched against
bottomLines(content, 12). tmux capture-pane -p preserves trailing
blank rows that codex hasn't drawn into, so a typical 50-row pane
yielded 12 blank lines and the markers never matched. canopy new
--agent codex --prompt always timed out at Phase 1/2 with "Phase 1
timeout: neither trust dialog nor agent ready marker appeared in 5s".
The unit test masked the bug because readFixture stripped trailing
blanks before classification; production saw the raw shape. Fix:
trim trailing blank rows and match against the full visible content.
Added readFixtureRaw + TestCodexClassifier_IsRendering_HandlesRawPaneContent
which uses the un-trimmed fixture, plus a non-codex-pane rejection
test to guard against an over-eager match.

bug_006 (NORMAL) — normalize() didn't strip codex's › chevron.
The inputLine regex was hardcoded to claude's ❯ (U+276F), so every
keystroke into an idle codex pane flipped the normalized-content hash;
Detector.Observe ran its motion check before idle marker matching and
returned StateThinking at confidence 9. The TUI badge column said
⚡ Thinking while the user was composing. Fix: extend regex to char
class [❯›] — claude's ❯ + codex's › (U+203A). Added paired regression
tests for both chevrons so an over-eager rewrite can't drop either.
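
A minimal sketch of the normalization idea, assuming the input line is any row that starts with a prompt chevron. The exact inputLine pattern in canopy is not shown in this PR, so the regex and the contentHash helper below are illustrative only.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"regexp"
)

// inputLine matches a prompt row beginning with claude's chevron (U+276F)
// or codex's chevron (U+203A), including whatever the user has typed.
var inputLine = regexp.MustCompile(`(?m)^[ \t]*[❯›].*$`)

// normalize blanks out the prompt row so keystrokes do not count as motion.
func normalize(content string) string {
	return inputLine.ReplaceAllString(content, "")
}

// contentHash hashes the normalized pane content.
func contentHash(content string) [32]byte {
	return sha256.Sum256([]byte(normalize(content)))
}

func main() {
	before := "banner\n› fix the"
	after := "banner\n› fix the bug"
	fmt.Println(contentHash(before) == contentHash(after)) // true
}
```

If the character class held only the claude chevron, the two codex snapshots above would hash differently and every keystroke would look like agent output.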

bug_002 (NIT) — `canopy agent swap <type>` printed `Swapped <new>
→ <new>` instead of `<old> → <new>`. The success Fprintf used
ws.CurrentAgent (already mutated to newType by SwapAgent Step 5)
as the FROM. Fix: capture the row's CurrentAgent BEFORE SwapAgent
via a new currentAgentForWorkspace helper that walks mgr.List(ctx)
the same way findWorkspaceFromCwd already does.

A test expectation in TestCodexClassifier_AgainstRealFixtures
(codex_awaiting_input case) flipped from wantRender:false to
wantRender:true. The previous value was bug-compatible with the
old bottomLines(12) shape, not the semantically correct answer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>