diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/.openspec.yaml b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/.openspec.yaml new file mode 100644 index 0000000..81cd71f --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/.openspec.yaml @@ -0,0 +1,2 @@ +schema: spec-driven +created: 2026-05-11 diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/01-interactive-recovery.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/01-interactive-recovery.md new file mode 100644 index 0000000..9878ac3 --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/01-interactive-recovery.md @@ -0,0 +1,57 @@ +# Gap 01 — Interactive recovery verb (`gx recover`) + +## Problem + +When an agent lane stalls, the current answer is `scripts/agent-autofinish-watch.sh --auto-merge`, which after 15 minutes of file silence commits whatever is in the worktree, pushes, and tries to merge. That is a daemon for the *common* case (idle worktree, clean intent). It is the wrong tool when the lane is actually broken: dirty submodule, missing lock claim, unmerged PR check failures, raced base-branch push, or a half-applied edit the agent meant to revert. + +There is no command that lets a human (or a fresh agent) ask: "this branch is stuck — tell me *why*, and propose the narrow fix." + +## Evidence in current code + +- `scripts/agent-autofinish-watch.sh` (commit-then-push pipeline, idle-driven). +- `bin/agent-stalled-report.sh` (wired as `SessionStart` hook — emits one-line summary per stalled branch). +- `src/agents/status.js` already builds rich `Session` records via `buildAgentsStatusPayload(repoRoot)` including `worktreeExists`, `lockCount`, `claimedFiles`, `changedFiles`, `prUrl`, `prState`. +- Attention inbox at session start: 3 codex lanes stalled at 9 m, 14 m, 33 m old — recurring real-world signal. 
+ +## Proposed CLI surface + +```bash +gx recover # diagnostic mode: print causes, no actions +gx recover --apply # take the safest single recommended action +gx recover --dry-run # equivalent to bare invocation (default) +gx recover --all # diagnose every stranded branch +gx recover --json # machine-readable for cockpit / dashboard +``` + +Diagnostic output buckets the lane into one of: + +- `clean-idle` → recommend `gx branch finish ... --via-pr --wait-for-merge --cleanup`. +- `dirty-uncommitted` → recommend `git -C commit -am "wip recover"` then finish. +- `unclaimed-files` → list the unclaimed paths, recommend `gx locks claim ...`. +- `submodule-pointer-drift` → recommend `gx submodule advance` inside the worktree. +- `pr-open-checks-red` → link the PR + failing check name. +- `worktree-missing` → recommend `git worktree prune` + branch deletion if no commits exist. +- `unknown` → dump the raw evidence and STOP (do not auto-apply). + +## Tier / effort + +- **Tier**: T2. +- **Effort**: ~6 files / ~1 day. New `src/recover/index.js` + dispatch entry in `src/cli/main.js` + arg parsing in `src/cli/args.js` + tests + manifest entry. Reuses `agents/status.js`, `git/index.js`, and `submoduleModule` primitives. + +## Dependencies + +None. Ships independently. Pairs well with Gap 03 (stranded filter) but does not depend on it. + +## Open questions + +- Should `gx recover --apply` run the finish pipeline directly, or print the exact command and let the user/agent paste? Lean **print**, since the recovery verb is meant to be diagnostic-first. +- Should it touch other agents' lanes (`--all`)? Probably yes, but with `--apply` refusing to act on lanes whose lock owner is not the current user/agent. +- Where do we surface unmerged-PR check failures? Likely via `gh pr checks` shell-out, gated on `gh` presence. + +## Acceptance criteria + +- [ ] `gx recover ` exits 0 with a human-readable diagnosis for every state in the bucket list. 
+- [ ] `gx recover --json` returns `{ branch, state, evidence: {...}, recommended: { command: "...", reason: "..." } }`. +- [ ] `gx recover --all` iterates every `agent/*` branch (current-user-owned) and prints one diagnosis per lane. +- [ ] `--apply` performs the recommended action only for the `clean-idle`, `dirty-uncommitted`, and `unclaimed-files` states; refuses on the rest with an explanation. +- [ ] Regression test covers each bucket against a fixture worktree. diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/02-structured-observability.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/02-structured-observability.md new file mode 100644 index 0000000..1638e3c --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/02-structured-observability.md @@ -0,0 +1,59 @@ +# Gap 02 — Structured observability + +## Problem + +`gx` has no first-class machine-readable surface. `gx agents --json` does not work as a flat flag: on the current `main` branch it returns `Unknown agents subcommand: --json`. There is no append-only event log of lane lifecycle events (`branch-start`, `lock-claim`, `commit`, `push`, `pr-open`, `merge`, `cleanup`, `stall-detected`). External tools (the Colony planner UI, cockpit, dashboards, scripts that want to watch CI) fall back to parsing human-readable text or scanning `.omc/agent-worktrees/` directly. + +The cost shows up in two places: planner cards do not refresh on real state changes, and stalled-lane recovery (Gap 01) has to re-derive history every time instead of reading an append-only stream. + +## Evidence in current code + +- `src/agents/status.js:86 buildAgentsStatusPayload()` already returns a structured `{ schemaVersion: 1, repoRoot, sessions: [...] }` payload; the data exists, it is just not exposed at the top-level CLI. 
+- `src/agents/status.js:110 renderAgentsStatus(payload, { json: true })` returns the JSON string; nothing calls it from `gx agents` directly. +- `src/cli/main.js:2692 function agents(rawArgs)` dispatches subcommands only. +- Repeated hand-grepping of `.omc/agent-worktrees/*/manifest.json` in `scripts/agent-autofinish-watch.sh` and `bin/agent-stalled-report.sh`. + +## Proposed CLI surface + +```bash +gx agents --json # flat surface; same payload as today's subcommand +gx status --json # repo-level health (delegates to scan + agents) +gx events tail [--since=15m] [--branch=...] # stream of NDJSON events +gx events log [--last=100] [--branch=...] # bounded historical query +``` + +Event log shape (NDJSON, one event per line, append-only file at `.omc/events.ndjson`): + +```json +{"ts":"2026-05-11T16:08:00Z","kind":"branch-start","branch":"agent/claude/...","agent":"claude-opus","tier":"T2","worktree":".omc/agent-worktrees/..."} +{"ts":"2026-05-11T16:09:14Z","kind":"lock-claim","branch":"agent/claude/...","files":["proposal.md","tasks.md"]} +{"ts":"2026-05-11T16:30:02Z","kind":"pr-open","branch":"agent/claude/...","prUrl":"https://github.com/recodeee/gitguardex/pull/NNN"} +{"ts":"2026-05-11T16:35:11Z","kind":"merge","branch":"agent/claude/...","sha":"abc1234"} +``` + +Writer is the `gx branch start/finish`, `gx locks claim`, and the finish pipeline. Events file is gitignored by default and rotated by size in a follow-up. + +## Tier / effort + +- **Tier**: T2. +- **Effort**: ~10 files / ~1.5 days. New `src/events/index.js` (append/tail/log helpers) + integration calls from `start.js`, `finish/index.js`, `locks` writers + flat `--json` dispatch on `gx agents` and `gx status` + tests. + +## Dependencies + +None to ship. Unblocks Gap 04 (`gx resolve` reads the event log to detect repeated collisions) and Gap 05 (lock enforcement layer needs an audit trail anyway). + +## Open questions + +- Append-only file vs. SQLite? 
Lean **NDJSON**: zero deps, easy to tail in bash, easy to rotate. +- Per-repo only, or also a user-level `~/.gx/events.ndjson` aggregation? Start per-repo; user-level is a follow-up. +- Schema versioning: every event MUST include `schemaVersion: 1`. Bumps require an entry in `roadmap.md`. + +## Acceptance criteria + +- [ ] `gx agents --json` (flat) returns the same payload as the legacy subcommand, exits 0. +- [ ] `gx status --json` returns `{ schemaVersion, repoRoot, doctor: {...}, agents: {...} }`. +- [ ] `gx events tail` streams new events as they are written; `--since=15m` filters. +- [ ] `gx events log --last=100` returns the most recent N events in newest-first order. +- [ ] All write paths (`branch start`, `branch finish`, `locks claim/release`, `cleanup`) emit events. +- [ ] `.gitignore` updated to exclude `.omc/events.ndjson` and any rotated `.omc/events-*.ndjson`. +- [ ] Regression test: spawn a fixture branch, run the full lifecycle, assert event sequence. diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/03-stranded-lane-inventory.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/03-stranded-lane-inventory.md new file mode 100644 index 0000000..7daf468 --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/03-stranded-lane-inventory.md @@ -0,0 +1,55 @@ +# Gap 03 — Stranded-lane inventory (`gx agents --stranded`) + +## Problem + +The data needed to list "lanes that have not made progress in N minutes" already exists in `src/agents/status.js`. What is missing is a filter and an explicit "stranded" classification. Today, to find them, you either read the attention inbox (which is Colony state, not gx state), or you eyeball `git worktree list` and infer from commit ages. + +Concrete present-tense pain: this session's attention inbox shows 3 stranded codex lanes (9 m, 14 m, 33 m old). 
The Claude session-start hook (`bin/agent-stalled-report.sh`) surfaces them but does not let you query, sort, or feed them to another tool. + +## Evidence in current code + +- `src/agents/status.js:86 buildAgentsStatusPayload()` already returns per-session `activity`, `worktreeExists`, `changedFiles`, `lockCount`. None of these are exposed as filter knobs. +- `scripts/agent-autofinish-watch.sh` has its own ad-hoc "stranded" definition based on file-mtime silence, which duplicates logic. +- `bin/agent-stalled-report.sh` is a one-shot wrapper that does not accept filters. +- `gx agents` (no args) prints "Agent sessions: none" in the primary checkout; the data is per-worktree but the listing is not. + +## Proposed CLI surface + +```bash +gx agents # existing behavior, unchanged +gx agents --stranded # only lanes whose last activity > 15m ago +gx agents --stranded --age=30m # custom threshold +gx agents --stranded --json # machine-readable for cockpit / planner +gx agents --owned-by claude-opus # filter by agent (orthogonal to --stranded) +gx agents --tier T2 # filter by tier +gx agents --no-pr # lanes with no PR URL yet +``` + +The "stranded" definition codifies what `agent-autofinish-watch.sh` already does: + +- the most recent mtime of any file inside the worktree is older than `--age` (default 15 m), AND +- no merged PR, AND +- `worktreeExists` is true. + +## Tier / effort + +- **Tier**: T1 (≤ 5 files, single capability, no API/schema change). +- **Effort**: ~3 files / ~half day. New filter helpers in `src/agents/status.js` + new `--stranded` / `--age` / `--owned-by` / `--tier` / `--no-pr` flag parsing in `src/cli/args.js` (or wherever `agents` parses its rest) + tests. + +## Dependencies + +None. Independently shippable. Pairs naturally with Gap 01 (`gx recover --all` would internally call this filter). + +## Open questions + +- Default `--age` threshold: 15 m (matches `agent-autofinish-watch.sh --idle-seconds=900`) or shorter (5 m) for tighter feedback? 
Lean **15 m** to match existing semantics. +- Should `--stranded` exit non-zero when one or more lanes are stranded? Lean **yes**, so it composes into CI / shell pipelines (`gx agents --stranded || gx recover --all`). + +## Acceptance criteria + +- [ ] `gx agents --stranded` lists only lanes meeting the stranded criterion; empty output and exit 0 when none. +- [ ] `gx agents --stranded --json` returns the filtered payload with the same schema as the unfiltered `--json`. +- [ ] `--age=` accepts `Ns`, `Nm`, `Nh` and rejects invalid input with a clear error. +- [ ] `--owned-by`, `--tier`, `--no-pr` compose with `--stranded` via logical AND. +- [ ] `--stranded` exits non-zero (e.g. exit 2) when at least one stranded lane is found, so it works in shell guards. +- [ ] Regression test fixtures: zero lanes, one stranded lane, one active lane, mixed. diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/04-conflict-resolution-verb.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/04-conflict-resolution-verb.md new file mode 100644 index 0000000..a48bfc4 --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/04-conflict-resolution-verb.md @@ -0,0 +1,65 @@ +# Gap 04 — Conflict resolution verb (`gx resolve`) + +## Problem + +When two agent lanes both touch `.gitmodules`, a lockfile (`pnpm-lock.yaml`, `package-lock.json`, `Cargo.lock`), or a generated artifact (built CSS, generated OpenAPI client, schema dumps), the lanes collide at merge time. gx has no primitive for this. Agents drop to raw `git merge`, `git rebase --strategy-option=ours`, or `git checkout --theirs` — each of which can quietly destroy work in a sparse-checkout agent worktree. 
+ +The pattern is recurring and well-known internally: an existing `submodule-pointer-conflict-resolver` agent worktree has been sitting open with no commits for hours (visible in `git worktree list`), because the conflict-handling story is ambiguous enough that the agent declined to act. + +## Evidence in current code + +- `git worktree list` includes `agent/codex/...submodule-pointer-conflict-resolver` lanes with zero commits. +- `src/submodule/index.js` has `advance()` for forward-only pointer bumps but no merge-strategy support. +- `src/finish/index.js:241 finish()` calls `branchFinish` asset; no pre-merge conflict-resolution hook. +- Memory 5001: "submodule-pointer-conflict-resolver Worktree Is Freshly Created with No New Commits Yet" — observed pattern. + +## Proposed CLI surface + +```bash +gx resolve # inspect, print plan, no actions +gx resolve --strategy= # apply strategy +gx resolve --auto # scan whole worktree, pick strategy per path +``` + +Strategies (each path-class has one default): + +- `--strategy=submodule-tip` → for `.gitmodules` collisions: take the newer remote tip of every submodule. Refuses if either submodule has uncommitted work. +- `--strategy=lockfile-regen` → for lockfiles: delete, re-run the matching package manager (`pnpm install`, `npm install`, `cargo update`), commit the result. +- `--strategy=generated-rebuild` → for declared generated artifacts: delete, run the registered rebuild command (from `package.json` `scripts.` or `.gx/resolve.json`), commit. +- `--strategy=ours` / `--strategy=theirs` → escape hatches; warn loudly. 
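The one-default-per-path-class rule in the strategy list above could be sketched as a small lookup. This is a minimal illustration only: `defaultStrategyFor`, `LOCKFILES`, and the `generatedPaths` parameter are hypothetical names that do not exist in gx today, and the real `src/resolve/index.js` may classify paths differently.

```javascript
// Hypothetical sketch: none of these names exist in gx today.
const path = require('node:path');

// Lockfile basenames taken from the examples in the Problem section.
const LOCKFILES = new Set(['pnpm-lock.yaml', 'package-lock.json', 'Cargo.lock']);

// Returns the default strategy name for a colliding path, or null when
// no default applies and an explicit --strategy should be required.
// `generatedPaths` stands in for whatever list of declared generated
// artifacts the caller has loaded (an assumption for illustration).
function defaultStrategyFor(filePath, generatedPaths = []) {
  const base = path.posix.basename(filePath);
  if (base === '.gitmodules') return 'submodule-tip';
  if (LOCKFILES.has(base)) return 'lockfile-regen';
  if (generatedPaths.includes(filePath)) return 'generated-rebuild';
  // `ours` / `theirs` are escape hatches and are never chosen by default.
  return null;
}
```

Returning `null` instead of falling back to `ours` or `theirs` keeps the escape hatches opt-in, so a path outside the three named classes is never resolved silently.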
+ +`.gx/resolve.json` (new, optional) declares per-path strategies so `--auto` knows what to do without a flag: + +```json +{ + "rules": [ + { "path": "pnpm-lock.yaml", "strategy": "lockfile-regen", "command": "pnpm install --frozen-lockfile=false" }, + { "path": ".gitmodules", "strategy": "submodule-tip" }, + { "path": "apps/docs/openapi.json", "strategy": "generated-rebuild", "command": "pnpm --filter docs gen:openapi" } + ] +} +``` + +## Tier / effort + +- **Tier**: T2. +- **Effort**: ~8 files / ~2 days. New `src/resolve/index.js` + dispatch entry + `.gx/resolve.json` reader + per-strategy implementations + tests + docs entry in the relevant capability context. + +## Dependencies + +- **Soft on Gap 02** (Structured observability): `gx resolve` should emit `resolve-applied` events so repeat collisions are visible in `gx events log`. Ship without it if Gap 02 is not yet ready; backfill events later. +- Pre-commit hook needs an allowlist so `--strategy=lockfile-regen` commits do not trip "unclaimed files" guard mid-resolve. + +## Open questions + +- Should `gx resolve` operate only inside the agent worktree, or also on the primary checkout during a finish-time merge conflict? Lean **worktree-only**; primary stays read-only. +- Where does `gx resolve --auto` get its rule set when `.gx/resolve.json` is absent — built-in defaults only, or refuse and require explicit `--strategy`? Lean **built-in defaults for the three named path classes, refuse otherwise**. + +## Acceptance criteria + +- [ ] `gx resolve ` prints the chosen strategy and the exact commands it would run, exits 0, makes no changes. +- [ ] `gx resolve --strategy=submodule-tip` updates `.gitmodules` pointers to the latest remote tip and commits when no submodule is dirty; refuses with a clear error otherwise. +- [ ] `gx resolve --strategy=lockfile-regen` regenerates the lockfile via the registered command and commits the result. 
+- [ ] `gx resolve --auto` reads `.gx/resolve.json` (or built-in defaults) and resolves every collision in one pass. +- [ ] `--strategy=ours` / `--strategy=theirs` print a loud warning and still execute. +- [ ] Regression tests cover each strategy against a fixture worktree pair with synthesized collisions. diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/05-cross-process-lock-enforcement.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/05-cross-process-lock-enforcement.md new file mode 100644 index 0000000..83a1814 --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/05-cross-process-lock-enforcement.md @@ -0,0 +1,66 @@ +# Gap 05 — Cross-process lock enforcement + +## Problem + +`gx locks claim` writes a JSON manifest at the repo's lock-file path. The pre-commit hook later refuses commits that touch unclaimed files. Between claim and commit, nothing prevents another process (a different IDE, a different agent in a different worktree, a stray script) from saving the file. Pre-commit catches it, but **at the worst possible time**: after the user has invested edit work that now has to be reverted. + +The contract today is advisory, not enforced. For a tool whose purpose is multi-agent safety, that is the weakest link. + +## Evidence in current code + +- `src/git/index.js lockRegistryStatus(...)` reads the manifest; no fs-level enforcement. +- Pre-commit hook (in `templates/`) runs `gx locks ...` validation only on `git commit`. +- No editor extension, no fsmonitor, no LSP hook, no inotify watch. +- Memory 5006: hooks live in `.githooks` via `git config core.hooksPath`. That is an established hook surface, but it fires at pre-commit time only. + +## Proposed surface (research-required) + +This gap is the only one in the roadmap that is **not** a self-contained CLI verb. It is a platform decision: where do we hook in to enforce locks at edit time instead of at commit time? 
+ +Candidates (each has its own R&D cost): + +| Approach | Pros | Cons | +|-----------------------|-------------------------------------------------------|----------------------------------------------------------------------| +| VS Code extension | Already have `Recodee.gitguardex-active-agents` extension id reserved (see `main.js:1268`). | Editor-specific; no Vim/Emacs/Cursor/Neovim coverage by default. | +| `fsmonitor` daemon | Native git integration since 2.36; cross-editor. | Requires `git config core.fsmonitor` adoption; per-repo only. | +| `inotify`/`fswatch` watcher | Cross-editor; runs alongside the agent shell. | Spurious wakes; daemon lifecycle management. | +| LSP layer (Claude/Cursor) | Catches the *intent* to edit, not just the save. | Tied to specific clients; not portable. | +| Pre-write `EDITOR` wrapper | Minimal infra; just wraps `$EDITOR`. | Bypassed by IDE saves; only catches terminal editors. | + +Hybrid recommendation: **fsmonitor + IDE extension**. Fsmonitor catches saves from anywhere; IDE extension provides early UI warnings before save. + +CLI surface (small, only ships once a backend is chosen): + +```bash +gx locks watch --branch # foreground daemon for one branch +gx locks watch --all # daemon for every owned branch +gx locks doctor # report which enforcement path is active +``` + +## Tier / effort + +- **Tier**: T3. +- **Effort**: multi-week. Includes a design doc, ADR on enforcement backend, prototype, cross-editor testing, fallback strategy. + +## Dependencies + +- **Hard on Gap 02** (Structured observability): every blocked-save must emit a `lock-violation-blocked` event so we can prove enforcement is working without re-instrumenting. +- **Soft on Gap 01** (Interactive recovery): `gx recover` should know how to read enforcement-violation events. + +## Open questions + +- Do we ship a watchman/fsmonitor dependency, or pure-JS inotify (`chokidar`)? 
`chokidar` keeps the install surface npm-only; watchman gets perf at the cost of platform install instructions. +- Hard-block (refuse save) vs. soft-warn (allow save, but flag in IDE and in pre-commit)? Lean **soft-warn first**, hard-block behind an opt-in flag, escalate to default-hard after one quarter of telemetry. +- Single-user enforcement (claim by username) vs. branch-scoped (claim by branch)? Today it is branch-scoped — keep that. + +## Acceptance criteria (deferred until backend chosen) + +- [ ] ADR written describing the chosen enforcement backend and rejected alternatives. +- [ ] Prototype demonstrates soft-warn for a save by a non-owner process in at least one editor. +- [ ] `gx locks doctor` reports which enforcement backend is active and whether it is healthy. +- [ ] Pre-commit hook continues to function as the last-line defense and is not removed. +- [ ] Documentation in `openspec/specs/multiagent-safety/context.md` explains the new layered model. + +## Why this is deferred + +This is the gap most prone to over-engineering. Do not start until at least Gap 01–04 have shipped and there is concrete evidence (event-log data from Gap 02) that lock-violation collisions are actually happening at edit time, not just being theoretically possible. diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/06-per-remote-trust-policy.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/06-per-remote-trust-policy.md new file mode 100644 index 0000000..5d8c7b1 --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/06-per-remote-trust-policy.md @@ -0,0 +1,60 @@ +# Gap 06 — Per-remote trust policy + +## Problem + +When a `gx branch finish` flow pushes a parent repo whose submodules live in a different GitHub org (e.g. parent repo on `recodeee/`, submodule on `Webu-PRO/lifted.sk-storefront`), the Codex/Claude host approval policy blocks the submodule push every time. 
gx has no way to declare "this remote pattern is trusted for this operation". The choice today is binary: approve every push at the host level (wide blast radius), or get blocked every time (high friction). + +## Evidence in current code + +- Session memory S715–S716 (2026-05-11 12:38, 12:41): user explicitly hit this issue with `git@github.com:Webu-PRO/lifted.sk-storefront.git` during a `gx branch finish` flow. +- `src/finish/index.js` calls `git push` for submodules without consulting any local allowlist. +- No `.gx/trust.json` or equivalent config exists in `src/context.js`. +- Codex external-approval boundary is documented in `CLAUDE.md` under "External approval boundary" — gx is required to either request narrow approval or stay blocked. + +## Proposed CLI surface + +```bash +gx config trust add # add a remote pattern (glob or regex) to the trust list +gx config trust list # show current trust list +gx config trust remove # remove +gx config trust test # exit 0 if the URL matches a trust entry +``` + +Persisted at repo-root `.gx/trust.json`: + +```json +{ + "schemaVersion": 1, + "remotes": [ + { "pattern": "git@github.com:Webu-PRO/*.git", "scopes": ["submodule-push"] }, + { "pattern": "https://github.com/recodeee/*.git", "scopes": ["push", "submodule-push", "pr-create"] } + ] +} +``` + +When `gx branch finish` is about to push to a remote, it consults the trust list. A trusted remote means gx pre-emits a structured "trusted push" event so the host (Codex/Claude) approval prompt can route to a narrower approval path. **gx does not bypass host approval** — it only annotates so external tooling can make a better call. + +## Tier / effort + +- **Tier**: T1 (≤ 5 files, single capability, no behavior change to existing flows when no trust list is configured). +- **Effort**: ~5 files / ~half day. New `src/trust/index.js` + subcommands under `gx config` + integration call in `src/finish/index.js` + tests. + +## Dependencies + +None. 
Soft pair with Gap 02 (events): the "trusted push" annotation should appear in the event log. + +## Open questions + +- Glob vs. regex pattern syntax? Lean **glob** (`fnmatch`-style) for simplicity; regex is a follow-up. +- Where does the file live: `.gx/trust.json` (per-repo) or `~/.gx/trust.json` (per-user)? Lean **both**, with repo-local overriding user-level. +- Should `gx config trust add` warn loudly when adding wildcards (`*` at start of pattern)? Yes. +- Interaction with host approval: gx **must not** present this as a way to silently auto-approve. The trust list is a *hint*, not an override. + +## Acceptance criteria + +- [ ] `gx config trust add ` writes to `.gx/trust.json` with schema version 1. +- [ ] `gx config trust list` prints the current entries with pattern, scopes, source (repo vs. user). +- [ ] `gx config trust test ` exits 0 on match, 1 on miss, prints the matched pattern. +- [ ] `gx branch finish` consults the trust list and emits a structured annotation when pushing to a trusted remote. +- [ ] gx never bypasses host approval; it only annotates. Document this loudly in the trust subcommand help text. +- [ ] Regression test: with empty trust list, behavior is identical to today. diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/07-main-js-refactor.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/07-main-js-refactor.md new file mode 100644 index 0000000..2a566c2 --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/gaps/07-main-js-refactor.md @@ -0,0 +1,62 @@ +# Gap 07 — `src/cli/main.js` refactor + +## Problem + +`src/cli/main.js` is 125 KB. It contains a hand-rolled `if (command === 'X') return Y(rest)` cascade for every top-level subcommand, alongside dozens of helper functions, constants imports, and command implementations co-located in the same file. 
Adding any new top-level verb (`gx recover`, `gx events`, `gx resolve`, `gx config`) bolts more lines onto the same dispatch. + +This is not yet broken, but the next 4 features from this roadmap (#01 recover, #02 events, #04 resolve, #06 config trust) each add a dispatch entry plus inline glue plus arg-parsing. Without intervention, `main.js` will cross 150 KB this quarter and become the single biggest review bottleneck. + +## Evidence in current code + +- `src/cli/main.js` size: ~125 KB (memory 4815, confirmed via `ls -la` in this session). +- Dispatch lines `src/cli/main.js:3911–3971` form a flat `if (command === '...')` chain with no extensibility hook. +- `src/cli/args.js` is 31 KB and parses every command's args by hand; same growth pattern. +- Existing partial extraction: `src/cli/dispatch.js` exists at only 2 KB. The seed for a registry is there but unused at the top level. + +## Proposed approach + +This is a refactor, not a feature. The proposed sequence: + +1. Introduce a `CommandRegistry` in `src/cli/dispatch.js` (already a 2 KB seed file). +2. Each top-level verb registers itself: `registry.register({ name, aliases, help, parseArgs, run })`. +3. Move one verb at a time out of `main.js` into its own module under `src/cli/commands/.js`. +4. Migration order, smallest-risk-first: `version`, `prompt`, `pr-review`, `protect`, `sync`, `release`, `report`, `migrate`, `pivot`, `ship`, `hook`, `install-agent-skills`, `worktree`, `merge`, `cleanup`, `agents`, `cockpit`, `submodule`, `locks`, `doctor`, `branch`, `finish`, `setup`/`install`/`fix`/`scan`/`status` (the cluster). Each migration is its own PR with regression coverage. +5. Final `main.js` becomes ~5 KB: import the registry, dispatch, top-level error handling. + +## Tier / effort + +- **Tier**: T3 (cross-cutting, multi-PR). +- **Effort**: multi-week. ~1 PR per verb migration; ~25 verbs visible in the current cascade. Each PR is small (one verb), but the total span is significant. 
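The registration shape named in step 2 of the proposed approach, `registry.register({ name, aliases, help, parseArgs, run })`, could be sketched roughly as follows. This is an assumption-laden illustration, not the current contents of `src/cli/dispatch.js`: the `dispatch` method, the duplicate-name check, and the placeholder `version` command are all hypothetical.

```javascript
// Hypothetical sketch of the registry; not the existing dispatch.js seed.
class CommandRegistry {
  constructor() {
    this.commands = new Map(); // name or alias -> command definition
  }

  // Field names follow the register() call quoted in the proposal.
  register({ name, aliases = [], help = '', parseArgs = (argv) => argv, run }) {
    if (typeof run !== 'function') {
      throw new TypeError(`command "${name}" needs a run function`);
    }
    const command = { name, aliases, help, parseArgs, run };
    for (const key of [name, ...aliases]) {
      if (this.commands.has(key)) {
        throw new Error(`duplicate command name or alias: ${key}`);
      }
      this.commands.set(key, command);
    }
    return this;
  }

  // Replaces the flat `if (command === 'X')` cascade with one lookup.
  dispatch(argv) {
    const [verb, ...rest] = argv;
    const command = this.commands.get(verb);
    if (!command) throw new Error(`Unknown command: ${verb}`);
    return command.run(command.parseArgs(rest));
  }
}

const registry = new CommandRegistry();
registry.register({
  name: 'version',
  aliases: ['--version'],
  help: 'Print the gx version',
  run: () => '0.0.0-example', // placeholder value for illustration only
});
```

With this shape, each verb module under `src/cli/commands/` would own its own `parseArgs` and `run`, and adding a verb would no longer touch the dispatch chain in `main.js`.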
+ +## Dependencies + +- **Hard on Gap 01** (`gx recover`) and **Gap 03** (`--stranded` filter) shipping first. Reasoning: those two PRs prove the existing dispatch is still expressive enough; only after they land do we know whether the registry refactor is needed or premature. +- **Hard on stable test coverage**. The refactor must not regress any verb. Either snapshot tests for every CLI surface, or a comprehensive integration-test suite, must exist before starting verb migration PR #1. + +## Why this is deferred (loudly) + +`main.js` being 125 KB is a smell, not yet a bug. The refactor: + +- Is the second-most likely thing to introduce regressions in `gx` itself. +- Provides zero new user value (no new verb, no new flag, no new safety). +- Locks the repo into review-heavy PRs for weeks. + +Do **not** start this refactor until: + +1. At least 3 of the other 6 gaps have shipped, **and** +2. The pain of bolting on the next feature has been concretely felt by an agent (PR comment, blocker handoff, or readability complaint), **not** anticipated. + +If neither condition holds, this gap stays open as documentation, not as a backlog item. + +## Acceptance criteria (only meaningful once kicked off) + +- [ ] ADR written justifying the registry pattern and the chosen migration order. +- [ ] `src/cli/dispatch.js` exposes a `CommandRegistry` API. +- [ ] First verb migration PR (smallest verb) is reviewable in < 200 LOC of diff. +- [ ] Regression test asserts every pre-refactor verb still works identically post-refactor (snapshot of help text + a smoke invocation per verb). +- [ ] No public surface change visible to users between refactor PRs. +- [ ] Final `main.js` (post-migration) is < 10 KB. + +## Caveat + +The 125 KB number is a measurement, not a target. If `main.js` shrinks for organic reasons (verbs that get extracted as part of feature work), this gap may close itself without a dedicated refactor sprint. Re-evaluate before starting. 
diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/proposal.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/proposal.md new file mode 100644 index 0000000..92e2215 --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/proposal.md @@ -0,0 +1,34 @@ +## Why + +Seven distinct gaps in gitguardex (`gx`) surfaced during a session-level conversation on 2026-05-11. Each is large enough to deserve its own future change, but small enough to ship independently. Without a written backlog, they will either be forgotten or get re-discovered next session at full cost. + +The roadmap converts the conversation into a reviewable, prioritized list of future changes so the user can: + +- See all 7 gaps in one place with tier (`T1`/`T2`/`T3`), effort, dependencies, and current state of the world. +- Skim each gap's `gaps/NN-*.md` and decide which to fund next session (or hand off). +- Reuse the gap docs as the proposal seed for a real T1/T2 change when work starts. + +## What Changes + +This change is **documentation-only** and ships no source code or CLI surface changes. + +It adds: + +- `roadmap.md` — priority-sorted index of all 7 gaps with one-line summary per row. +- `gaps/01-interactive-recovery.md` +- `gaps/02-structured-observability.md` +- `gaps/03-stranded-lane-inventory.md` +- `gaps/04-conflict-resolution-verb.md` +- `gaps/05-cross-process-lock-enforcement.md` +- `gaps/06-per-remote-trust-policy.md` +- `gaps/07-main-js-refactor.md` + +Each gap doc uses a consistent template (problem, evidence in current code, proposed CLI surface, tier, effort, dependencies, open questions, acceptance criteria) so it can drop straight into a future change's `proposal.md` with minimal rewrite. + +## Impact + +- **Surfaces affected**: `openspec/changes/` only. No `src/`, no `scripts/`, no `bin/`, no `.claude/`, no package version bump. +- **Risk**: zero behavioral risk. Docs do not execute. 
+- **Rollout**: merge to `main` immediately; gaps become candidate proposals for future agent sessions. +- **Follow-ups**: each gap doc lists its own dependencies. Gaps #1, #3, #6 are independently shippable; gaps #4 and #5 depend on #2 (structured observability) for evidence. Gap #7 (refactor) should be deferred until at least one of the new verbs lands and pressure on `main.js` is real. +- **Not done here**: no source code, no tests, no `tasks.md` for the future gaps — those each get a fresh change folder when their work starts. diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/roadmap.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/roadmap.md new file mode 100644 index 0000000..03aa8e9 --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/roadmap.md @@ -0,0 +1,38 @@ +# gx gap roadmap — 2026 Q2 + +Source conversation: 2026-05-11 session where the question "what is missing in gitguardex" surfaced seven distinct gaps. This document is the prioritized index. Each row links to a `gaps/NN-*.md` proposal seed that a future change can lift verbatim. + +## Priority order + +Priority ranks combine **user impact × implementation tractability × dependency depth**. P1 ships first. + +| # | Title | Tier | Effort | Priority | Deps | One-line problem | +|----|----------------------------------------------------------------------|------|--------|----------|--------|------------------| +| 01 | [Interactive recovery verb](./gaps/01-interactive-recovery.md) | T2 | ~6 files / ~1 day | P1 | none | `agent-autofinish-watch.sh --auto-merge` is brave but blind; no `gx recover ` that surfaces *why* a lane stalled. | +| 03 | [Stranded-lane inventory](./gaps/03-stranded-lane-inventory.md) | T1 | ~3 files / ~half day | P1 | none | `gx agents` shows lanes; no `--stranded` / `--age >Nm` filter despite 3 stranded codex lanes visible in attention-inbox right now. 
| +| 02 | [Structured observability](./gaps/02-structured-observability.md) | T2 | ~10 files / ~1.5 days | P2 | none | `gx agents --json` is hidden behind a subcommand; no append-only event log; planner UI cannot stream gx state. | +| 06 | [Per-remote trust policy](./gaps/06-per-remote-trust-policy.md) | T1 | ~5 files / ~half day | P2 | none | Codex approval policy is all-or-nothing per host; submodule pushes to trusted external remotes (e.g. `Webu-PRO/lifted.sk-storefront`) block every time. | +| 04 | [Conflict resolution verb](./gaps/04-conflict-resolution-verb.md) | T2 | ~8 files / ~2 days | P3 | 02 | `submodule-pointer-conflict-resolver` worktree has been sitting idle; no `gx resolve` primitive, so agents drop to raw git for the recurring submodule/lockfile/generated-file pattern. | +| 05 | [Cross-process lock enforcement](./gaps/05-cross-process-lock-enforcement.md) | T3 | multi-week | P4 | 02 | `gx locks claim` writes a manifest, but nothing physically prevents an editor in another worktree from saving the file; pre-commit catches it, but late. | +| 07 | [`src/cli/main.js` refactor](./gaps/07-main-js-refactor.md) | T3 | multi-week | P5 | 01, 03 | 125 KB file with hand-rolled `if (command === 'X')` cascade; the next 3 features will keep growing it until it forks. Defer until at least one new verb has actually landed and pain is real. | + +## Recommended sequencing + +1. **Wave 1 (immediate, independent):** #01 Interactive recovery + #03 Stranded-lane filter. Both wrap existing primitives and can ship in a single small PR if scoped together. +2. **Wave 2 (foundation for later gaps):** #02 Structured observability. Unlocks #04 and #05 by giving them an event stream and machine-readable lane data. +3. **Wave 3 (policy):** #06 Per-remote trust. Independent of #02; can run in parallel with Wave 2. +4. **Wave 4 (behavior):** #04 Conflict resolution. Needs design; do not rush. +5. **Wave 5 (defer):** #05 Lock enforcement and #07 main.js refactor are multi-week. 
Do not start until at least #01–#04 have shipped and pressure on the refactor is concrete. + +## What is **not** on this roadmap + +These were considered and intentionally excluded so future agents do not re-propose them: + +- Adding more agent worktrees by default — the current 9-worktree pile is already too many; the answer is better recovery (#01) and lane hygiene, not more lanes. +- Replacing OpenSpec — change-driven workflow is working; gaps are in `gx` itself, not in the spec tool. +- Replacing Colony — coordination layer is out of scope for this repo. +- Renaming `multiagent-safety.js` to `gx.js` — cosmetic; not user-visible. + +## Quota note + +This roadmap was authored at 100 % weekly quota, so each gap doc deliberately stays short (< 100 lines) and front-loads the acceptance criteria so a fresh-quota session can read once and start implementing. diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/specs/gx-gaps-roadmap-2026q2/spec.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/specs/gx-gaps-roadmap-2026q2/spec.md new file mode 100644 index 0000000..40a17aa --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/specs/gx-gaps-roadmap-2026q2/spec.md @@ -0,0 +1,30 @@ +## ADDED Requirements + +### Requirement: gx Q2 2026 gap roadmap deliverable +The system SHALL ship a documentation-only roadmap that converts the seven 2026-05-11 gx gap conversations into reviewable, independently-pickable proposal seeds for future changes. + +The roadmap MUST contain: + +- A `roadmap.md` index file inside the change folder that lists every gap with: gap number, title, tier (`T1`/`T2`/`T3`), effort estimate, dependencies on other gaps, and a one-line problem statement. +- A `gaps/NN-.md` file for each of the seven gaps using a consistent template (Problem, Evidence in current code, Proposed CLI surface, Tier, Effort, Dependencies, Open questions, Acceptance criteria). 
+- Exactly seven gap files, numbered `01` through `07`, covering: interactive recovery verb, structured observability surface, stranded-lane filter, conflict-resolution verb, cross-process lock enforcement, per-remote trust policy, and `src/cli/main.js` refactor. + +The roadmap MUST NOT modify any file outside `openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/` (no `src/`, no `scripts/`, no `bin/`, no `package.json` bump, no `CHANGELOG.md`). + +#### Scenario: Index lists all seven gaps +- **WHEN** a reader opens `roadmap.md` +- **THEN** they see a single table or ordered list with exactly seven rows +- **AND** each row links to its corresponding `gaps/NN-.md` +- **AND** each row shows tier, effort, and dependencies inline. + +#### Scenario: Each gap doc is a future-proposal seed +- **WHEN** a reader opens any `gaps/NN-.md` +- **THEN** the file follows the consistent template +- **AND** the Problem section cites at least one piece of concrete evidence (file path, command output, attention-inbox state, or merged-PR reference) +- **AND** the Proposed CLI surface section names the exact subcommand or flag introduced +- **AND** the Acceptance criteria section is specific enough to be lifted into a future change's `tasks.md`. + +#### Scenario: Roadmap is docs-only +- **WHEN** `git diff --name-only main...HEAD` is inspected after the change is committed +- **THEN** every changed path is rooted under `openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/` +- **AND** no files under `src/`, `scripts/`, `bin/`, `templates/`, `.claude/`, or `package.json` appear. 
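The "Roadmap is docs-only" scenario above reduces to a prefix check over the output of `git diff --name-only main...HEAD`. A minimal sketch of that check follows; the helper name `findOutOfScopePaths` and the sample path list are assumptions made for illustration and are not part of `gx` or OpenSpec.

```javascript
// Return every changed path that is not under the allowed change folder.
// `changedPaths` stands in for the lines of `git diff --name-only main...HEAD`.
// The function name is hypothetical; it is not an existing gx helper.
function findOutOfScopePaths(changedPaths, changeRoot) {
  const prefix = changeRoot.endsWith('/') ? changeRoot : `${changeRoot}/`;
  return changedPaths
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .filter((p) => !p.startsWith(prefix));
}

const changeRoot =
  'openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08';

// Illustrative input only: the last entry is deliberately out of scope
// to show what a violation looks like.
const sampleDiff = [
  `${changeRoot}/roadmap.md`,
  `${changeRoot}/gaps/01-interactive-recovery.md`,
  'src/cli/main.js',
];

const offenders = findOutOfScopePaths(sampleDiff, changeRoot);
console.log(offenders); // [ 'src/cli/main.js' ]
```

The scenario passes when the returned list is empty. Appending the trailing slash before comparing keeps a sibling folder whose name merely starts with the same string from being treated as in scope.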
diff --git a/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/tasks.md b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/tasks.md new file mode 100644 index 0000000..30af863 --- /dev/null +++ b/openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/tasks.md @@ -0,0 +1,41 @@ +## Definition of Done + +This change is complete only when **all** of the following are true: + +- Every checkbox below is checked. +- The agent branch reaches `MERGED` state on `origin` and the PR URL + state are recorded in the completion handoff. +- If any step blocks (test failure, conflict, ambiguous result), append a `BLOCKED:` line under section 4 explaining the blocker and **STOP**. Do not tick remaining cleanup boxes; do not silently skip the cleanup pipeline. + +## Handoff + +- Handoff: change=`agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08`; branch=`agent/claude/gx-gaps-roadmap-2026q2-2026-05-11-16-08`; scope=`docs-only roadmap of 7 gap analyses for future gx changes`; action=`continue this sandbox or finish cleanup after a usage-limit/manual takeover`. +- Copy prompt: Continue `agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08` on branch `agent/claude/gx-gaps-roadmap-2026q2-2026-05-11-16-08`. Work inside the existing sandbox, review `openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/tasks.md`, continue from the current state instead of creating a new sandbox, and when the work is done run `gx branch finish --branch agent/claude/gx-gaps-roadmap-2026q2-2026-05-11-16-08 --base main --via-pr --wait-for-merge --cleanup`. + +## 1. Specification + +- [x] 1.1 Finalize proposal scope and acceptance criteria for `agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08` (proposal.md rewritten as docs-only roadmap shipping `roadmap.md` + `gaps/01..07.md`). 
+- [x] 1.2 Define normative requirements in `specs/gx-gaps-roadmap-2026q2/spec.md` (7 named gap documents required; consistent template; no source code touched). + +## 2. Implementation + +- [x] 2.1 Write `roadmap.md` index with priority-sorted table covering all 7 gaps. +- [x] 2.2 Write `gaps/01-interactive-recovery.md` (recovery verb wrapping autofinish-watch primitives). +- [x] 2.3 Write `gaps/02-structured-observability.md` (events log + flat `gx agents --json` surface). +- [x] 2.4 Write `gaps/03-stranded-lane-inventory.md` (`--stranded` filter on agents listing). +- [x] 2.5 Write `gaps/04-conflict-resolution-verb.md` (`gx resolve` for submodule/lockfile/generated-file collisions). +- [x] 2.6 Write `gaps/05-cross-process-lock-enforcement.md` (editor-layer enforcement beyond pre-commit). +- [x] 2.7 Write `gaps/06-per-remote-trust-policy.md` (per-remote allowlist for finish/push approval policy). +- [x] 2.8 Write `gaps/07-main-js-refactor.md` (split the 125 KB `src/cli/main.js` into a subcommand registry). +- [x] 2.9 No source-code edits in this change (deliberate; docs-only). + +## 3. Verification + +- [x] 3.1 Run targeted project verification: `openspec validate agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08 --type change --strict`; passed ("Change ... is valid"). +- [x] 3.2 Run `openspec validate --specs`; passed ("No items found to validate" since no main-spec deltas). +- [x] 3.3 Confirm `git status` inside the worktree only shows files under `openspec/changes/agent-claude-gx-gaps-roadmap-2026q2-2026-05-11-16-08/`; confirmed, 12 files all rooted there. + +## 4. Cleanup (mandatory; run before claiming completion) + +- [ ] 4.1 Run the cleanup pipeline: `gx branch finish --branch agent/claude/gx-gaps-roadmap-2026q2-2026-05-11-16-08 --base main --via-pr --wait-for-merge --cleanup`. This handles commit -> push -> PR create -> merge wait -> worktree prune in one invocation. +- [ ] 4.2 Record the PR URL and final merge state (`MERGED`) in the completion handoff. 
+- [ ] 4.3 Confirm the sandbox worktree is gone (`git worktree list` no longer shows the agent path; `git branch -a` shows no surviving local/remote refs for the branch).
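The worktree half of step 4.3 can be checked mechanically by scanning the output of `git worktree list --porcelain`, which prints one `branch refs/heads/<name>` line per checked-out worktree. The sketch below assumes that output has already been captured as a string; the helper name `branchHasWorktree`, the sample paths, and the `example-lane` branch are hypothetical. It does not cover the `git branch -a` half of the step.

```javascript
// Report whether any worktree in `git worktree list --porcelain` output
// still has the given branch checked out. The function name is hypothetical.
function branchHasWorktree(porcelainOutput, branch) {
  const wanted = `branch refs/heads/${branch}`;
  return porcelainOutput
    .split('\n')
    .some((line) => line.trim() === wanted);
}

// Illustrative porcelain output: paths, hashes, and the second branch are made up.
const sampleOutput = [
  'worktree /repo',
  'HEAD 1111111111111111111111111111111111111111',
  'branch refs/heads/main',
  '',
  'worktree /repo/.worktrees/example',
  'HEAD 2222222222222222222222222222222222222222',
  'branch refs/heads/agent/claude/example-lane',
  '',
].join('\n');

const finishedBranch = 'agent/claude/gx-gaps-roadmap-2026q2-2026-05-11-16-08';

// false means the sandbox worktree for the finished branch is gone.
const stillPresent = branchHasWorktree(sampleOutput, finishedBranch);
console.log(stillPresent);
```

Matching the whole `branch refs/heads/...` line, rather than searching for a substring, avoids a false positive when another lane's branch name merely contains the finished branch's name.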