@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-05-11
@@ -0,0 +1,57 @@
# Gap 01 — Interactive recovery verb (`gx recover`)

## Problem

When an agent lane stalls, the current answer is `scripts/agent-autofinish-watch.sh --auto-merge`, which after 15 minutes of file silence commits whatever is in the worktree, pushes, and tries to merge. That is a daemon for the *common* case (idle worktree, clean intent). It is the wrong tool when the lane is actually broken: dirty submodule, missing lock claim, unmerged PR check failures, raced base-branch push, or a half-applied edit the agent meant to revert.

There is no command that lets a human (or a fresh agent) ask: "this branch is stuck — tell me *why*, and propose the narrow fix."

## Evidence in current code

- `scripts/agent-autofinish-watch.sh` (commit-then-push pipeline, idle-driven).
- `bin/agent-stalled-report.sh` (wired as `SessionStart` hook — emits one-line summary per stalled branch).
- `src/agents/status.js` already builds rich `Session` records via `buildAgentsStatusPayload(repoRoot)` including `worktreeExists`, `lockCount`, `claimedFiles`, `changedFiles`, `prUrl`, `prState`.
- Attention inbox at session start: 3 codex lanes stalled at 9 m, 14 m, 33 m old — recurring real-world signal.

## Proposed CLI surface

```bash
gx recover <branch> # diagnostic mode: print causes, no actions
gx recover <branch> --apply # take the safest single recommended action
gx recover <branch> --dry-run # equivalent to bare invocation (default)
gx recover --all # diagnose every stranded branch
gx recover <branch> --json # machine-readable for cockpit / dashboard
```

Diagnostic output buckets the lane into one of:

- `clean-idle` → recommend `gx branch finish ... --via-pr --wait-for-merge --cleanup`.
- `dirty-uncommitted` → recommend `git -C <wt> commit -am "wip recover"` then finish.
- `unclaimed-files` → list the unclaimed paths, recommend `gx locks claim ...`.
- `submodule-pointer-drift` → recommend `gx submodule advance` inside the worktree.
- `pr-open-checks-red` → link the PR + failing check name.
- `worktree-missing` → recommend `git worktree prune` + branch deletion if no commits exist.
- `unknown` → dump the raw evidence and STOP (do not auto-apply).
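The bucket list above could be driven by a small pure classifier. The following is a minimal sketch only: `classifyLane`, the `checksFailing` and `submoduleDrift` fields, the `'OPEN'` value of `prState`, and the order of the checks are all assumptions for illustration. Only `worktreeExists`, `claimedFiles`, `changedFiles`, and `prState` are named in the evidence section.

```javascript
// Hypothetical classifier: maps one Session-like record to a recovery bucket.
// Fields marked (assumed) are not confirmed to exist in src/agents/status.js.
function classifyLane(session) {
  const {
    worktreeExists,
    changedFiles = [],
    claimedFiles = [],
    prState = null,         // (assumed) 'OPEN' denotes an open PR
    checksFailing = false,  // (assumed) true when at least one PR check is red
    submoduleDrift = false, // (assumed) true when a submodule pointer lags
  } = session;

  if (!worktreeExists) return 'worktree-missing';
  if (prState === 'OPEN' && checksFailing) return 'pr-open-checks-red';
  if (submoduleDrift) return 'submodule-pointer-drift';

  // Changed files without a matching lock claim take priority over plain dirt.
  const claimed = new Set(claimedFiles);
  const unclaimed = changedFiles.filter((file) => !claimed.has(file));
  if (unclaimed.length > 0) return 'unclaimed-files';
  if (changedFiles.length > 0) return 'dirty-uncommitted';

  if (prState === null || prState === 'OPEN') return 'clean-idle';
  // Anything else (for example an already merged PR) is left for a human.
  return 'unknown';
}
```

Keeping the classifier free of I/O would let the regression fixtures exercise each bucket without a real worktree.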

## Tier / effort

- **Tier**: T2.
- **Effort**: ~6 files / ~1 day. New `src/recover/index.js` + dispatch entry in `src/cli/main.js` + arg parsing in `src/cli/args.js` + tests + manifest entry. Reuses `agents/status.js`, `git/index.js`, and `submoduleModule` primitives.

## Dependencies

None. Ships independently. Pairs well with Gap 03 (stranded filter) but does not depend on it.

## Open questions

- Should `gx recover --apply` run the finish pipeline directly, or print the exact command and let the user/agent paste it? Lean **print**, since the recovery verb is meant to be diagnostic-first.
- Should it touch other agents' lanes (`--all`)? Probably yes, but with `--apply` refusing to act on lanes whose lock owner is not the current user/agent.
- Where do we surface unmerged-PR check failures? Likely via `gh pr checks` shell-out, gated on `gh` presence.

## Acceptance criteria

- [ ] `gx recover <branch>` exits 0 with a human-readable diagnosis for every state in the bucket list.
- [ ] `gx recover <branch> --json` returns `{ branch, state, evidence: {...}, recommended: { command: "...", reason: "..." } }`.
- [ ] `gx recover --all` iterates every `agent/*` branch (current-user-owned) and prints one diagnosis per lane.
- [ ] `--apply` performs the recommended action only for the `clean-idle`, `dirty-uncommitted`, and `unclaimed-files` states; refuses on the rest with an explanation.
- [ ] Regression test covers each bucket against a fixture worktree.
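
As an illustration of the `--json` shape and the `--apply` gate in the criteria above, here is a hedged sketch. `buildRecoverReport` and `applyDecision` are hypothetical names, and the refusal message wording is an assumption.

```javascript
// States on which --apply may act, taken from the acceptance criteria.
const APPLY_ALLOWED = new Set(['clean-idle', 'dirty-uncommitted', 'unclaimed-files']);

// Hypothetical helper: assemble the --json payload for one lane.
function buildRecoverReport(branch, state, evidence, recommended) {
  return { branch, state, evidence, recommended };
}

// Decide whether --apply may run; returns an explanation when it refuses.
function applyDecision(report) {
  if (APPLY_ALLOWED.has(report.state)) {
    return { allowed: true, reason: null };
  }
  return {
    allowed: false,
    reason: `--apply refuses to act on state "${report.state}"; follow the printed recommendation manually.`,
  };
}
```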
@@ -0,0 +1,59 @@
# Gap 02 — Structured observability

## Problem

`gx` has no first-class machine-readable surface. `gx agents --json` does not work as a flat flag: on the current `main` branch it returns `Unknown agents subcommand: --json`. There is no append-only event log of lane lifecycle events (`branch-start`, `lock-claim`, `commit`, `push`, `pr-open`, `merge`, `cleanup`, `stall-detected`). External tools (the Colony planner UI, cockpit, dashboards, scripts that want to watch CI) fall back to parsing human-readable text or scanning `.omc/agent-worktrees/` directly.

The cost shows up in two places: planner cards do not refresh on real state changes, and stalled-lane recovery (Gap 01) has to re-derive history every time instead of reading an append-only stream.

## Evidence in current code

- `src/agents/status.js:86 buildAgentsStatusPayload()` already returns a structured `{ schemaVersion: 1, repoRoot, sessions: [...] }` payload — the data exists, just not exposed at the top-level CLI.
- `src/agents/status.js:110 renderAgentsStatus(payload, { json: true })` returns the JSON string; nothing calls it from `gx agents` directly.
- `src/cli/main.js:2692 function agents(rawArgs)` dispatches subcommands only.
- Repeated hand-grepping of `.omc/agent-worktrees/*/manifest.json` in `scripts/agent-autofinish-watch.sh` and `bin/agent-stalled-report.sh`.

## Proposed CLI surface

```bash
gx agents --json # flat surface; same payload as today's subcommand
gx status --json # repo-level health (delegates to scan + agents)
gx events tail [--since=15m] [--branch=...] # stream of NDJSON events
gx events log [--last=100] [--branch=...] # bounded historical query
```

Event log shape (NDJSON, one event per line, append-only file at `.omc/events.ndjson`):

```json
{"schemaVersion":1,"ts":"2026-05-11T16:08:00Z","kind":"branch-start","branch":"agent/claude/...","agent":"claude-opus","tier":"T2","worktree":".omc/agent-worktrees/..."}
{"schemaVersion":1,"ts":"2026-05-11T16:09:14Z","kind":"lock-claim","branch":"agent/claude/...","files":["proposal.md","tasks.md"]}
{"schemaVersion":1,"ts":"2026-05-11T16:30:02Z","kind":"pr-open","branch":"agent/claude/...","prUrl":"https://github.com/recodeee/gitguardex/pull/NNN"}
{"schemaVersion":1,"ts":"2026-05-11T16:35:11Z","kind":"merge","branch":"agent/claude/...","sha":"abc1234"}
```

Writers are `gx branch start`, `gx branch finish`, `gx locks claim`, and the finish pipeline. The events file is gitignored by default; size-based rotation is deferred to a follow-up.
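
A minimal sketch of the append path, assuming a hypothetical `appendEvent` helper in the proposed `src/events/index.js`. It stamps `schemaVersion: 1` because the schema-versioning note in this proposal requires it. Note that `Date.prototype.toISOString()` emits milliseconds (`...:00.000Z`), which is slightly more precise than the sample timestamps.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: append one lifecycle event as a single NDJSON line.
// `event` carries at least `kind` and `branch`; other keys depend on the kind.
function appendEvent(repoRoot, event, now = new Date()) {
  const file = path.join(repoRoot, '.omc', 'events.ndjson');
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const record = { schemaVersion: 1, ts: now.toISOString(), ...event };
  // One JSON document per line; appending never rewrites earlier events.
  fs.appendFileSync(file, JSON.stringify(record) + '\n');
  return record;
}
```

Usage would look like `appendEvent(repoRoot, { kind: 'lock-claim', branch, files })` from each writer listed above.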

## Tier / effort

- **Tier**: T2.
- **Effort**: ~10 files / ~1.5 days. New `src/events/index.js` (append/tail/log helpers) + integration calls from `start.js`, `finish/index.js`, `locks` writers + flat `--json` dispatch on `gx agents` and `gx status` + tests.

## Dependencies

None to ship. Unblocks Gap 04 (`gx resolve` reads the event log to detect repeated collisions) and Gap 05 (lock enforcement layer needs an audit trail anyway).

## Open questions

- Append-only file vs. SQLite? Lean **NDJSON**: zero deps, easy to tail in bash, easy to rotate.
- Per-repo only, or also a user-level `~/.gx/events.ndjson` aggregation? Start per-repo; user-level is a follow-up.
- Schema versioning: every event MUST include `schemaVersion: 1`. Bumps require an entry in `roadmap.md`.

## Acceptance criteria

- [ ] `gx agents --json` (flat) returns the same payload as the legacy subcommand, exits 0.
- [ ] `gx status --json` returns `{ schemaVersion, repoRoot, doctor: {...}, agents: {...} }`.
- [ ] `gx events tail` streams new events as they are written; `--since=15m` filters.
- [ ] `gx events log --last=100` returns the most recent N events in newest-first order.
- [ ] All write paths (`branch start`, `branch finish`, `locks claim/release`, `cleanup`) emit events.
- [ ] `.gitignore` updated to exclude `.omc/events.ndjson` and any rotated `.omc/events-*.ndjson`.
- [ ] Regression test: spawn a fixture branch, run the full lifecycle, assert event sequence.
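
The `--last`, `--branch`, and `--since` behaviour in the criteria above can be sketched as a pure function over the file's text. `queryEvents` and its option names are hypothetical; reading the file and parsing the CLI flags are left out.

```javascript
// Hypothetical query over raw NDJSON text (oldest event first, as appended).
function queryEvents(ndjsonText, { last = 100, branch = null, sinceMs = null, now = Date.now() } = {}) {
  const events = ndjsonText
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));

  const matching = events.filter((event) => {
    if (branch !== null && event.branch !== branch) return false;
    // Drop events older than the --since window.
    if (sinceMs !== null && now - Date.parse(event.ts) > sinceMs) return false;
    return true;
  });

  // The file is append-only, so reversing yields newest-first order.
  return matching.reverse().slice(0, last);
}
```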
@@ -0,0 +1,55 @@
# Gap 03 — Stranded-lane inventory (`gx agents --stranded`)

## Problem

The data needed to list "lanes that have not made progress in N minutes" already exists in `src/agents/status.js`. What is missing is a filter and an explicit "stranded" classification. Today, to find them, you either read the attention inbox (which is Colony state, not gx state), or you eyeball `git worktree list` and infer from commit ages.

Concrete present-tense pain: this session's attention inbox shows 3 stranded codex lanes (9 m, 14 m, 33 m old). The Claude session-start hook (`bin/agent-stalled-report.sh`) surfaces them but does not let you query, sort, or feed them to another tool.

## Evidence in current code

- `src/agents/status.js:86 buildAgentsStatusPayload()` already returns per-session `activity`, `worktreeExists`, `changedFiles`, `lockCount`. None of these are exposed as filter knobs.
- `scripts/agent-autofinish-watch.sh` has its own ad-hoc "stranded" definition based on file-mtime silence — duplicated logic.
- `bin/agent-stalled-report.sh` is a one-shot wrapper that does not accept filters.
- `gx agents` (no args) prints "Agent sessions: none" in the primary checkout; the data is per-worktree but the listing is not.

## Proposed CLI surface

```bash
gx agents # existing behavior, unchanged
gx agents --stranded # only lanes whose last activity > 15m ago
gx agents --stranded --age=30m # custom threshold
gx agents --stranded --json # machine-readable for cockpit / planner
gx agents --owned-by claude-opus # filter by agent (orthogonal to --stranded)
gx agents --tier T2 # filter by tier
gx agents --no-pr # lanes with no PR URL yet
```

The "stranded" definition codifies what `agent-autofinish-watch.sh` already does:

- the most recent mtime of any file inside the worktree is older than `--age` (default 15 m), AND
- no merged PR, AND
- `worktreeExists` is true.
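
The three conditions above, plus the AND-composed filters from the CLI surface, might look like the following sketch. `isStranded` and `filterSessions` are hypothetical names, and the `lastFileMtimeMs`, `agent`, and `tier` fields and the `'MERGED'` value of `prState` are assumptions about the session record.

```javascript
const DEFAULT_AGE_MS = 15 * 60 * 1000; // 15 m, matching the proposed default

// Hypothetical predicate implementing the three-part stranded definition.
function isStranded(session, { now = Date.now(), ageMs = DEFAULT_AGE_MS } = {}) {
  const idleMs = now - session.lastFileMtimeMs; // (assumed field) newest file mtime
  return idleMs > ageMs
    && session.prState !== 'MERGED'             // (assumed value) no merged PR
    && session.worktreeExists === true;
}

// Hypothetical filter: every supplied flag must match (logical AND).
function filterSessions(sessions, { stranded = false, ownedBy = null, tier = null, noPr = false, ...ageOptions } = {}) {
  return sessions.filter((session) =>
    (!stranded || isStranded(session, ageOptions))
    && (ownedBy === null || session.agent === ownedBy) // (assumed field)
    && (tier === null || session.tier === tier)        // (assumed field)
    && (!noPr || !session.prUrl));
}
```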

## Tier / effort

- **Tier**: T1 (≤ 5 files, single capability, no API/schema change).
- **Effort**: ~3 files / ~half day. New filter helpers in `src/agents/status.js` + new `--stranded` / `--age` / `--owned-by` / `--tier` / `--no-pr` flag parsing in `src/cli/args.js` (or wherever `agents` parses its remaining arguments) + tests.

## Dependencies

None. Independently shippable. Pairs naturally with Gap 01 (`gx recover --all` would internally call this filter).

## Open questions

- Default `--age` threshold: 15 m (matches `agent-autofinish-watch.sh --idle-seconds=900`) or shorter (5 m) for tighter feedback? Lean **15 m** to match existing semantics.
- Should `--stranded` exit non-zero when one or more lanes are stranded? Lean **yes**, so it composes into CI / shell pipelines (`gx agents --stranded || gx recover --all`).

## Acceptance criteria

- [ ] `gx agents --stranded` lists only lanes meeting the stranded criterion; empty output and exit 0 when none.
- [ ] `gx agents --stranded --json` returns the filtered payload with the same schema as the unfiltered `--json`.
- [ ] `--age=<duration>` accepts `Ns`, `Nm`, `Nh` and rejects invalid input with a clear error.
- [ ] `--owned-by`, `--tier`, `--no-pr` compose with `--stranded` via logical AND.
- [ ] `--stranded` exits non-zero (e.g. exit 2) when at least one stranded lane is found, so it works in shell guards.
- [ ] Regression test fixtures: zero lanes, one stranded lane, one active lane, mixed.
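
A small sketch of the `--age` parsing and exit-code criteria above. `parseDuration` and `strandedExitCode` are hypothetical names, the error text is illustrative, and exit code 2 follows the "e.g." in the criterion rather than a settled decision.

```javascript
// Hypothetical parser for --age: accepts Ns, Nm, Nh and returns milliseconds.
function parseDuration(input) {
  const match = /^(\d+)([smh])$/.exec(String(input).trim());
  if (!match) {
    throw new Error(`Invalid --age value "${input}": expected Ns, Nm, or Nh (for example 30m)`);
  }
  const unitMs = { s: 1000, m: 60 * 1000, h: 60 * 60 * 1000 };
  return Number(match[1]) * unitMs[match[2]];
}

// Non-zero when at least one stranded lane was found, so shell guards compose.
function strandedExitCode(strandedCount) {
  return strandedCount > 0 ? 2 : 0;
}
```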
@@ -0,0 +1,65 @@
# Gap 04 — Conflict resolution verb (`gx resolve`)

## Problem

When two agent lanes both touch `.gitmodules`, a lockfile (`pnpm-lock.yaml`, `package-lock.json`, `Cargo.lock`), or a generated artifact (built CSS, generated OpenAPI client, schema dumps), the lanes collide at merge time. gx has no primitive for this. Agents drop to raw `git merge`, `git rebase --strategy-option=ours`, or `git checkout --theirs` — each of which can quietly destroy work in a sparse-checkout agent worktree.

The pattern is recurring and well-known internally: an existing `submodule-pointer-conflict-resolver` agent worktree has been sitting open with no commits for hours (visible in `git worktree list`), because the conflict-handling story is ambiguous enough that the agent declined to act.

## Evidence in current code

- `git worktree list` includes `agent/codex/...submodule-pointer-conflict-resolver` lanes with zero commits.
- `src/submodule/index.js` has `advance()` for forward-only pointer bumps but no merge-strategy support.
- `src/finish/index.js:241 finish()` calls the `branchFinish` asset; there is no pre-merge conflict-resolution hook.
- Memory 5001: "submodule-pointer-conflict-resolver Worktree Is Freshly Created with No New Commits Yet" — observed pattern.

## Proposed CLI surface

```bash
gx resolve <path...> # inspect, print plan, no actions
gx resolve <path...> --strategy=<name> # apply strategy
gx resolve --auto # scan whole worktree, pick strategy per path
```

Strategies (each path-class has one default):

- `--strategy=submodule-tip` → for `.gitmodules` collisions: take the newer remote tip of every submodule. Refuses if any submodule has uncommitted work.
- `--strategy=lockfile-regen` → for lockfiles: delete, re-run the matching package manager (`pnpm install`, `npm install`, `cargo update`), commit the result.
- `--strategy=generated-rebuild` → for declared generated artifacts: delete, run the registered rebuild command (from `package.json` `scripts.<key>` or `.gx/resolve.json`), commit.
- `--strategy=ours` / `--strategy=theirs` → escape hatches; warn loudly.

`.gx/resolve.json` (new, optional) declares per-path strategies so `--auto` knows what to do without a flag:

```json
{
"rules": [
{ "path": "pnpm-lock.yaml", "strategy": "lockfile-regen", "command": "pnpm install --frozen-lockfile=false" },
{ "path": ".gitmodules", "strategy": "submodule-tip" },
{ "path": "apps/docs/openapi.json", "strategy": "generated-rebuild", "command": "pnpm --filter docs gen:openapi" }
]
}
```
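
One way `--auto` could look up a rule is sketched below, assuming exact path matching and that rules from `.gx/resolve.json` take precedence over built-in defaults. `pickRule` and `BUILT_IN_RULES` are hypothetical, and the built-in commands simply mirror the package managers named in the strategy list. Generated artifacts get no built-in entry here because their rebuild command has to be registered.

```javascript
// Hypothetical built-in defaults for the path classes that need no registration.
const BUILT_IN_RULES = [
  { path: '.gitmodules', strategy: 'submodule-tip' },
  { path: 'pnpm-lock.yaml', strategy: 'lockfile-regen', command: 'pnpm install' },
  { path: 'package-lock.json', strategy: 'lockfile-regen', command: 'npm install' },
  { path: 'Cargo.lock', strategy: 'lockfile-regen', command: 'cargo update' },
];

// Return the first matching rule, or null so the caller can refuse and
// require an explicit --strategy. `config` is the parsed .gx/resolve.json.
function pickRule(filePath, config = null) {
  const rules = [...(config?.rules ?? []), ...BUILT_IN_RULES];
  return rules.find((rule) => rule.path === filePath) ?? null;
}
```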

## Tier / effort

- **Tier**: T2.
- **Effort**: ~8 files / ~2 days. New `src/resolve/index.js` + dispatch entry + `.gx/resolve.json` reader + per-strategy implementations + tests + docs entry in the relevant capability context.

## Dependencies

- **Soft on Gap 02** (Structured observability): `gx resolve` should emit `resolve-applied` events so repeat collisions are visible in `gx events log`. Ship without it if Gap 02 is not yet ready; backfill events later.
- Pre-commit hook needs an allowlist so `--strategy=lockfile-regen` commits do not trip the "unclaimed files" guard mid-resolve.

## Open questions

- Should `gx resolve` operate only inside the agent worktree, or also on the primary checkout during a finish-time merge conflict? Lean **worktree-only**; primary stays read-only.
- Where does `gx resolve --auto` get its rule set when `.gx/resolve.json` is absent — built-in defaults only, or refuse and require explicit `--strategy`? Lean **built-in defaults for the three named path classes, refuse otherwise**.

## Acceptance criteria

- [ ] `gx resolve <path>` prints the chosen strategy and the exact commands it would run, exits 0, makes no changes.
- [ ] `gx resolve <path> --strategy=submodule-tip` updates `.gitmodules` pointers to the latest remote tip and commits when no submodule is dirty; refuses with a clear error otherwise.
- [ ] `gx resolve <path> --strategy=lockfile-regen` regenerates the lockfile via the registered command and commits the result.
- [ ] `gx resolve --auto` reads `.gx/resolve.json` (or built-in defaults) and resolves every collision in one pass.
- [ ] `--strategy=ours` / `--strategy=theirs` print a loud warning and still execute.
- [ ] Regression tests cover each strategy against a fixture worktree pair with synthesized collisions.