From 240d3362dbd43000a24b33a9b01c373cb8f08b70 Mon Sep 17 00:00:00 2001 From: screenleon Date: Thu, 2 Jul 2026 17:07:55 +0900 Subject: [PATCH 1/7] =?UTF-8?q?feat(CC-439):=20/ship=20command=20=E2=80=94?= =?UTF-8?q?=20implement=20to=20PR=20with=20a=20single=20pre-flight=20consi?= =?UTF-8?q?stency=20gate?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Encodes the existing memory-level autonomy rule as a runnable command: implement -> pr-gate -> fix -> re-gate until GO -> open PR, with the only stop point being a fundamental conflict with BACKLOG/DECISIONS. --- BACKLOG.md | 17 ++++++ commands/ship.md | 121 +++++++++++++++++++++++++++++++++++++++ scripts/test-commands.sh | 31 ++++++++++ 3 files changed, 169 insertions(+) create mode 100644 commands/ship.md diff --git a/BACKLOG.md b/BACKLOG.md index a55238d..f164803 100644 --- a/BACKLOG.md +++ b/BACKLOG.md @@ -75,6 +75,7 @@ CC-001/CC-002 were consumed by PR #24 fix bundle inline, with no standalone entr | CC-436 | 🔵 active | codex-host PreToolUse payload validation probe (read-only; verifies feasibility of the CC-381 guard binding; umbrella: CC-333) | arch/install | 2026-07-02 | — | P2 | spike | | CC-437 | 🔵 active | doctor extension slice: host-aware capability check (split a host module interface out of `doctor.sh`; umbrella: CC-333, follows CC-381) | arch/install | 2026-07-02 | — | P2 | design | | CC-438 | 🔵 active | host manifest schema v1: make the codex-host configuration surface declarative (`hosts/codex/host.yaml` + format handler; depends on CC-436; umbrella: CC-333, follows CC-381) | arch/install | 2026-07-02 | — | P2 | design | +| CC-439 | 🔵 active | `/ship ` command: implement an explicit ticket straight through to an open PR, with a pre-flight consistency check and a converging gate loop | process/DX | 2026-07-02 | — | P2 | design | --- @@ -176,6 +177,22 @@ _Terminal_ (CC-378: swept OUT to `BACKLOG-ARCHIVE.md` by `scripts/archive-closed --- +## CC-439 — `/ship ` command: implement an explicit ticket straight through to an open PR 🔵 active + +**Problem**: Today the path "receive an explicit backlog ticket → implement directly → dispatch pr-gate → fix until GO → open PR" exists only in memory and in scattered text under Rules A/B of `agents/project-pm.md`. The main thread has to remember to assemble the full flow every time, and there is no step at all for "check for conflicts with already-settled decisions before starting work". + +**Why**:
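Borrows the automation discipline (not the architecture) of [ai-night-shift](https://github.com/JudyaiLab/ai-night-shift): collapse "implement → gate → fix → PR" into one repeatable command, so that going from "hand over one explicit ticket" to "PR opened" is a single action. It also turns the only legitimate blocking point (the ticket fundamentally contradicts content already settled in BACKLOG/DECISIONS) into an explicit, executable first-step check rather than a vague self-assessment. + +**Requirement**: +- Add `commands/ship.md` (invoked as `/ship CC-NNN`), written in the existing format of `commands/pm.md`/`commands/spike.md`. Steps: (0) pre-flight consistency check: read the ticket's `BACKLOG.md` body and grep `DECISIONS.md` for `**Constraints introduced**`; on a fundamental contradiction or unmet `Dependencies`, stop and report without creating a branch; (1) create the `feat/CC-NNN` branch; (2) the main thread implements directly with Read/Edit/Write and does not dispatch implementation to codex; (3) gate loop: `pmctl gate run --executor codex` → read `Final:` → on NO-GO hand off to the `project-pm` agent for synthesis under the existing Rule A/B → fix every finding → re-run until GO; there are exactly two stop conditions (fundamental inconsistency, or the same batch of diff-caused blockers making no progress at all after the 3-strike audit); (4) `git push` + `gh pr create` (title/body template: ticket id / summary / number of gate rounds / final verdict); (5) close-out report; no automatic merge after GO. +- Add structural assertions to `scripts/test-commands.sh`: the pre-flight section exists, the gate-loop section references `Final:`/`pmctl gate run --executor codex`, the stop-condition section explicitly lists exactly two conditions, and the PR template section exists. +- Do not add `open-pr.sh` or a DECISIONS.md parsing script (the consistency judgment is LLM semantic work and is not mechanized); do not build a background daemon/cron supervisor (execution stays inside the interactive session); do not batch-scan BACKLOG to pick tickets automatically. + +**Dependencies**: No blocking dependencies. +**See**: — + +--- + ## CC-393 — design: portable-skill-substrate — CLI-agnostic skill control layer 🟢 someday **Type**: design seed (idea capture; not a milestone commitment) diff --git a/commands/ship.md b/commands/ship.md new file mode 100644 index 0000000..84f5636 --- /dev/null +++ b/commands/ship.md @@ -0,0 +1,121 @@ +--- +description: Take one explicit backlog ticket from implementation through pr-gate to an open PR, without stopping for step-by-step confirmation. +argument-hint: "" +--- + +Run a single, explicitly named ticket end-to-end: implement → gate → fix → gate +→ open PR. This is the main-thread's own default operating discipline made +runnable as one command, not a new background/unattended supervisor — the +session stays open and you stay reachable the whole time. + +**Scope**: one ticket per invocation, named in `$ARGUMENTS`.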
Do not scan +`BACKLOG.md` for other candidates or batch multiple tickets in one run. + +## The one legal stopping point + +Everywhere in this flow, the only reason to stop and ask the user instead of +continuing is: **the ticket's premise fundamentally conflicts with something +already decided** — its approach contradicts a `DECISIONS.md` entry's +`**Constraints introduced**`, a named `Dependencies` ticket is not actually +done, or (discovered mid-implementation) the ticket's own assumption turns +out to be wrong. Ordinary reviewer findings — hard gate or advisory, however +many rounds it takes — are not a stopping point; fix them and continue. This +mirrors `agents/project-pm.md`'s PR-gate verdict table and Rules A/B, which +this command invokes rather than re-implements. + +## Step 0 — Pre-flight consistency check + +Read the named ticket's full body from `BACKLOG.md` (`grep -n '^## '` +then `sed -n` the section — do not full-file Read). Extract `Problem` / +`Requirement` / `Dependencies`. + +- **Dependencies**: for every ticket referenced in `Dependencies`, confirm its + status in `BACKLOG.md`'s index table (or `BACKLOG-ARCHIVE.md` if terminal) + is actually a terminal state (`✅ done`/`✅ closed`) when the requirement + reads as a hard blocker. A dependency that is merely "related" (not a + blocker) does not stop the run. +- **Decisions**: `grep -n '\*\*Constraints introduced\*\*' DECISIONS.md` and + read the handful of entries whose `Context`/`Decision` mentions the same + subsystem, files, or concept as the ticket. Judge — as PM-level judgment, + not string matching — whether the ticket's `Requirement` asks for something + a constraint explicitly rules out, or whose premise a later decision has + superseded. `DECISIONS.md` is a human/PM-only audit record and stays that + way here: read it yourself, do not paste it into any dispatch brief. + +If either check finds a real conflict: **stop here.** Do not create a branch, +do not implement. 
Report the ticket id, the conflicting `DECISIONS.md` entry +or unmet dependency, and wait for the user's direction. + +If clear: continue to Step 1. + +## Step 1 — Branch + +`git status` first (never branch over uncommitted work silently — stash or +ask if the tree is dirty for unrelated reasons). Then: + +```bash +git checkout -b feat/ +``` + +## Step 2 — Implement + +Implement directly with Read/Edit/Write/Bash in this session — do not dispatch +implementation to codex/claude/opencode. `pmctl dispatch run` is not part of +this flow; the only executor dispatch in `/ship` is the gate's own reviewer +dispatch in Step 3. + +## Step 3 — Gate loop + +Invoke `/pr-gate` (Skill tool) for the review. Read the relayed `Final: +GO|NO-GO` verdict. + +- **GO** → go to Step 4. +- **NO-GO** → invoke the `project-pm` agent to synthesize the gate result + against the verdict table and Rules A/B in `agents/project-pm.md` (source-first + read of every cited diff file, discovery sweep of all call sites of a + flagged helper, minimum-list is a floor not a ceiling). Fix **every** finding + it returns — high, medium, and low, hard gate and advisory alike, not only + the blocking ones. Re-run `/pr-gate` with `--targeted ` for the + reviewers whose territory the fix touched. Repeat. + +**Stop the loop only when**: +1. Step 0's check would have caught this but didn't — implementation revealed + the ticket's own assumption is wrong, or a fix genuinely requires + contradicting a `DECISIONS.md` constraint; or +2. Rule A's 3-strike audit (`agents/project-pm.md`) has already run, the + remaining blockers are confirmed diff-caused (not the pre-existing issues + Rule A downgrades to `advise`+separate issue), and a further round produces + no new progress — same blockers, same state, nothing fixed. This is "no + fix is being found," not "this is taking many rounds": a real gate has + already needed 7 normal rounds to converge, and round count alone was not + a stop signal that time. 
+ +Any other NO-GO, at any round count, gets fixed and re-gated without asking. + +## Step 4 — Open the PR + +```bash +git push -u origin feat/ +gh pr create --title "(): " --body "$(cat <<'EOF' +## Summary +- + +## Gate +- Rounds: +- Final verdict: GO +- Result file: + +Ticket: +EOF +)" +``` + +Do not merge. GO is not merge authorization — merge only when the user +explicitly says so. + +## Step 5 — Close-out report + +Whether Step 0 stopped the run or Step 4 opened a PR, report: ticket id, what +changed, how many gate rounds it took, the final verdict, and the PR URL (or, +if stopped, the exact conflict found and what decision is needed from the +user). diff --git a/scripts/test-commands.sh b/scripts/test-commands.sh index d84dd8d..a90aab0 100755 --- a/scripts/test-commands.sh +++ b/scripts/test-commands.sh @@ -369,6 +369,37 @@ should_run "using-git-worktrees: documents orphan recovery via gc" && assert_fil should_run "using-git-worktrees: excludes --parallel gate reviewer isolation from scope" && assert_file_contains "using-git-worktrees: excludes --parallel gate reviewer isolation from scope" "$USING_GIT_WORKTREES" "does not touch the \`--parallel\` PR gate" && pass "using-git-worktrees: excludes --parallel gate reviewer isolation from scope" should_run "using-git-worktrees: no CC ticket references" && assert_not_contains "using-git-worktrees: no CC ticket references" "$USING_GIT_WORKTREES" "CC-" +# ── ship.md contract ───────────────────────────────────────────────────────── + +SHIP="$COMMANDS_DIR/ship.md" + +assert_frontmatter "ship: frontmatter valid" "$SHIP" +should_run "ship: scoped to a single named ticket per invocation" && assert_file_contains "ship: scoped to a single named ticket per invocation" "$SHIP" "one ticket per invocation" && pass "ship: scoped to a single named ticket per invocation" +should_run "ship: does not batch-scan BACKLOG for candidates" && assert_file_contains "ship: does not batch-scan BACKLOG for candidates" "$SHIP" "Do not scan" 
&& pass "ship: does not batch-scan BACKLOG for candidates" +# Step 0 pre-flight consistency check: the one legal stopping point +should_run "ship: has Step 0 pre-flight consistency check" && assert_file_contains "ship: has Step 0 pre-flight consistency check" "$SHIP" "Step 0" && pass "ship: has Step 0 pre-flight consistency check" +should_run "ship: checks DECISIONS.md Constraints introduced" && assert_file_contains "ship: checks DECISIONS.md Constraints introduced" "$SHIP" "Constraints introduced" && pass "ship: checks DECISIONS.md Constraints introduced" +should_run "ship: checks unmet Dependencies before starting" && assert_file_contains "ship: checks unmet Dependencies before starting" "$SHIP" "Dependencies" && pass "ship: checks unmet Dependencies before starting" +should_run "ship: conflict stops before branching or implementing" && assert_file_contains "ship: conflict stops before branching or implementing" "$SHIP" "Do not create a branch" && pass "ship: conflict stops before branching or implementing" +should_run "ship: keeps DECISIONS.md out of dispatch briefs" && assert_file_contains "ship: keeps DECISIONS.md out of dispatch briefs" "$SHIP" "do not paste it into any dispatch brief" && pass "ship: keeps DECISIONS.md out of dispatch briefs" +# implementation stays main-thread, not dispatched +should_run "ship: implementation is not dispatched to an executor" && assert_file_contains "ship: implementation is not dispatched to an executor" "$SHIP" "to codex/claude/opencode" && pass "ship: implementation is not dispatched to an executor" +# gate loop contract +should_run "ship: invokes /pr-gate for review" && assert_file_contains "ship: invokes /pr-gate for review" "$SHIP" "/pr-gate" && pass "ship: invokes /pr-gate for review" +should_run "ship: reads Final GO/NO-GO verdict" && assert_file_contains "ship: reads Final GO/NO-GO verdict" "$SHIP" "Final:" && pass "ship: reads Final GO/NO-GO verdict" +should_run "ship: NO-GO fixes every finding not only blocking 
ones" && assert_file_contains "ship: NO-GO fixes every finding not only blocking ones" "$SHIP" "the blocking ones" && pass "ship: NO-GO fixes every finding not only blocking ones" +should_run "ship: re-runs gate with targeted reviewers" && assert_file_contains "ship: re-runs gate with targeted reviewers" "$SHIP" "--targeted" && pass "ship: re-runs gate with targeted reviewers" +should_run "ship: references project-pm Rules A/B synthesis" && assert_file_contains "ship: references project-pm Rules A/B synthesis" "$SHIP" "Rules A/B" && pass "ship: references project-pm Rules A/B synthesis" +# exactly two stop conditions, no more +should_run "ship: stop condition heading enumerates the loop's halt cases" && assert_file_contains "ship: stop condition heading enumerates the loop's halt cases" "$SHIP" "Stop the loop only when" && pass "ship: stop condition heading enumerates the loop's halt cases" +should_run "ship: round count alone is not a stop signal" && assert_file_contains "ship: round count alone is not a stop signal" "$SHIP" "this is taking many rounds" && pass "ship: round count alone is not a stop signal" +should_run "ship: any other NO-GO continues without asking" && assert_file_contains "ship: any other NO-GO continues without asking" "$SHIP" "gets fixed and re-gated without asking" && pass "ship: any other NO-GO continues without asking" +# PR template +should_run "ship: opens PR via gh pr create" && assert_file_contains "ship: opens PR via gh pr create" "$SHIP" "gh pr create" && pass "ship: opens PR via gh pr create" +should_run "ship: PR body template records gate rounds and verdict" && assert_file_contains "ship: PR body template records gate rounds and verdict" "$SHIP" "Final verdict" && pass "ship: PR body template records gate rounds and verdict" +should_run "ship: GO is not merge authorization" && assert_file_contains "ship: GO is not merge authorization" "$SHIP" "GO is not merge authorization" && pass "ship: GO is not merge authorization" +should_run 
"ship: no CC ticket references" && assert_not_contains "ship: no CC ticket references" "$SHIP" "CC-[0-9]" + # ── summary ────────────────────────────────────────────────────────────────── th_summary From dcdb4fc3968355361b6a321b3a62b3d7e14c095e Mon Sep 17 00:00:00 2001 From: screenleon Date: Thu, 2 Jul 2026 17:13:08 +0900 Subject: [PATCH 2/7] fix(CC-439): ship gate step uses pmctl gate run --executor codex directly, count-check stop conditions Addresses gate-20260702-080805-deb544 critic block-soft + qa-tester block: the command and its tests referenced /pr-gate instead of the ticket's required pmctl gate run --executor codex invocation, and the stop-condition test only checked for presence, not an exact count of two. --- commands/ship.md | 13 +++++++++---- scripts/test-commands.sh | 13 +++++++++++-- 2 files changed, 20 insertions(+), 6 deletions(-) diff --git a/commands/ship.md b/commands/ship.md index 84f5636..d605305 100644 --- a/commands/ship.md +++ b/commands/ship.md @@ -66,8 +66,11 @@ dispatch in Step 3. ## Step 3 — Gate loop -Invoke `/pr-gate` (Skill tool) for the review. Read the relayed `Final: -GO|NO-GO` verdict. +Run `pmctl gate run --executor codex --cd "$PWD"` (the `/pr-gate` command is +the orchestration wrapper around this exact invocation — either entry point +is acceptable, but the underlying call is always `pmctl gate run --executor +codex`, never `bash scripts/pr-gate.sh` directly). Read the resulting `Final: +GO|NO-GO` verdict from the gate result file. - **GO** → go to Step 4. - **NO-GO** → invoke the `project-pm` agent to synthesize the gate result @@ -75,8 +78,10 @@ GO|NO-GO` verdict. read of every cited diff file, discovery sweep of all call sites of a flagged helper, minimum-list is a floor not a ceiling). Fix **every** finding it returns — high, medium, and low, hard gate and advisory alike, not only - the blocking ones. Re-run `/pr-gate` with `--targeted ` for the - reviewers whose territory the fix touched. Repeat. + the blocking ones. 
Re-run `pmctl gate run --executor codex --cd "$PWD" + --reviewers ` (the `/pr-gate` `--targeted` flag maps to this + same `--reviewers` option) for the reviewers whose territory the fix + touched. Repeat. **Stop the loop only when**: 1. Step 0's check would have caught this but didn't — implementation revealed diff --git a/scripts/test-commands.sh b/scripts/test-commands.sh index a90aab0..6aafa06 100755 --- a/scripts/test-commands.sh +++ b/scripts/test-commands.sh @@ -385,13 +385,22 @@ should_run "ship: keeps DECISIONS.md out of dispatch briefs" && assert_file_cont # implementation stays main-thread, not dispatched should_run "ship: implementation is not dispatched to an executor" && assert_file_contains "ship: implementation is not dispatched to an executor" "$SHIP" "to codex/claude/opencode" && pass "ship: implementation is not dispatched to an executor" # gate loop contract -should_run "ship: invokes /pr-gate for review" && assert_file_contains "ship: invokes /pr-gate for review" "$SHIP" "/pr-gate" && pass "ship: invokes /pr-gate for review" +should_run "ship: invokes pmctl gate run --executor codex for review" && assert_file_contains "ship: invokes pmctl gate run --executor codex for review" "$SHIP" "pmctl gate run --executor codex" && pass "ship: invokes pmctl gate run --executor codex for review" +should_run "ship: never invokes pr-gate.sh directly" && assert_file_contains "ship: never invokes pr-gate.sh directly" "$SHIP" "never \`bash scripts/pr-gate.sh\` directly" && pass "ship: never invokes pr-gate.sh directly" should_run "ship: reads Final GO/NO-GO verdict" && assert_file_contains "ship: reads Final GO/NO-GO verdict" "$SHIP" "Final:" && pass "ship: reads Final GO/NO-GO verdict" should_run "ship: NO-GO fixes every finding not only blocking ones" && assert_file_contains "ship: NO-GO fixes every finding not only blocking ones" "$SHIP" "the blocking ones" && pass "ship: NO-GO fixes every finding not only blocking ones" -should_run "ship: re-runs gate with 
targeted reviewers" && assert_file_contains "ship: re-runs gate with targeted reviewers" "$SHIP" "--targeted" && pass "ship: re-runs gate with targeted reviewers" +should_run "ship: re-runs gate with --reviewers targeting" && assert_file_contains "ship: re-runs gate with --reviewers targeting" "$SHIP" "--reviewers " && pass "ship: re-runs gate with --reviewers targeting" should_run "ship: references project-pm Rules A/B synthesis" && assert_file_contains "ship: references project-pm Rules A/B synthesis" "$SHIP" "Rules A/B" && pass "ship: references project-pm Rules A/B synthesis" # exactly two stop conditions, no more should_run "ship: stop condition heading enumerates the loop's halt cases" && assert_file_contains "ship: stop condition heading enumerates the loop's halt cases" "$SHIP" "Stop the loop only when" && pass "ship: stop condition heading enumerates the loop's halt cases" +if should_run "ship: stop-condition list has exactly two numbered cases"; then + ship_stop_count=$(grep -cE '^[0-9]+\. 
' "$SHIP") + if [[ "$ship_stop_count" -eq 2 ]]; then + pass "ship: stop-condition list has exactly two numbered cases" + else + fail "ship: stop-condition list has exactly two numbered cases" "expected exactly 2 top-level numbered items, found $ship_stop_count in $SHIP" + fi +fi should_run "ship: round count alone is not a stop signal" && assert_file_contains "ship: round count alone is not a stop signal" "$SHIP" "this is taking many rounds" && pass "ship: round count alone is not a stop signal" should_run "ship: any other NO-GO continues without asking" && assert_file_contains "ship: any other NO-GO continues without asking" "$SHIP" "gets fixed and re-gated without asking" && pass "ship: any other NO-GO continues without asking" # PR template From 471bba2d4171bc5b964a8a3e98d24f09d33f2b13 Mon Sep 17 00:00:00 2001 From: screenleon Date: Thu, 2 Jul 2026 17:18:05 +0900 Subject: [PATCH 3/7] fix(CC-439): ship gate calls use --lifecycle foreground, tests enforce pairing Addresses gate-20260702-081316-817f1c critic block-soft + qa-tester block: default pmctl gate run is detached and returns only a gate_id, so reading Final: immediately after would read a stale/missing result. /ship has no other work to interleave, so run foreground instead of the detached+wait two-call dance /pr-gate uses to keep the main thread free. --- commands/ship.md | 24 ++++++++++++++++-------- scripts/test-commands.sh | 11 +++++++++++ 2 files changed, 27 insertions(+), 8 deletions(-) diff --git a/commands/ship.md b/commands/ship.md index d605305..ec606b2 100644 --- a/commands/ship.md +++ b/commands/ship.md @@ -66,11 +66,19 @@ dispatch in Step 3. ## Step 3 — Gate loop -Run `pmctl gate run --executor codex --cd "$PWD"` (the `/pr-gate` command is -the orchestration wrapper around this exact invocation — either entry point -is acceptable, but the underlying call is always `pmctl gate run --executor -codex`, never `bash scripts/pr-gate.sh` directly). 
Read the resulting `Final: -GO|NO-GO` verdict from the gate result file. +Run `pmctl gate run --executor codex --cd "$PWD" --lifecycle foreground` (the +`/pr-gate` command is the orchestration wrapper around this exact invocation +— either entry point is acceptable, but the underlying call is always +`pmctl gate run --executor codex`, never `bash scripts/pr-gate.sh` directly). +`--lifecycle foreground` is required here: the default `--lifecycle detached` +returns only a `gate_id` immediately and the gate keeps running in the +background — reading `Final:` right after that call would read a stale or +missing result. `/ship` is already a long-running autonomous loop with +nothing else for the main thread to do while it waits, so there is no reason +to pay the detached/`pmctl gate wait` two-call complexity that `/pr-gate` +uses to keep the main thread free for other work; run `foreground` and read +the resulting `Final: GO|NO-GO` verdict directly from the gate result file +once the call returns. - **GO** → go to Step 4. - **NO-GO** → invoke the `project-pm` agent to synthesize the gate result @@ -79,9 +87,9 @@ GO|NO-GO` verdict from the gate result file. flagged helper, minimum-list is a floor not a ceiling). Fix **every** finding it returns — high, medium, and low, hard gate and advisory alike, not only the blocking ones. Re-run `pmctl gate run --executor codex --cd "$PWD" - --reviewers ` (the `/pr-gate` `--targeted` flag maps to this - same `--reviewers` option) for the reviewers whose territory the fix - touched. Repeat. + --lifecycle foreground --reviewers ` (the `/pr-gate` + `--targeted` flag maps to this same `--reviewers` option) for the reviewers + whose territory the fix touched. Repeat. **Stop the loop only when**: 1. 
Step 0's check would have caught this but didn't — implementation revealed diff --git a/scripts/test-commands.sh b/scripts/test-commands.sh index 6aafa06..ebe894a 100755 --- a/scripts/test-commands.sh +++ b/scripts/test-commands.sh @@ -387,6 +387,17 @@ should_run "ship: implementation is not dispatched to an executor" && assert_fil # gate loop contract should_run "ship: invokes pmctl gate run --executor codex for review" && assert_file_contains "ship: invokes pmctl gate run --executor codex for review" "$SHIP" "pmctl gate run --executor codex" && pass "ship: invokes pmctl gate run --executor codex for review" should_run "ship: never invokes pr-gate.sh directly" && assert_file_contains "ship: never invokes pr-gate.sh directly" "$SHIP" "never \`bash scripts/pr-gate.sh\` directly" && pass "ship: never invokes pr-gate.sh directly" +if should_run "ship: every gate invocation uses --lifecycle foreground"; then + ship_flat=$(tr '\n' ' ' < "$SHIP" | tr -s ' ') + ship_gate_calls=$(grep -oE 'pmctl gate run --executor codex --cd "\$PWD"' <<< "$ship_flat" | wc -l) + ship_foreground_calls=$(grep -oE 'pmctl gate run --executor codex --cd "\$PWD"[^`]*--lifecycle foreground' <<< "$ship_flat" | wc -l) + if [[ "$ship_gate_calls" -gt 0 && "$ship_gate_calls" -eq "$ship_foreground_calls" ]]; then + pass "ship: every gate invocation uses --lifecycle foreground" + else + fail "ship: every gate invocation uses --lifecycle foreground" "found $ship_gate_calls gate run call(s) with --cd but only $ship_foreground_calls paired with --lifecycle foreground in $SHIP" + fi +fi +should_run "ship: explains why detached+wait is unnecessary here" && assert_file_contains "ship: explains why detached+wait is unnecessary here" "$SHIP" "nothing else for the main thread to do while it waits" && pass "ship: explains why detached+wait is unnecessary here" should_run "ship: reads Final GO/NO-GO verdict" && assert_file_contains "ship: reads Final GO/NO-GO verdict" "$SHIP" "Final:" && pass "ship: reads Final 
GO/NO-GO verdict" should_run "ship: NO-GO fixes every finding not only blocking ones" && assert_file_contains "ship: NO-GO fixes every finding not only blocking ones" "$SHIP" "the blocking ones" && pass "ship: NO-GO fixes every finding not only blocking ones" should_run "ship: re-runs gate with --reviewers targeting" && assert_file_contains "ship: re-runs gate with --reviewers targeting" "$SHIP" "--reviewers " && pass "ship: re-runs gate with --reviewers targeting" From 518e39a6516e82ceede36b5d3c41d293a9a79096 Mon Sep 17 00:00:00 2001 From: screenleon Date: Thu, 2 Jul 2026 17:25:33 +0900 Subject: [PATCH 4/7] fix(CC-439): ship validates ticket id, makes dirty-tree a fail-safe not an ask path Addresses gate-20260702-081820-a08344 critic advise + qa-tester block: - Step 0 now fails fast on empty/malformed/nonexistent ticket ids, checking both BACKLOG.md and BACKLOG-ARCHIVE.md, distinct from the negotiated "fundamental inconsistency" stop. - Step 1's dirty-tree handling is now a deterministic git stash -u, not a second ask-the-user path, so "the one legal stopping point" framing holds. - Tests count all gate invocations (not just --cd-qualified ones) and count the genuine wait-for-user-direction occurrences to catch a stray ask path. --- commands/ship.md | 67 ++++++++++++++++++++++++++-------------- scripts/test-commands.sh | 23 ++++++++++++-- 2 files changed, 64 insertions(+), 26 deletions(-) diff --git a/commands/ship.md b/commands/ship.md index ec606b2..66bc80f 100644 --- a/commands/ship.md +++ b/commands/ship.md @@ -13,21 +13,36 @@ session stays open and you stay reachable the whole time. 
## The one legal stopping point -Everywhere in this flow, the only reason to stop and ask the user instead of -continuing is: **the ticket's premise fundamentally conflicts with something -already decided** — its approach contradicts a `DECISIONS.md` entry's -`**Constraints introduced**`, a named `Dependencies` ticket is not actually -done, or (discovered mid-implementation) the ticket's own assumption turns -out to be wrong. Ordinary reviewer findings — hard gate or advisory, however -many rounds it takes — are not a stopping point; fix them and continue. This -mirrors `agents/project-pm.md`'s PR-gate verdict table and Rules A/B, which -this command invokes rather than re-implements. - -## Step 0 — Pre-flight consistency check - -Read the named ticket's full body from `BACKLOG.md` (`grep -n '^## '` -then `sed -n` the section — do not full-file Read). Extract `Problem` / -`Requirement` / `Dependencies`. +Everywhere in this flow, the only reason to stop and ask the user for a +substantive discussion instead of continuing is: **the ticket's premise +fundamentally conflicts with something already decided** — its approach +contradicts a `DECISIONS.md` entry's `**Constraints introduced**`, a named +`Dependencies` ticket is not actually done, or (discovered mid-implementation) +the ticket's own assumption turns out to be wrong. Ordinary reviewer findings +— hard gate or advisory, however many rounds it takes — are not a stopping +point; fix them and continue. This mirrors `agents/project-pm.md`'s PR-gate +verdict table and Rules A/B, which this command invokes rather than +re-implements. + +This is distinct from Step 0's plain input validation and Step 1's dirty-tree +precondition below — those are deterministic fail-fast/fail-safe checks with +one predetermined outcome each, not a negotiated decision. "The one legal +stopping point" refers only to cases where the *next action is genuinely +ambiguous* and needs the user's judgment. 
+ +## Step 0 — Validate the ticket id, then check consistency + +**Ticket-id validation** (fail fast, not a discussion point): if `$ARGUMENTS` +is empty, does not match this repo's ticket-id shape (`-` per +`pm/schema.md`), or has no matching `## ` heading in either +`BACKLOG.md` or `BACKLOG-ARCHIVE.md`, stop immediately and report the exact +problem (empty argument / malformed shape / no such ticket) — this is a plain +input error, resolved by the caller supplying a valid id, not something to +deliberate about. + +**Consistency check**: once the ticket id resolves, read its full body from +`BACKLOG.md` (`grep -n '^## '` then `sed -n` the section — do not +full-file Read). Extract `Problem` / `Requirement` / `Dependencies`. - **Dependencies**: for every ticket referenced in `Dependencies`, confirm its status in `BACKLOG.md`'s index table (or `BACKLOG-ARCHIVE.md` if terminal) @@ -50,8 +65,12 @@ If clear: continue to Step 1. ## Step 1 — Branch -`git status` first (never branch over uncommitted work silently — stash or -ask if the tree is dirty for unrelated reasons). Then: +**Dirty-tree precondition** (fail-safe, not a discussion point): run `git +status` first. If the tree is dirty with changes unrelated to this ticket, +`git stash -u` before branching (note the stash in the Step 5 report so it's +easy to recover) — never branch over uncommitted work silently, and never +stop to ask about it, since stashing is reversible and there is nothing to +deliberate. ```bash git checkout -b feat/ @@ -68,8 +87,8 @@ dispatch in Step 3. Run `pmctl gate run --executor codex --cd "$PWD" --lifecycle foreground` (the `/pr-gate` command is the orchestration wrapper around this exact invocation -— either entry point is acceptable, but the underlying call is always -`pmctl gate run --executor codex`, never `bash scripts/pr-gate.sh` directly). +— either entry point is acceptable, but the underlying gate call is always +this one, never `bash scripts/pr-gate.sh` directly). 
`--lifecycle foreground` is required here: the default `--lifecycle detached` returns only a `gate_id` immediately and the gate keeps running in the background — reading `Final:` right after that call would read a stale or @@ -128,7 +147,9 @@ explicitly says so. ## Step 5 — Close-out report -Whether Step 0 stopped the run or Step 4 opened a PR, report: ticket id, what -changed, how many gate rounds it took, the final verdict, and the PR URL (or, -if stopped, the exact conflict found and what decision is needed from the -user). +Report one of three outcomes: (1) invalid ticket id — the exact problem +(empty argument / malformed shape / no such ticket); (2) consistency-check +stop — the ticket id, the conflicting `DECISIONS.md` entry or unmet +dependency, and what decision is needed from the user; or (3) PR opened — +ticket id, what changed, how many gate rounds it took, the final verdict, the +PR URL, and whether Step 1 stashed pre-existing changes. diff --git a/scripts/test-commands.sh b/scripts/test-commands.sh index ebe894a..fb95fab 100755 --- a/scripts/test-commands.sh +++ b/scripts/test-commands.sh @@ -382,6 +382,15 @@ should_run "ship: checks DECISIONS.md Constraints introduced" && assert_file_con should_run "ship: checks unmet Dependencies before starting" && assert_file_contains "ship: checks unmet Dependencies before starting" "$SHIP" "Dependencies" && pass "ship: checks unmet Dependencies before starting" should_run "ship: conflict stops before branching or implementing" && assert_file_contains "ship: conflict stops before branching or implementing" "$SHIP" "Do not create a branch" && pass "ship: conflict stops before branching or implementing" should_run "ship: keeps DECISIONS.md out of dispatch briefs" && assert_file_contains "ship: keeps DECISIONS.md out of dispatch briefs" "$SHIP" "do not paste it into any dispatch brief" && pass "ship: keeps DECISIONS.md out of dispatch briefs" +# ticket-id validation: empty / malformed / nonexistent must fail fast, 
distinct from the discussion stop +should_run "ship: validates ticket id before any other step" && assert_file_contains "ship: validates ticket id before any other step" "$SHIP" "Ticket-id validation" && pass "ship: validates ticket id before any other step" +should_run "ship: handles empty argument" && assert_file_contains "ship: handles empty argument" "$SHIP" "empty argument / malformed shape / no such ticket" && pass "ship: handles empty argument" +should_run "ship: handles malformed ticket-id shape" && assert_file_contains "ship: handles malformed ticket-id shape" "$SHIP" "does not match this repo's ticket-id shape" && pass "ship: handles malformed ticket-id shape" +should_run "ship: handles nonexistent ticket (checks both BACKLOG and archive)" && assert_file_contains "ship: handles nonexistent ticket (checks both BACKLOG and archive)" "$SHIP" "BACKLOG-ARCHIVE.md" && pass "ship: handles nonexistent ticket (checks both BACKLOG and archive)" +should_run "ship: distinguishes fail-fast validation from the discussion stop" && assert_file_contains "ship: distinguishes fail-fast validation from the discussion stop" "$SHIP" "not a discussion point" && pass "ship: distinguishes fail-fast validation from the discussion stop" +# dirty-tree precondition is deterministic fail-safe, not a second ask path +should_run "ship: dirty tree is stashed automatically, not asked about" && assert_file_contains "ship: dirty tree is stashed automatically, not asked about" "$SHIP" "git stash -u" && pass "ship: dirty tree is stashed automatically, not asked about" +should_run "ship: never stops to ask about a dirty tree" && assert_file_contains "ship: never stops to ask about a dirty tree" "$SHIP" "stop to ask about it" && pass "ship: never stops to ask about a dirty tree" # implementation stays main-thread, not dispatched should_run "ship: implementation is not dispatched to an executor" && assert_file_contains "ship: implementation is not dispatched to an executor" "$SHIP" "to 
codex/claude/opencode" && pass "ship: implementation is not dispatched to an executor" # gate loop contract @@ -389,12 +398,12 @@ should_run "ship: invokes pmctl gate run --executor codex for review" && assert_ should_run "ship: never invokes pr-gate.sh directly" && assert_file_contains "ship: never invokes pr-gate.sh directly" "$SHIP" "never \`bash scripts/pr-gate.sh\` directly" && pass "ship: never invokes pr-gate.sh directly" if should_run "ship: every gate invocation uses --lifecycle foreground"; then ship_flat=$(tr '\n' ' ' < "$SHIP" | tr -s ' ') - ship_gate_calls=$(grep -oE 'pmctl gate run --executor codex --cd "\$PWD"' <<< "$ship_flat" | wc -l) - ship_foreground_calls=$(grep -oE 'pmctl gate run --executor codex --cd "\$PWD"[^`]*--lifecycle foreground' <<< "$ship_flat" | wc -l) + ship_gate_calls=$(grep -oE 'pmctl gate run --executor codex' <<< "$ship_flat" | wc -l) + ship_foreground_calls=$(grep -oE 'pmctl gate run --executor codex[^`]*--lifecycle foreground' <<< "$ship_flat" | wc -l) if [[ "$ship_gate_calls" -gt 0 && "$ship_gate_calls" -eq "$ship_foreground_calls" ]]; then pass "ship: every gate invocation uses --lifecycle foreground" else - fail "ship: every gate invocation uses --lifecycle foreground" "found $ship_gate_calls gate run call(s) with --cd but only $ship_foreground_calls paired with --lifecycle foreground in $SHIP" + fail "ship: every gate invocation uses --lifecycle foreground" "found $ship_gate_calls occurrence(s) of the gate call but only $ship_foreground_calls paired with --lifecycle foreground in $SHIP" fi fi should_run "ship: explains why detached+wait is unnecessary here" && assert_file_contains "ship: explains why detached+wait is unnecessary here" "$SHIP" "nothing else for the main thread to do while it waits" && pass "ship: explains why detached+wait is unnecessary here" @@ -404,6 +413,14 @@ should_run "ship: re-runs gate with --reviewers targeting" && assert_file_contai should_run "ship: references project-pm Rules A/B synthesis" && 
assert_file_contains "ship: references project-pm Rules A/B synthesis" "$SHIP" "Rules A/B" && pass "ship: references project-pm Rules A/B synthesis" # exactly two stop conditions, no more should_run "ship: stop condition heading enumerates the loop's halt cases" && assert_file_contains "ship: stop condition heading enumerates the loop's halt cases" "$SHIP" "Stop the loop only when" && pass "ship: stop condition heading enumerates the loop's halt cases" +if should_run "ship: exactly one genuine wait-for-user-direction path"; then + ship_wait_count=$(grep -c "wait for the user's direction" "$SHIP") + if [[ "$ship_wait_count" -eq 1 ]]; then + pass "ship: exactly one genuine wait-for-user-direction path" + else + fail "ship: exactly one genuine wait-for-user-direction path" "expected exactly 1 wait-for-user-direction occurrence, found $ship_wait_count in $SHIP" + fi +fi if should_run "ship: stop-condition list has exactly two numbered cases"; then ship_stop_count=$(grep -cE '^[0-9]+\. ' "$SHIP") if [[ "$ship_stop_count" -eq 2 ]]; then From e008eb6bae2a305591c626d6c9f6076cd3dd0a17 Mon Sep 17 00:00:00 2001 From: screenleon Date: Thu, 2 Jul 2026 17:31:40 +0900 Subject: [PATCH 5/7] fix(CC-439): ship narrows archive lookup to error-message-only, covers branch/push Addresses gate-20260702-082540-d2de44 critic advise + qa-tester block: - BACKLOG-ARCHIVE.md is now consulted only to distinguish "already archived" from "no such ticket" in the error message; the consistency check always reads from an active BACKLOG.md heading, never the archive. - Added structural coverage for the git checkout -b and git push -u publication steps, which had no adjacent test. 
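For reviewers, the three-way lookup described above can be sketched in shell. This is only an illustration under stated assumptions, not how `/ship` is implemented (the command is a prompt, not a script): the helper name `classify_ticket`, the sample ids `T-1`/`T-2`/`T-3`, and the heading shape `## <id> ...` are all hypothetical stand-ins.

```shell
# Hypothetical helper (not part of this patch): classify a ticket id against
# an active backlog file and an archive file.
# Prints exactly one of: active, archived, missing.
# Assumption: ticket headings start with "## <id>" followed by a space or
# end of line, in both files.
classify_ticket() {
  id=$1 backlog=$2 archive=$3
  if grep -qE "^## ${id}( |\$)" "$backlog"; then
    echo active        # only this case may proceed to the consistency check
  elif grep -qE "^## ${id}( |\$)" "$archive"; then
    echo archived      # would be reported as "ticket already archived"
  else
    echo missing       # would be reported as "no such ticket"
  fi
}

# Made-up sample files; the ids below are illustrative only.
tmp=$(mktemp -d)
printf '## T-1 active work\n' > "$tmp/BACKLOG.md"
printf '## T-2 finished work\n' > "$tmp/BACKLOG-ARCHIVE.md"
for ticket in T-1 T-2 T-3; do
  echo "$ticket: $(classify_ticket "$ticket" "$tmp/BACKLOG.md" "$tmp/BACKLOG-ARCHIVE.md")"
done
rm -rf "$tmp"
```

The archive file is consulted only in the `elif` branch, so it can change the wording of the error but never supplies a ticket body to work from.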
--- commands/ship.md | 32 +++++++++++++++++++++----------- scripts/test-commands.sh | 9 +++++++-- 2 files changed, 28 insertions(+), 13 deletions(-) diff --git a/commands/ship.md b/commands/ship.md index 66bc80f..15f6628 100644 --- a/commands/ship.md +++ b/commands/ship.md @@ -32,17 +32,27 @@ ambiguous* and needs the user's judgment. ## Step 0 — Validate the ticket id, then check consistency -**Ticket-id validation** (fail fast, not a discussion point): if `$ARGUMENTS` -is empty, does not match this repo's ticket-id shape (`-` per -`pm/schema.md`), or has no matching `## ` heading in either -`BACKLOG.md` or `BACKLOG-ARCHIVE.md`, stop immediately and report the exact -problem (empty argument / malformed shape / no such ticket) — this is a plain -input error, resolved by the caller supplying a valid id, not something to -deliberate about. - -**Consistency check**: once the ticket id resolves, read its full body from -`BACKLOG.md` (`grep -n '^## '` then `sed -n` the section — do not -full-file Read). Extract `Problem` / `Requirement` / `Dependencies`. +**Ticket-id validation** (fail fast, not a discussion point): `/ship` only +ever acts on an active `BACKLOG.md` ticket — `BACKLOG-ARCHIVE.md` is +consulted solely to produce a precise error message, never as a source to +implement from. + +- If `$ARGUMENTS` is empty or does not match this repo's ticket-id shape + (`-` per `pm/schema.md`): stop and report "empty argument" or + "malformed shape". +- If there is no matching `## ` heading in `BACKLOG.md`: check + `BACKLOG-ARCHIVE.md`. A match there means the ticket is already terminal + (done/closed/dropped/superseded) — stop and report "ticket already + archived", not "no such ticket". No match in either file: stop and report + "no such ticket". + +Either way this is a plain input error, resolved by the caller supplying a +valid, currently-active ticket id, not something to deliberate about. 
+ +**Consistency check**: once the ticket id resolves to an active `BACKLOG.md` +heading, read its full body (`grep -n '^## '` then `sed -n` the +section — do not full-file Read). Extract `Problem` / `Requirement` / +`Dependencies`. - **Dependencies**: for every ticket referenced in `Dependencies`, confirm its status in `BACKLOG.md`'s index table (or `BACKLOG-ARCHIVE.md` if terminal) diff --git a/scripts/test-commands.sh b/scripts/test-commands.sh index fb95fab..c12cd36 100755 --- a/scripts/test-commands.sh +++ b/scripts/test-commands.sh @@ -384,9 +384,11 @@ should_run "ship: conflict stops before branching or implementing" && assert_fil should_run "ship: keeps DECISIONS.md out of dispatch briefs" && assert_file_contains "ship: keeps DECISIONS.md out of dispatch briefs" "$SHIP" "do not paste it into any dispatch brief" && pass "ship: keeps DECISIONS.md out of dispatch briefs" # ticket-id validation: empty / malformed / nonexistent must fail fast, distinct from the discussion stop should_run "ship: validates ticket id before any other step" && assert_file_contains "ship: validates ticket id before any other step" "$SHIP" "Ticket-id validation" && pass "ship: validates ticket id before any other step" -should_run "ship: handles empty argument" && assert_file_contains "ship: handles empty argument" "$SHIP" "empty argument / malformed shape / no such ticket" && pass "ship: handles empty argument" +should_run "ship: handles empty argument" && assert_file_contains "ship: handles empty argument" "$SHIP" "stop and report \"empty argument\"" && pass "ship: handles empty argument" should_run "ship: handles malformed ticket-id shape" && assert_file_contains "ship: handles malformed ticket-id shape" "$SHIP" "does not match this repo's ticket-id shape" && pass "ship: handles malformed ticket-id shape" -should_run "ship: handles nonexistent ticket (checks both BACKLOG and archive)" && assert_file_contains "ship: handles nonexistent ticket (checks both BACKLOG and archive)" 
"$SHIP" "BACKLOG-ARCHIVE.md" && pass "ship: handles nonexistent ticket (checks both BACKLOG and archive)" +should_run "ship: distinguishes already-archived ticket from no-such-ticket" && assert_file_contains "ship: distinguishes already-archived ticket from no-such-ticket" "$SHIP" "the ticket is already terminal" && pass "ship: distinguishes already-archived ticket from no-such-ticket" +should_run "ship: consistency check only ever reads active BACKLOG.md" && assert_file_contains "ship: consistency check only ever reads active BACKLOG.md" "$SHIP" "resolves to an active \`BACKLOG.md\`" && pass "ship: consistency check only ever reads active BACKLOG.md" +should_run "ship: BACKLOG-ARCHIVE.md is error-message-only, never a source to implement from" && assert_file_contains "ship: BACKLOG-ARCHIVE.md is error-message-only, never a source to implement from" "$SHIP" "never as a source to" && pass "ship: BACKLOG-ARCHIVE.md is error-message-only, never a source to implement from" should_run "ship: distinguishes fail-fast validation from the discussion stop" && assert_file_contains "ship: distinguishes fail-fast validation from the discussion stop" "$SHIP" "not a discussion point" && pass "ship: distinguishes fail-fast validation from the discussion stop" # dirty-tree precondition is deterministic fail-safe, not a second ask path should_run "ship: dirty tree is stashed automatically, not asked about" && assert_file_contains "ship: dirty tree is stashed automatically, not asked about" "$SHIP" "git stash -u" && pass "ship: dirty tree is stashed automatically, not asked about" @@ -436,6 +438,9 @@ should_run "ship: opens PR via gh pr create" && assert_file_contains "ship: open should_run "ship: PR body template records gate rounds and verdict" && assert_file_contains "ship: PR body template records gate rounds and verdict" "$SHIP" "Final verdict" && pass "ship: PR body template records gate rounds and verdict" should_run "ship: GO is not merge authorization" && 
assert_file_contains "ship: GO is not merge authorization" "$SHIP" "GO is not merge authorization" && pass "ship: GO is not merge authorization" should_run "ship: no CC ticket references" && assert_not_contains "ship: no CC ticket references" "$SHIP" "CC-[0-9]" +# git publication path: branch creation and push are the only side effects before PR creation +should_run "ship: creates the feature branch via git checkout -b" && assert_file_contains "ship: creates the feature branch via git checkout -b" "$SHIP" "git checkout -b feat/" && pass "ship: creates the feature branch via git checkout -b" +should_run "ship: pushes the branch before opening the PR" && assert_file_contains "ship: pushes the branch before opening the PR" "$SHIP" "git push -u origin feat/" && pass "ship: pushes the branch before opening the PR" # ── summary ────────────────────────────────────────────────────────────────── From 097c7e3177889138083b959c711ac512ff61bc0e Mon Sep 17 00:00:00 2001 From: screenleon Date: Thu, 2 Jul 2026 17:36:44 +0900 Subject: [PATCH 6/7] =?UTF-8?q?polish(CC-439):=20address=20round-5=20GO=20?= =?UTF-8?q?advisories=20=E2=80=94=20no=20auto-stash,=20scope=20stop-count?= =?UTF-8?q?=20test?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit gate-20260702-083147-97804f returned GO with two critic/qa-tester advisories: - Dirty-tree handling no longer auto-stashes (a surprising repo mutation); it now fails fast and asks the caller to clean the tree themselves, same fail-fast bucket as ticket-id validation, not the negotiated stop. - The "exactly two stop conditions" test is now scoped to the "Stop the loop only when" section instead of counting numbered items file-wide. 
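The effect of the scoping change can be reproduced on a throwaway file. This is a minimal sketch: the sample text and the helper name `count_stop_cases` are made up for the demo; only the awk program mirrors the one used in the test below.

```shell
# Hypothetical helper (name is illustrative): count numbered items that sit
# between the "Stop the loop only when" marker and the next "## " heading.
count_stop_cases() {
  awk '/^\*\*Stop the loop only when\*\*:/{in_sec=1; next}
       in_sec && /^## /{exit}
       in_sec && /^[0-9]+\. /{c++}
       END{print c+0}' "$1"
}

# Made-up sample: numbered items both outside and inside the section.
sample=$(mktemp)
cat > "$sample" <<'EOF'
## Step 2
1. an unrelated numbered item

**Stop the loop only when**:
1. first stop case
2. second stop case

## Step 4
1. another unrelated item
EOF

file_wide=$(grep -cE '^[0-9]+\. ' "$sample")   # old approach: whole file
scoped=$(count_stop_cases "$sample")            # new approach: one section
echo "file-wide: $file_wide, scoped: $scoped"
rm -f "$sample"
```

On this sample the file-wide count is 4 while the scoped count is 2, so a numbered list added anywhere else in `commands/ship.md` no longer trips the "exactly two" assertion.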
--- commands/ship.md | 28 +++++++++++++++++----------- scripts/test-commands.sh | 8 ++++---- 2 files changed, 21 insertions(+), 15 deletions(-) diff --git a/commands/ship.md b/commands/ship.md index 15f6628..b4a7af8 100644 --- a/commands/ship.md +++ b/commands/ship.md @@ -75,12 +75,16 @@ If clear: continue to Step 1. ## Step 1 — Branch -**Dirty-tree precondition** (fail-safe, not a discussion point): run `git -status` first. If the tree is dirty with changes unrelated to this ticket, -`git stash -u` before branching (note the stash in the Step 5 report so it's -easy to recover) — never branch over uncommitted work silently, and never -stop to ask about it, since stashing is reversible and there is nothing to -deliberate. +**Dirty-tree precondition** (fail fast, not a discussion point — same bucket +as Step 0's ticket-id validation): run `git status` first. If the tree is +dirty with changes unrelated to this ticket, stop immediately and report that +the tree must be clean before `/ship` will branch — do not stash, commit, or +otherwise mutate the caller's uncommitted work on their behalf. This has one +predetermined resolution (the caller commits or stashes it themselves and +re-invokes `/ship`), so it is not the negotiated stop this command reserves +for genuine ambiguity, and it is not an automatic mutation either — never +branch over uncommitted work silently, and never take a repo-mutating action +the caller did not ask for. ```bash git checkout -b feat/ @@ -157,9 +161,11 @@ explicitly says so. 
## Step 5 — Close-out report -Report one of three outcomes: (1) invalid ticket id — the exact problem -(empty argument / malformed shape / no such ticket); (2) consistency-check +Report one of four outcomes: (1) invalid ticket id — the exact problem +(empty argument / malformed shape / no such ticket / already archived); (2) +dirty tree — that `/ship` aborted without touching the caller's uncommitted +work, and that a clean tree is required to re-invoke; (3) consistency-check stop — the ticket id, the conflicting `DECISIONS.md` entry or unmet -dependency, and what decision is needed from the user; or (3) PR opened — -ticket id, what changed, how many gate rounds it took, the final verdict, the -PR URL, and whether Step 1 stashed pre-existing changes. +dependency, and what decision is needed from the user; or (4) PR opened — +ticket id, what changed, how many gate rounds it took, the final verdict, and +the PR URL. diff --git a/scripts/test-commands.sh b/scripts/test-commands.sh index c12cd36..ea1bdc5 100755 --- a/scripts/test-commands.sh +++ b/scripts/test-commands.sh @@ -391,8 +391,8 @@ should_run "ship: consistency check only ever reads active BACKLOG.md" && assert should_run "ship: BACKLOG-ARCHIVE.md is error-message-only, never a source to implement from" && assert_file_contains "ship: BACKLOG-ARCHIVE.md is error-message-only, never a source to implement from" "$SHIP" "never as a source to" && pass "ship: BACKLOG-ARCHIVE.md is error-message-only, never a source to implement from" should_run "ship: distinguishes fail-fast validation from the discussion stop" && assert_file_contains "ship: distinguishes fail-fast validation from the discussion stop" "$SHIP" "not a discussion point" && pass "ship: distinguishes fail-fast validation from the discussion stop" # dirty-tree precondition is deterministic fail-safe, not a second ask path -should_run "ship: dirty tree is stashed automatically, not asked about" && assert_file_contains "ship: dirty tree is stashed 
automatically, not asked about" "$SHIP" "git stash -u" && pass "ship: dirty tree is stashed automatically, not asked about" -should_run "ship: never stops to ask about a dirty tree" && assert_file_contains "ship: never stops to ask about a dirty tree" "$SHIP" "stop to ask about it" && pass "ship: never stops to ask about a dirty tree" +should_run "ship: dirty tree aborts fail-fast, does not auto-mutate" && assert_file_contains "ship: dirty tree aborts fail-fast, does not auto-mutate" "$SHIP" "do not stash, commit, or" && pass "ship: dirty tree aborts fail-fast, does not auto-mutate" +should_run "ship: dirty-tree abort is not the negotiated stop" && assert_file_contains "ship: dirty-tree abort is not the negotiated stop" "$SHIP" "not the negotiated stop this command reserves" && pass "ship: dirty-tree abort is not the negotiated stop" # implementation stays main-thread, not dispatched should_run "ship: implementation is not dispatched to an executor" && assert_file_contains "ship: implementation is not dispatched to an executor" "$SHIP" "to codex/claude/opencode" && pass "ship: implementation is not dispatched to an executor" # gate loop contract @@ -424,11 +424,11 @@ if should_run "ship: exactly one genuine wait-for-user-direction path"; then fi fi if should_run "ship: stop-condition list has exactly two numbered cases"; then - ship_stop_count=$(grep -cE '^[0-9]+\. ' "$SHIP") + ship_stop_count=$(awk '/^\*\*Stop the loop only when\*\*:/{in_sec=1; next} in_sec && /^## /{exit} in_sec && /^[0-9]+\. 
/{c++} END{print c+0}' "$SHIP") if [[ "$ship_stop_count" -eq 2 ]]; then pass "ship: stop-condition list has exactly two numbered cases" else - fail "ship: stop-condition list has exactly two numbered cases" "expected exactly 2 top-level numbered items, found $ship_stop_count in $SHIP" + fail "ship: stop-condition list has exactly two numbered cases" "expected exactly 2 numbered items in the 'Stop the loop only when' section, found $ship_stop_count in $SHIP" fi fi should_run "ship: round count alone is not a stop signal" && assert_file_contains "ship: round count alone is not a stop signal" "$SHIP" "this is taking many rounds" && pass "ship: round count alone is not a stop signal" From 6b217596e6a332425d2415cd2442e93e8f0491e8 Mon Sep 17 00:00:00 2001 From: screenleon Date: Thu, 2 Jul 2026 17:41:12 +0900 Subject: [PATCH 7/7] test(CC-439): pin Step 5 close-out report's four named outcomes Addresses gate-20260702-083655-f40009 critic advise + qa-tester block: Step 5's close-out contract (introduced while fixing round-5 advisories) had no adjacent test, so it could silently erode. Added direct assertions for each of the four named outcomes: invalid ticket id, dirty-tree abort, consistency-check stop, and PR opened with URL. 
---
 scripts/test-commands.sh | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/scripts/test-commands.sh b/scripts/test-commands.sh
index ea1bdc5..44bafa7 100755
--- a/scripts/test-commands.sh
+++ b/scripts/test-commands.sh
@@ -441,6 +441,13 @@ should_run "ship: no CC ticket references" && assert_not_contains "ship: no CC t
 # git publication path: branch creation and push are the only side effects before PR creation
 should_run "ship: creates the feature branch via git checkout -b" && assert_file_contains "ship: creates the feature branch via git checkout -b" "$SHIP" "git checkout -b feat/" && pass "ship: creates the feature branch via git checkout -b"
 should_run "ship: pushes the branch before opening the PR" && assert_file_contains "ship: pushes the branch before opening the PR" "$SHIP" "git push -u origin feat/" && pass "ship: pushes the branch before opening the PR"
+# Step 5 close-out report: pins all four named outcomes so the reporting contract cannot silently erode
+should_run "ship: has Step 5 close-out report" && assert_file_contains "ship: has Step 5 close-out report" "$SHIP" "## Step 5 — Close-out report" && pass "ship: has Step 5 close-out report"
+should_run "ship: close-out report has exactly four named outcomes" && assert_file_contains "ship: close-out report has exactly four named outcomes" "$SHIP" "Report one of four outcomes" && pass "ship: close-out report has exactly four named outcomes"
+should_run "ship: close-out outcome 1 covers invalid ticket id" && assert_file_contains "ship: close-out outcome 1 covers invalid ticket id" "$SHIP" "(1) invalid ticket id" && pass "ship: close-out outcome 1 covers invalid ticket id"
+should_run "ship: close-out outcome 2 covers dirty-tree abort" && assert_file_contains "ship: close-out outcome 2 covers dirty-tree abort" "$SHIP" "(2)" && assert_file_contains "ship: close-out outcome 2 covers dirty-tree abort" "$SHIP" "dirty tree — that \`/ship\` aborted" && pass "ship: close-out outcome 2 covers dirty-tree abort"
+should_run "ship: close-out outcome 3 covers consistency-check stop" && assert_file_contains "ship: close-out outcome 3 covers consistency-check stop" "$SHIP" "(3) consistency-check" && pass "ship: close-out outcome 3 covers consistency-check stop"
+should_run "ship: close-out outcome 4 covers PR opened with URL" && assert_file_contains "ship: close-out outcome 4 covers PR opened with URL" "$SHIP" "or (4) PR opened" && assert_file_contains "ship: close-out outcome 4 covers PR opened with URL" "$SHIP" "the PR URL." && pass "ship: close-out outcome 4 covers PR opened with URL"
 
 # ── summary ──────────────────────────────────────────────────────────────────