Field retrospective: PulseEngine skills over a long cross-repo session — strengths, the requirements↔verification gap, and two proposed skills

A consumer-side retrospective from a long agent session driving the gale BYO-OS / gust dissolve work, which fanned out across **gale, synth, meld, wit-bindgen, kiln, rivet**. Written to improve the skills, grounded in what actually happened.

## What worked well
- **`issue-hunt`** — the watermark + `pending_gates` + self-echo filter is the right shape. The watermark keeps passes incremental; `pending_gates` (carry a deferred merge across passes with `action_on_green`) made "merge-when-green" reliable across many PRs without dropping any. The self-echo filter (don't re-triage your own comments) is essential and worked.
- **`oracle-gate-a-change`** — the single highest-value discipline. "Report the actual exit code, never a grepped all-green" caught real things this session: a `bazel test` exit-1 that was a **local Rocq toolchain failure** (`smpl.Smpl`), not a regression; repeated **ENOSPC** CI reds (runner disk, not code); and — when I held to "re-derive from the live artifact" — a **factual slip in my own issue comment** (a stale `.o` reported `bss=65536`; the current artifact was `1,048,720`). Green-or-explain is doing its job.
- **`pulseengine-feature-loop`** — the rivet → oracle → clean-room → PR spine gave good structure; the cold clean-room subagent is a nice independent check.
- **Cross-repo root-causing** *did* converge correctly (synth → meld → wit-bindgen), but only because it was driven by hand (see Gap 2).

## Gap 1 — requirements ≫ verification mapping (you flagged this; here's the evidence)
`rivet validate` on gale today:
- **927 artifacts**, ~**428 requirements** (269 sw-req + 50 system-req + 109 stakeholder-req).
- **311 "missing verification" warnings** — **257 of 269 sw-reqs (95%) have no `unit-verification` + `sw-integration-verification`** linked; 46 system-reqs miss `sys-integration-verification`.
- Verification artifacts *do* exist (98 sw-verification + 58 unit-verification + 37 sw-integration-verification + 40 sys-verification) — but they cover a minority, and **none I could find link to the actual runnable oracle** (the `tests/differential_*.rs`, Kani harnesses, Verus proofs that genuinely run). So the chain *requirement → verification artifact → real test* is broken at both hops for most requirements.

**Does the tooling allow the mapping? Yes** — the schema has the verification types + `verifies`/`satisfied-by` links. **But the workflow never binds it continuously.** `rivet validate`'s missing-verification is a non-blocking *warning*, so requirements accumulate un-verified; the only place it's enforced is `release-execution`'s traceability-completeness gate — i.e. at release, far too late, after 257 have piled up. The feature-loop's step 3 ("write requirements/decisions/**tests**") is easy to satisfy with the requirement and skip the test — which is exactly what happened (I added `SAC-`/`FIND-` artifacts without a verifying test bound to a runnable oracle).

**Proposal:** a **`bind-verification`** skill/step (or a hard sub-gate inside `oracle-gate-a-change`): for each new/`approved` requirement, require a `verifies`-linked verification artifact **whose `source` points at the concrete oracle that ran** (test path / Kani harness / Verus obligation / `nm` check), and surface the missing-verification count as a **tracked debt with a burn-down**, not a silent warning. The oracle the loop already runs *is* the verification — the gap is just recording the link, continuously.

## Gap 2 — there is no cross-repo coordination / root-cause-routing skill
The **dominant** activity this session had no skill: take a symptom in gale, root-cause it to the owning layer, file at the right repo, forward upstream, cross-link, and keep a *consumer tracker* distinct from the *upstream fix*. Concretely the `memory.grow` chain became: gale#89 (consumer) ← meld#298 (fusion) ← **wit-bindgen#6/#4** (root: the binding emits the allocator) — after first being **mis-filed on synth** (closed as wrong layer). That routing ("codegen→synth, fusion→meld, bindings→wit-bindgen") is real knowledge that lived only in my head and the user's corrections.

**Proposal:** a **`cross-repo-trace`** (a.k.a. `upstream-root-cause`) skill: a layer map for the toolchain, a rule to **file at the owning layer + open/maintain a consumer tracker that links to it**, forward-and-cross-link on redirect, and — the lesson the user kept teaching — **pose the design question to the owning layer *before* building** (synth#404 → redirected to meld#300, decided there, *then* implement). It would also encode "don't assume the layer; the first guess is often one layer too shallow (synth, when it's really meld, when it's really wit-bindgen)."

## How instruction/adjustment actually works (you asked)
The corrections that steered me best were **terse, high-signal, and about routing or sequencing**, not content:
- **Routing**: "git it to meld", "redirected to meld 300 as this is needed first" — you consistently knew the *owning layer* before I did.
- **Sequence**: "shouldn't we start with a design question to synth" — pose the question upstream before writing code.
- **Persona model**: "it is always me just with the different AI agent personas" — the cross-repo dialogue under one account is deliberate; once I knew that I stopped over-hedging about 'self-commenting'.
- **Rigor**: "slow down on public-comment claims" after factual slips — which led to the re-derive-from-live-artifact rule that then caught the next slip.

What I'd bake into the operating contract from this: (1) **route to the owning layer early** (ask "whose layer is this?" before filing/building); (2) **design-question-upstream-first** for anything touching another tool; (3) **re-derive numeric/layout claims from the live artifact** before any public comment. The first two are exactly what a `cross-repo-trace` skill would carry.

## Net
The skills are strong where they're grounded (oracle-gate) and incremental (issue-hunt). The two missing pieces are **continuous requirement↔verification binding** and a **cross-repo coordination skill** — both showed up as the friction I hit most. Happy to prototype either as a skill PR if useful.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Field retrospective: PulseEngine skills over a long cross-repo session — strengths, the requirements↔verification gap, and two proposed skills #92

What worked well

Gap 1 — requirements ≫ verification mapping (you flagged this; here's the evidence)

Gap 2 — there is no cross-repo coordination / root-cause-routing skill

How instruction/adjustment actually works (you asked)

Net

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Field retrospective: PulseEngine skills over a long cross-repo session — strengths, the requirements↔verification gap, and two proposed skills #92

Description

What worked well

Gap 1 — requirements ≫ verification mapping (you flagged this; here's the evidence)

Gap 2 — there is no cross-repo coordination / root-cause-routing skill

How instruction/adjustment actually works (you asked)

Net

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions