Skip to content

Field retrospective: PulseEngine skills over a long cross-repo session — strengths, the requirements↔verification gap, and two proposed skills #92

Description

@avrabe

A consumer-side retrospective from a long agent session driving the gale BYO-OS / gust dissolve work, which fanned out across gale, synth, meld, wit-bindgen, kiln, rivet. Written to improve the skills, grounded in what actually happened.

What worked well

  • issue-hunt — the watermark + pending_gates + self-echo filter is the right shape. The watermark keeps passes incremental; pending_gates (carry a deferred merge across passes with action_on_green) made "merge-when-green" reliable across many PRs without dropping any. The self-echo filter (don't re-triage your own comments) is essential and worked.
  • oracle-gate-a-change — the single highest-value discipline. "Report the actual exit code, never a grepped all-green" caught real things this session: a bazel test exit-1 that was a local Rocq toolchain failure (smpl.Smpl), not a regression; repeated ENOSPC CI reds (runner disk, not code); and — when I held to "re-derive from the live artifact" — a factual slip in my own issue comment (a stale .o reported bss=65536; the current artifact was 1,048,720). Green-or-explain is doing its job.
  • pulseengine-feature-loop — the rivet → oracle → clean-room → PR spine gave good structure; the cold clean-room subagent is a nice independent check.
  • Cross-repo root-causing did converge correctly (synth → meld → wit-bindgen), but only because it was driven by hand (see Gap 2).

Gap 1 — requirements ≫ verification mapping (you flagged this; here's the evidence)

rivet validate on gale today:

  • 927 artifacts, ~428 requirements (269 sw-req + 50 system-req + 109 stakeholder-req).
  • 311 "missing verification" warnings257 of 269 sw-reqs (95%) have no unit-verification + sw-integration-verification linked; 46 system-reqs miss sys-integration-verification.
  • Verification artifacts do exist (98 sw-verification + 58 unit-verification + 37 sw-integration-verification + 40 sys-verification) — but they cover a minority, and none I could find link to the actual runnable oracle (the tests/differential_*.rs, Kani harnesses, Verus proofs that genuinely run). So the chain requirement → verification artifact → real test is broken at both hops for most requirements.

Does the tooling allow the mapping? Yes — the schema has the verification types + verifies/satisfied-by links. But the workflow never binds it continuously. rivet validate's missing-verification is a non-blocking warning, so requirements accumulate un-verified; the only place it's enforced is release-execution's traceability-completeness gate — i.e. at release, far too late, after 257 have piled up. The feature-loop's step 3 ("write requirements/decisions/tests") is easy to satisfy with the requirement and skip the test — which is exactly what happened (I added SAC-/FIND- artifacts without a verifying test bound to a runnable oracle).

Proposal: a bind-verification skill/step (or a hard sub-gate inside oracle-gate-a-change): for each new/approved requirement, require a verifies-linked verification artifact whose source points at the concrete oracle that ran (test path / Kani harness / Verus obligation / nm check), and surface the missing-verification count as a tracked debt with a burn-down, not a silent warning. The oracle the loop already runs is the verification — the gap is just recording the link, continuously.

Gap 2 — there is no cross-repo coordination / root-cause-routing skill

The dominant activity this session had no skill: take a symptom in gale, root-cause it to the owning layer, file at the right repo, forward upstream, cross-link, and keep a consumer tracker distinct from the upstream fix. Concretely the memory.grow chain became: gale#89 (consumer) ← meld#298 (fusion) ← wit-bindgen#6/#4 (root: the binding emits the allocator) — after first being mis-filed on synth (closed as wrong layer). That routing ("codegen→synth, fusion→meld, bindings→wit-bindgen") is real knowledge that lived only in my head and the user's corrections.

Proposal: a cross-repo-trace (a.k.a. upstream-root-cause) skill: a layer map for the toolchain, a rule to file at the owning layer + open/maintain a consumer tracker that links to it, forward-and-cross-link on redirect, and — the lesson the user kept teaching — pose the design question to the owning layer before building (synth#404 → redirected to meld#300, decided there, then implement). It would also encode "don't assume the layer; the first guess is often one layer too shallow (synth, when it's really meld, when it's really wit-bindgen)."

How instruction/adjustment actually works (you asked)

The corrections that steered me best were terse, high-signal, and about routing or sequencing, not content:

  • Routing: "git it to meld", "redirected to meld 300 as this is needed first" — you consistently knew the owning layer before I did.
  • Sequence: "shouldn't we start with a design question to synth" — pose the question upstream before writing code.
  • Persona model: "it is always me just with the different AI agent personas" — the cross-repo dialogue under one account is deliberate; once I knew that I stopped over-hedging about 'self-commenting'.
  • Rigor: "slow down on public-comment claims" after factual slips — which led to the re-derive-from-live-artifact rule that then caught the next slip.

What I'd bake into the operating contract from this: (1) route to the owning layer early (ask "whose layer is this?" before filing/building); (2) design-question-upstream-first for anything touching another tool; (3) re-derive numeric/layout claims from the live artifact before any public comment. The first two are exactly what a cross-repo-trace skill would carry.

Net

The skills are strong where they're grounded (oracle-gate) and incremental (issue-hunt). The two missing pieces are continuous requirement↔verification binding and a cross-repo coordination skill — both showed up as the friction I hit most. Happy to prototype either as a skill PR if useful.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions