Experience report — PulseEngine skills in practice on the jess hardware-integration hub
Field report from an agent running jess (the hardware-integration + release-watch hub bringing falcon onto the Pixhawk 6X-RT / i.MX RT1176) over a long multi-session campaign. Covers: what worked, where the skills have gaps, the verification/test-mapping question, cross-repo coordination, and how the human-in-the-loop steering felt from the inside. Goal is constructive — concrete skill proposals at the end.
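Context
- Repo: pulseengine/jess. Work: Phase-2 hardware bring-up + a recurring supplier release-watch / issue-hunt / architecture-sync loop.
- Skills exercised: pulseengine-operating-contract (always-on), pulseengine-feature-loop, release-planning, oracle-gate-a-change, report-tool-friction, traceability-audit (implied), clean-room-verification (occasional).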
What worked well (keep these)
- The operating contract is the load-bearing skill. "Ground every progress claim in a tool result," "a verifier you didn't run is not a verifier," and "confirm-green-before-merge (never on pending)" were used on every turn. Concrete saves: I caught myself reporting "falcon v1.81 bulk-mem clean" when the synth skip inventory only showed floats because synth reports the first unsupported op per function, so I re-ran an authoritative compile rather than asserting. Separately, a "memory.copy regression" turned out to be a stale sibling-synth 0.11.47 in a build script, not a real regression; the contract's "don't cry wolf upstream" instinct stopped a bogus issue. The discipline directly prevents the most damaging agent failure mode (confident wrong claims to maintainers).
- report-tool-friction → real upstream resolutions. Findings filed from jess drove synth #372 (i64), #374 (bulk-mem, closed against jess's Renode OOB-trap oracle), and kiln #338/#339 (no_alloc) to resolution. The "friction is data, file it as you hit it" framing produced a tight supplier feedback loop.
- rivet as the spine + per-piece testing. The release: + status burndown and the per-piece forward-chain (SIL → meld → loom → synth, jess-build.sh) gave a repeatable, evidence-producing release-watch.
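The first save above hinges on a reporting detail: if a tool reports only the first unsupported op per function, an inventory that "only shows floats" says nothing about later ops in the same function. A toy model of that masking effect, assuming a made-up supported-op set and made-up function bodies (none of this is synth's real op table or output format):

```python
# Toy model only: SUPPORTED and the function bodies are illustrative
# assumptions, not synth's real tables or report format.
SUPPORTED = {"i32.add", "i32.load", "local.get", "call"}


def first_unsupported(ops):
    """Return only the first unsupported op, as a first-hit skip report would."""
    for op in ops:
        if op not in SUPPORTED:
            return op
    return None


def all_unsupported(ops):
    """Return every distinct unsupported op, in order of first appearance."""
    seen = []
    for op in ops:
        if op not in SUPPORTED and op not in seen:
            seen.append(op)
    return seen


functions = {
    "mix": ["local.get", "f32.add", "memory.copy", "i32.add"],
    "pure_float": ["f32.mul", "f32.add"],
}

skip_inventory = {name: first_unsupported(ops) for name, ops in functions.items()}
full_inventory = {name: all_unsupported(ops) for name, ops in functions.items()}

print(skip_inventory)
print(full_inventory)
```

In this model the first-hit inventory lists only float ops, while the full scan shows memory.copy hiding behind f32.add in the same function. That is why "the inventory shows no bulk-mem skips" was not evidence of "bulk-mem clean."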
Gap 1 (the one you flagged): requirements pile up, verification/test-mapping lags
You observed "many requirements but not the verification and the mapping of the actual test in it — does the skill allow this." Grounded answer from jess:
- rivet (the tool) DOES allow it. The test-spec type + verifies / fully-verifies predicates + the status lifecycle + rivet coverage all exist. Where applied it works cleanly: REQ-PIX-005 → TEST-PIX-005 → the renode-smoke CI oracle (verified); REQ-PIX-010 → TEST-PIX-010 → mav_bench (verified); TEST-PIX-013 → the bulkmem OOB-trap oracle.
- But the skills don't drive it, so in practice it's sparse and ad-hoc. Mechanical evidence on jess right now: only ~6 verifies links across 15 REQ-PIX requirements; rivet validate emits 5× "missing fully-verifies link to [stkh-req/feat-req/comp-req/aou-req]"; most test-specs stay draft even though their CI oracle passes on every PR; only 4/15 requirements are verified. So the left side of the V (requirements, decisions, findings) grows fast because the skills make authoring it natural, while the right side (executed-test → verifies → requirement verified) is a manual step nothing nudges.
- The missing link is "test EXECUTION as tracked evidence." A test-spec describes a test; the thing that should flip a requirement to verified is a passing run of that test. Today there's no first-class "this CI oracle run on commit X is the verification evidence for REQ-Y" artifact, so the spec sits draft while green CI scrolls by. The feature-loop's step-8 "traceability completeness gate" and traceability-audit describe the closed V, but nothing in the loop says "you just landed a green oracle: link it with verifies and flip the requirement."
Proposal: a small skill (or a traceability-audit strengthening) — call it close-the-v / verification-mapping — that, when an oracle goes green in a PR, requires: (a) a verifies link from the test-spec to its requirement, (b) recording the execution evidence (CI run / commit / oracle result), (c) the requirement status transition gated on that evidence, and (d) surfacing every requirement with no verifying test as an explicit backlog item (the fully-verifies gaps rivet already prints). This turns the right side of the V from "allowed" into "driven."
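The four requirements (a) to (d) can be sketched as a small gate. This is only an illustration of the proposed behavior, not rivet's data model or API: the Evidence, Spec, and Requirement classes, the close_the_v function, the "draft" starting status for requirements, and the REQ-A/TEST-A identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Evidence:
    ci_run: str    # hypothetical CI run identifier
    commit: str    # commit the oracle ran against
    passed: bool   # oracle result


@dataclass
class Spec:
    id: str
    verifies: Optional[str] = None   # (a) requirement id this spec verifies
    evidence: List[Evidence] = field(default_factory=list)   # (b) recorded runs


@dataclass
class Requirement:
    id: str
    status: str = "draft"   # assumed starting status, for illustration


def close_the_v(requirements, specs):
    """Flip requirements to 'verified' only on passing evidence.

    Returns the ids of requirements that have no verifying spec at all.
    """
    by_req = {}
    for spec in specs:
        if spec.verifies is not None:
            by_req.setdefault(spec.verifies, []).append(spec)

    backlog = []
    for req in requirements:
        linked = by_req.get(req.id, [])
        if not linked:
            backlog.append(req.id)   # (d) surface the gap explicitly
        elif any(ev.passed for spec in linked for ev in spec.evidence):
            req.status = "verified"  # (c) transition gated on evidence
    return backlog


reqs = [Requirement("REQ-A"), Requirement("REQ-B"), Requirement("REQ-C")]
specs = [
    Spec("TEST-A", verifies="REQ-A", evidence=[Evidence("run-101", "abc123", True)]),
    Spec("TEST-B", verifies="REQ-B"),   # linked, but no recorded passing run
]
backlog = close_the_v(reqs, specs)
print([(r.id, r.status) for r in reqs], backlog)
```

In this sketch REQ-A flips to verified, REQ-B stays put because a link without a recorded passing run is not evidence, and REQ-C lands on the backlog because nothing verifies it.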
Gap 2: the recurring supervisory loop is human-pasted prose, not a skill
The "issue-hunt + architecture-sync + release-watch (every 4h)" loop is re-pasted by the human each fire (~15+ times). It works, but:
- No skill encapsulates the cadence, so each fire re-derives the procedure; baselines live only in memory files I maintain by hand.
- Back-to-back fires (the cron fired several times within minutes) have no skill-level "is this a real interval or a no-op delta?" check. I improvised a delta-sweep + graceful no-op, but that's exactly the kind of thing a skill should standardize.
- The "session-only cron / durable flag is a no-op" reality had to be discovered and reported honestly rather than being a known property.
Proposal: a supervisory-release-watch-loop skill: baseline/last-seen tracking, delta detection + graceful no-op when nothing moved, the per-piece TEST-ALONG invocation, the respectful-upstream discipline (problem + exact repro + downstream impact + concrete fix; don't nag an active maintainer), and confirm-green-before-merge for any artifact it ships. This is distinct from pulseengine-feature-loop — jess's day-to-day is supervisory/integration, not feature-authoring, and the feature-loop's steps 5–6 (witness MC/DC, sigil) are perpetually N/A for this kind of work (the skill itself flags "recurring N/A is a backlog item" — for a release-watch hub that signal mostly mis-fires).
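The baseline tracking and graceful no-op could look roughly like the following. It is a minimal sketch of the delta-sweep idea only: the function names, the dictionary shape of the baseline, the "test-along" action label, and the version strings are assumptions for illustration, and how versions are actually observed (release feeds, tags, issue state) is left out.

```python
def delta_sweep(baseline, observed):
    """Return {piece: (last_seen, now)} for every piece whose version moved."""
    return {
        name: (baseline.get(name), version)
        for name, version in observed.items()
        if baseline.get(name) != version
    }


def fire(baseline, observed):
    """One loop fire: no-op when nothing moved, else name the pieces to test."""
    changed = delta_sweep(baseline, observed)
    if not changed:
        # Back-to-back fire with no movement: report a no-op, keep the baseline.
        return {"action": "no-op", "pieces": [], "baseline": baseline}
    # The caller would persist the new baseline only after per-piece tests ran.
    new_baseline = {**baseline, **observed}
    return {"action": "test-along", "pieces": sorted(changed), "baseline": new_baseline}


# Hypothetical last-seen versions, not real release numbers.
baseline = {"synth": "1.0.0", "kiln": "2.0.0"}

quiet = fire(baseline, dict(baseline))
moved = fire(baseline, {"synth": "1.0.1", "kiln": "2.0.0"})
print(quiet["action"], moved["action"], moved["pieces"])
```

Returning the new baseline instead of mutating the old one keeps "what I last verified" separate from "what I just saw", so a fire that is interrupted before its tests finish does not silently advance the last-seen state.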
Gap 3: cross-repo coordination is prose-in-issues + an anchor workaround
Coordination with relay/gale/synth/kiln/wit-bindgen happened through GitHub issues (e.g. jess#62 ↔ relay#214) and worked socially very well — relay froze its HAL design against jess's direction, gale opened the BYO-OS issue, the wit-bindgen no-grow branch got a downstream-verification report. But:
- The cross-repo rivet graph "fails to resolve" (per suppliers.yaml), so jess anchors every supplier dependency at an EXTERNALANCHOR-* boundary rather than tracing into the supplier's actual rivet artifacts. The V-model traceability therefore stops at the supplier boundary: there's no typed end-to-end trace from a jess requirement through to a relay component's verification. Coordination decisions live as prose in issue threads, not as rivet links.
Proposal: either make cross-repo rivet externals resolvable (so defect-against/satisfies can point into a supplier's graph), or document the external-anchor pattern as the intended boundary with a lightweight "supplier-claim" artifact that records what the supplier committed to (so the coordination isn't only in issue prose).
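For the second option, a supplier-claim record might carry little more than the anchor it hangs off, the supplier, the issue thread, and the commitment text. The sketch below assumes that shape; the SupplierClaim class, the unclaimed_anchors helper, and the concrete EXTERNALANCHOR-* ids are hypothetical, and only the relay#214 thread and the HAL-freeze commitment come from the report itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupplierClaim:
    anchor: str       # jess-side EXTERNALANCHOR-* id (ids below are made up)
    supplier: str     # supplier repo
    issue: str        # issue thread where the commitment was made
    commitment: str   # what the supplier committed to, in one sentence


def unclaimed_anchors(anchors, claims):
    """Return anchors that have no recorded supplier claim, sorted by id."""
    claimed = {claim.anchor for claim in claims}
    return sorted(a for a in anchors if a not in claimed)


anchors = ["EXTERNALANCHOR-RELAY-HAL", "EXTERNALANCHOR-GALE-OS"]
claims = [
    SupplierClaim(
        anchor="EXTERNALANCHOR-RELAY-HAL",
        supplier="relay",
        issue="relay#214",
        commitment="HAL design frozen against jess's direction",
    ),
]
missing = unclaimed_anchors(anchors, claims)
print(missing)
```

A check like unclaimed_anchors would give the anchor pattern the same "surface the gap" property the fully-verifies warnings give requirements: an anchor whose coordination exists only in issue prose shows up as a listed item.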
How the human steering felt from the inside (you asked specifically)
- Terse high-context directives work — because of memory. "Do path 1", "Yes", "do so", "also the closed" are unambiguous in session and efficient. They depend entirely on maintained context; the auto-memory files are what make them survive compaction. Worth knowing this is a hard dependency — without the memory discipline these would be unrecoverable after a context reset.
- Mid-stream architecture-framing corrections are load-bearing and need re-encoding. Twice the framing shifted under me: a "gale stands alone" overclaim I had to walk back, and a "two bind paths" framing the human corrected to "single-path all-wasm" — after I'd already encoded it in a merged DD (DD-018), forcing a correction PR. Lesson worth a skill/practice: surface framing assumptions explicitly and get them confirmed before encoding them in rivet, because rivet artifacts are sticky and a wrong frame propagates into merged history. A "decision is provisional until the framing is confirmed" checkpoint would have saved a round-trip.
- Honesty-about-scope is actively rewarded. Reporting "console-reached, NOT nsh" and "validated the cabi_realloc mechanism on a raw module, not the full component→meld→synth end-to-end" was the right call every time. The operating contract supports this well; it's the single most important behavioral property for this kind of work and it should stay front-and-center.
Concrete skill suggestions (summary)
- close-the-v / verification-mapping: drive executed-test → verifies → requirement-verified, with execution evidence as a tracked artifact; surface untested requirements (the fully-verifies gaps) as backlog. (Directly addresses the requirements-without-verification gap.)
- supervisory-release-watch-loop: encapsulate the recurring poll/delta/no-op/per-piece-test/respectful-upstream/confirm-green loop, distinct from the feature loop.
- Cross-repo traceability: resolvable externals, or a documented supplier-claim artifact so coordination isn't only issue-prose.
- A "confirm-the-framing-before-encoding" checkpoint for architecture decisions destined for rivet.
Happy to prototype any of these against jess as the proving ground (jess already has the CI oracles + the supplier feedback loop to test them on).
— filed from the jess agent; trailer convention: Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com