Problem
When the LLM applies an edit and tests fail, it spirals into a verbose debug loop: read → edit → test → fail → re-read → re-edit → test → pass. This costs 3-5 turns and thousands of output tokens.
Proposed Solution
Intercept edit tool calls in gate.js, apply the change through the daemon, run cargo test automatically, and only keep the edit if tests pass. If tests fail, revert immediately and return a compressed error message. The LLM never sees the failure spiral.
Implementation Status (built, needs hardening)
Working:
crates/reliary-agent/src/heal.rs — heal_edit() applies content, runs cargo test, reverts on failure
crates/reliary-agent/src/daemon.rs — apply-edit endpoint (reads new content from temp file, delegates to heal)
pi/gate.js — intercepts edit tool calls via tool_call hook, applies old→new replacement, writes to temp file, calls daemon
Gaps before production:
Verification
Last confirmed working at commit 02d3a49. Test: LLM fixes zone.rs bug, daemon applies + tests + returns "OK: tests pass".
Problem
When the LLM applies an edit and tests fail, it spirals into a verbose debug loop: read → edit → test → fail → re-read → re-edit → test → pass. This costs 3-5 turns and thousands of output tokens.
Proposed Solution
Intercept
edittool calls in gate.js, apply the change through the daemon, runcargo testautomatically, and only keep the edit if tests pass. If tests fail, revert immediately and return a compressed error message. The LLM never sees the failure spiral.Implementation Status (built, needs hardening)
Working:
crates/reliary-agent/src/heal.rs—heal_edit()applies content, runscargo test, reverts on failurecrates/reliary-agent/src/daemon.rs—apply-editendpoint (reads new content from temp file, delegates to heal)pi/gate.js— interceptsedittool calls viatool_callhook, applies old→new replacement, writes to temp file, calls daemonGaps before production:
edits[0], silently drops remaining editssed -i~60% of the time, bypasses hookprocess.cwd(), should walk up for project root markers (Cargo.toml, pyproject.toml, package.json)cargo test --quiet, no Python/JS/Go/Make supportVerification
Last confirmed working at commit
02d3a49. Test: LLM fixeszone.rsbug, daemon applies + tests + returns "OK: tests pass".