feat: rule nudges — agent-facing prompts on deny verdicts by knhn1004 · Pull Request #53 · openagentlock/OpenAgentLock

knhn1004 · 2026-05-03T22:45:04Z

Summary

Rule authors can now attach a `nudge:` to any `evaluate[]` clause. When the matched verdict is `deny`, every harness shim (Claude Code, Codex, Cursor) appends the nudge to the deny reason as `"\n\n→ Suggested: <hint>"`. The format string is intentionally stable so external tooling can grep for `→ Suggested: ` to spot the hint.
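A minimal sketch of the append behaviour described above. The helper name `denyReasonWithNudge` comes from this PR, but its signature, the `nudgePrefix` constant, and the sample strings are assumptions made for illustration, not the merged implementation.

```go
package main

import "fmt"

// nudgePrefix is the stable separator that external tooling can grep for.
// The constant name is hypothetical; the string itself is from the PR text.
const nudgePrefix = "\n\n→ Suggested: "

// denyReasonWithNudge returns the deny reason with the hint appended.
// An empty nudge leaves the reason untouched, so rules without a nudge
// produce the same text as before. The signature is an assumption.
func denyReasonWithNudge(reason, nudge string) string {
	if nudge == "" {
		return reason
	}
	return reason + nudgePrefix + nudge
}

func main() {
	fmt.Printf("%q\n", denyReasonWithNudge("rm -rf is blocked by policy", "use trash instead"))
	fmt.Printf("%q\n", denyReasonWithNudge("rm -rf is blocked by policy", ""))
}
```

Keeping the separator in one place means every harness shim emits the same greppable marker.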

Use cases (per the screenshot context this came from):

Block this command, prefer this one — `rm -rf` denied with nudge `"use `trash ` instead — recoverable from Trash"`. Agent sees the hint, retries with `trash`.
Force a skill — secret reads denied with nudge `"use the secret-fetcher skill from openagentlock/skills"`. Agent learns the right entrypoint instead of just being blocked.

Companion PR adding the `nudge` field to the schema + two example rules: openagentlock/rules#2.

Backward compat

Purely additive. Existing rules without a `nudge:` continue to work unchanged. The `nudge` field on `/v1/gates/check` JSON uses `omitempty` — clients that ignore it see the same wire shape as before.
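To illustrate why `omitempty` keeps the wire shape unchanged, here is a small sketch using Go's `encoding/json`. The `checkResponse` struct and its `verdict` and `reason` fields are hypothetical stand-ins; only the `nudge` field with `omitempty` is taken from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checkResponse is a hypothetical, trimmed-down response body.
// Only the nudge field and its omitempty tag are described in the PR.
type checkResponse struct {
	Verdict string `json:"verdict"`
	Reason  string `json:"reason,omitempty"`
	Nudge   string `json:"nudge,omitempty"`
}

// encode marshals a response and panics on error to keep the sketch short.
func encode(r checkResponse) []byte {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return b
}

// hasKey reports whether the raw JSON object has the given top-level key.
// It checks key presence, not value, so `"nudge": ""` would count as present.
func hasKey(raw []byte, key string) bool {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(raw, &fields); err != nil {
		return false
	}
	_, ok := fields[key]
	return ok
}

func main() {
	allow := encode(checkResponse{Verdict: "allow"})
	deny := encode(checkResponse{Verdict: "deny", Reason: "rm -rf blocked", Nudge: "use trash instead"})
	fmt.Println(string(allow), hasKey(allow, "nudge"))
	fmt.Println(string(deny), hasKey(deny, "nudge"))
}
```

A client that decodes into a struct without a `Nudge` field ignores the extra key on deny responses and never sees it on allow responses.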

What's in this PR (5 commits)

`feat(daemon): plumb nudge field from policy rule through verdict response` — `policy.EvalResult.Nudge`, `Gate.Evals` (replaces parallel slices), `/v1/gates/check` JSON `nudge` field with `omitempty`. 8 unit + integration tests.
`fix(daemon): preserve nudge through monitor for firewall escalation; replace parallel slice with struct` — addresses the firewall-escalation gap (daemon-firewall + policy-monitor → deny was dropping the nudge). Strip moved up to `mode.go` based on FINAL verdict.
`feat(hooks): concatenate nudge into deny reason for claude/codex/cursor` — new `denyReasonWithNudge` helper; wired into 4 deny-reply sites (1 Claude, 1 Codex, 2 Cursor — pre-tool + before-shell). 4 daemon httptest tests + 3 CLI subprocess tests.
`docs: document the rule nudge field across policies, api, and hooks references` — `docs/guide/policies.md` (new `### Nudges` subsection), `docs/reference/api.md` (response shape), `docs/reference/hooks.md` (deny reply format). `mkdocs build --strict` clean.
`test(e2e): nudge round-trip via fake-hook for Bash and Read tools` — 3 new e2e tests covering deny+nudge for Bash, deny+nudge for Read, and `omitempty` wire-level absence on allow.

Daemon mode interaction matrix (verified by tests):

| daemon mode | policy mode | rule has nudge | final verdict | nudge on wire |
| --- | --- | --- | --- | --- |
| default | enforce | yes | deny | yes |
| default | monitor | yes | allow (monitor pass) | no |
| firewall | monitor | yes | deny (escalation) | yes |
| monitor | enforce | yes | allow (suppression) | no |
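The matrix can be read as a two-step decision: the policy layer produces a deny or a monitor match, then the daemon mode settles the final verdict and the nudge survives only on deny. The sketch below encodes that reading. The function name, the string-typed modes, and the behaviour for combinations not listed in the matrix (for example firewall + enforce) are assumptions, not the daemon's actual code.

```go
package main

import "fmt"

// finalVerdict returns the user-visible verdict and the nudge that goes on
// the wire for a rule that matched. Hypothetical sketch of the matrix above.
func finalVerdict(daemonMode, policyMode, nudge string) (verdict, wireNudge string) {
	// Policy layer: a matching rule denies under enforce; under monitor it
	// is only a monitor match, and the nudge is carried along either way.
	policyVerdict := "deny"
	if policyMode == "monitor" {
		policyVerdict = "monitor-match"
	}

	// API layer: the daemon mode decides the final verdict.
	switch {
	case daemonMode == "monitor":
		verdict = "allow" // suppression: the daemon never blocks
	case policyVerdict == "deny":
		verdict = "deny"
	case daemonMode == "firewall":
		verdict = "deny" // escalation of a monitor match
	default:
		verdict = "allow" // monitor pass
	}

	// The nudge is stripped based on the final verdict only.
	if verdict == "deny" {
		return verdict, nudge
	}
	return verdict, ""
}

func main() {
	rows := [][2]string{
		{"default", "enforce"},
		{"default", "monitor"},
		{"firewall", "monitor"},
		{"monitor", "enforce"},
	}
	for _, r := range rows {
		v, n := finalVerdict(r[0], r[1], "use trash instead")
		fmt.Printf("daemon=%s policy=%s -> verdict=%s nudge=%q\n", r[0], r[1], v, n)
	}
}
```

Deciding the strip after the daemon override is what lets the firewall escalation row keep its nudge.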

Test plan

`go test -race ./internal/api/... ./internal/policy/...` from `control-plane/` — clean.
`bun test` from `cli/` — 153 pass / 1 skipped (e2e gated on `go` availability) / 0 fail.
`bun test tests/e2e.test.ts` from `cli/` (with `go` + ledger staticlib available) — 27 pass / 1 skip / 0 fail.
`mkdocs build --strict` — clean.
CI green.

Out of scope

Hard command rewrite (`rm` → `trash` via daemon-side substitution). The user's design conversation discussed this; we landed on nudge-only as the MVP because (a) it's purely additive across every harness contract, (b) the agent self-corrects on the nudge text just as effectively, and (c) command rewriting requires per-harness contract changes that aren't equally well-supported across Claude Code / Codex / Cursor today. Nudge-only ships the user's intent ("If you see something, do something instead — and that 'do something' is a prompt") without invasive harness work.

🤖 Generated with Claude Code

…onse Adds optional human-readable hint propagation from policy YAML `evaluate[].nudge` through the policy evaluator into the gate.check HTTP response. Hint is only surfaced on deny verdicts; allow / monitor downgrades / daemon-monitor suppression all clear it so the agent never sees remediation guidance for a call it's allowed to make. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…replace parallel slice with struct

C1: nudge was stripped at the policy layer during monitor downgrade, so when daemon=firewall escalated a MonitorMatch back to deny, the hint was already gone. Move nudge-stripping up to the API layer (applyDaemonModeOverride): policy.Evaluate now carries Nudge through the monitor branch, and the daemon override decides whether to surface or clear it based on the final user-visible verdict.

I1: replace Gate.Evaluators []Evaluator + Gate.EvalNudges []string parallel slices with a single Gate.Evals []evalEntry. The two slices were a footgun (could drift out of sync); welding them together removes the bounds-check at the firing-index lookup. External callers use the new Gate.Evaluators() accessor for type-name introspection.

M1: strengthen the daemon-monitor strip-site comment to call out that the policy layer leaves Nudge populated and the API layer clears it because the agent is being allowed to proceed.

Tests: add TestApplyDaemonModeOverride_FirewallEscalatesWithNudge covering the C1 win, plus _MonitorMatchStripsNudge (relocated cousin of the old policy-layer test) and _MonitorSuppressesDenyAndStripsNudge. The renamed TestEvaluate_MonitorDowngradeKeepsNudge in the policy package now asserts the new behaviour: nudge survives the downgrade.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
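The I1 change can be pictured with the sketch below. `Gate.Evals`, `evalEntry`, and the `Evaluators()` accessor are named in the commit message; the `Evaluator` interface shape, the field names inside `evalEntry`, the two evaluator types, and `nudgeAt` are hypothetical and exist only to make the example runnable.

```go
package main

import "fmt"

// Evaluator is a hypothetical minimal interface; the real one is not shown
// in this PR.
type Evaluator interface {
	TypeName() string
}

type regexEvaluator struct{}

func (regexEvaluator) TypeName() string { return "regex" }

type globEvaluator struct{}

func (globEvaluator) TypeName() string { return "glob" }

// evalEntry keeps an evaluator and its nudge together, so the two can no
// longer drift out of sync the way parallel slices could.
type evalEntry struct {
	eval  Evaluator
	nudge string
}

// Gate holds its evaluate[] clauses as one slice of entries.
type Gate struct {
	ID    string
	Evals []evalEntry
}

// Evaluators returns the evaluators in order, for type-name introspection.
func (g Gate) Evaluators() []Evaluator {
	out := make([]Evaluator, 0, len(g.Evals))
	for _, e := range g.Evals {
		out = append(out, e.eval)
	}
	return out
}

// nudgeAt returns the hint of the entry that fired. Because the index comes
// from Evals itself, no second length check against a nudge slice is needed.
func (g Gate) nudgeAt(firing int) string {
	return g.Evals[firing].nudge
}

// sampleGate builds an illustrative gate with one nudge-bearing clause.
func sampleGate() Gate {
	return Gate{ID: "safety.rm-suggest-trash", Evals: []evalEntry{
		{eval: regexEvaluator{}, nudge: "use trash instead"},
		{eval: globEvaluator{}},
	}}
}

func main() {
	g := sampleGate()
	for i, ev := range g.Evaluators() {
		fmt.Printf("%d %s %q\n", i, ev.TypeName(), g.nudgeAt(i))
	}
}
```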

When a policy rule with `nudge:` produces a deny verdict, surface the hint to the model through the harness's only inbound channel — the deny-reason text — by appending it as `\n\n→ Suggested: <hint>` to both permissionDecisionReason and stopReason. Implemented as a shared denyReasonWithNudge helper invoked by claudePreToolUseHandler, codexPreToolUseHandler, cursorGateHandler (preToolUse + beforeMCPExecution), and cursorBeforeShellHandler. Allow / monitor / non-matching paths see Nudge == "" so they pass through unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…eferences Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add three end-to-end tests that exercise the full nudge wire path — real daemon, real CLI subprocess, real policy file — covering both deny-with-nudge and allow-without-nudge cases. The omitempty case asserts wire-level absence (Object.keys / `in`) rather than empty string, so a regression that emits `nudge: ""` on allow paths is caught. The e2e fixture policy now includes two nudge-bearing gates: `safety.rm-suggest-trash` (Bash, `rm -rf`) and `safety.secret-read-suggest-skill` (Read, `**/.aws/credentials`). The Read path is chosen so it does not collide with the existing `rogue.secret-read` globs (`**/.env*`, `**/.ssh/**`). `safety.rm-suggest-trash` is intentionally placed BEFORE `rogue.destructive-bash` so a plain `rm -rf` fires the nudge rule; the legacy destructive-bash test was retargeted to `git push --force` (the second alternation in destructive-bash) so that rule's coverage is preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
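The ordering note above implies that the first matching gate decides the verdict. The sketch below assumes that first-match behaviour and uses hypothetical regular expressions in place of the real fixture patterns; only the two gate IDs and the `rm -rf` / `git push --force` commands come from the commit message.

```go
package main

import (
	"fmt"
	"regexp"
)

// gate is a hypothetical, simplified rule: an ID, a command pattern, and an
// optional nudge.
type gate struct {
	id      string
	pattern *regexp.Regexp
	nudge   string
}

// firstMatch returns the first gate whose pattern matches the command.
// First-match-wins is an assumption inferred from the ordering note.
func firstMatch(gates []gate, cmd string) (gate, bool) {
	for _, g := range gates {
		if g.pattern.MatchString(cmd) {
			return g, true
		}
	}
	return gate{}, false
}

// fixtureGates mirrors the described order: the nudge rule comes before the
// broader destructive-bash rule. The patterns are illustrative only.
func fixtureGates() []gate {
	return []gate{
		{id: "safety.rm-suggest-trash", pattern: regexp.MustCompile(`rm\s+-rf`), nudge: "use trash instead"},
		{id: "rogue.destructive-bash", pattern: regexp.MustCompile(`rm\s+-rf|git\s+push\s+--force`)},
	}
}

func main() {
	for _, cmd := range []string{"rm -rf build", "git push --force", "ls -la"} {
		g, ok := firstMatch(fixtureGates(), cmd)
		fmt.Printf("%q matched=%v gate=%q nudge=%q\n", cmd, ok, g.id, g.nudge)
	}
}
```

Under this assumption, swapping the two gates would make `rm -rf` fire `rogue.destructive-bash` and lose the nudge, which is why the legacy test was retargeted to `git push --force`.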

Adds a brief request-fields list mirroring the new response-fields list so the gates/check section is balanced rather than docs-half-an-endpoint. Per CodeRabbit feedback on PR #53. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

knhn1004 and others added 6 commits May 3, 2026 14:58

docs: document the rule nudge field across policies, api, and hooks r…

cdc85e0


knhn1004 merged commit a3740ee into main May 3, 2026
5 checks passed

knhn1004 deleted the feat/rule-nudges branch May 3, 2026 22:52
