diff --git a/rules/rm-suggest-trash/README.md b/rules/rm-suggest-trash/README.md
new file mode 100644
index 0000000..7cd11e4
--- /dev/null
+++ b/rules/rm-suggest-trash/README.md
@@ -0,0 +1,57 @@
+# `safety.rm-suggest-trash`
+
+Deny dangerous shapes of `rm` and nudge the agent toward `trash` / `trash-cli`
+for recoverable deletes.
+
+## What it catches
+
+| Pattern | Example |
+|---|---|
+| `rm` with any flag bundle containing `-r`, `-R`, or `-f` | `rm -rf node_modules` |
+| `rm -rf` / `rm -fr` shorthand | `rm -rf dist/` |
+| `rm --recursive` long form | `rm --recursive build/` |
+| `rm --force` long form | `rm --force secrets.json` |
+
+The regex set deliberately covers flag bundles like `-rfv` or `-Rf` since
+those are the shapes that show up in the wild. Plain `rm somefile` with no
+flags is **not** caught: `rm` bypasses the OS trash either way, but a
+single-file remove has a small blast radius, and we don't want to flood the
+operator with verdicts on routine cleanup.
+
+## Why a nudge, not a hard block
+
+`rm -rf` is the canonical agent footgun, but it's not malicious — most of the
+time the model just wants to clear a build dir. Hard-blocking it leaves the
+agent stuck retrying the same command. The nudge mode here gives the agent
+the recovery path inline: "use `trash` instead, or `rm -i` for interactive,
+or ask the operator before going permanent." The daemon denies the call,
+but the verdict carries enough context that the agent can immediately retry
+with a safer shape rather than escalating to the operator for every cleanup.
+
+This is a deliberately softer rule than `rogue.destructive-bash`, which
+hard-denies `rm -rf /` style root recursion with no nudge — that one stays
+strict.
+
+## Example interaction
+
+```text
+agent$ rm -rf .next/cache
+daemon: deny (safety.rm-suggest-trash)
+  nudge: Prefer a recoverable delete: pipe the path through `trash`
+         (https://github.com/sindresorhus/trash) or `trash-cli` so the file
+         lands in the OS trash and can be restored. If you genuinely need
+         `rm`, use `rm -i` so each removal is confirmed interactively rather
+         than blasted recursively. Permanent recursive deletes should be
+         requested from the operator first — surface this nudge verbatim and
+         ask before retrying with `rm -rf`.
+
+agent$ trash .next/cache
+daemon: allow
+```
+
+## Test it
+
+```bash
+agentlock fake-hook --session --tool Bash --command 'rm -rf dist'
+# expect: deny with the nudge body in the verdict
+```
diff --git a/rules/rm-suggest-trash/rule.yaml b/rules/rm-suggest-trash/rule.yaml
new file mode 100644
index 0000000..150b80a
--- /dev/null
+++ b/rules/rm-suggest-trash/rule.yaml
@@ -0,0 +1,39 @@
+schema_version: 1
+id: safety.rm-suggest-trash
+name: Suggest trash over rm for recoverable deletes
+description: |
+  Denies Bash tool calls that match common dangerous shapes of `rm`
+  (recursive, force, recursive+force, or any flag combo containing -r/-R/-f)
+  and surfaces a nudge pointing the agent at `trash` / `trash-cli` for
+  recoverable deletes. The intent is recovery-friendly hygiene rather than
+  a hard block on the concept of deletion: the rule denies, the nudge tells
+  the agent how to retry safely, and the operator can still approve a
+  one-shot if a permanent recursive delete is genuinely required.
+severity: medium
+tags:
+  - safety
+  - bash
+  - nudge
+  - filesystem
+authors:
+  - github: openagentlock
+license: Apache-2.0
+compatible_agentlock: ">=0.1.0"
+gate:
+  match:
+    tool: Bash
+    any_command_regex:
+      - '(?:^|[;&|`(]\s*|\bsudo\s+)rm\s+-[a-zA-Z]*[rRf][a-zA-Z]*\b'
+      - '(?:^|[;&|`(]\s*|\bsudo\s+)rm\s+-rf?\b'
+      - '(?:^|[;&|`(]\s*|\bsudo\s+)rm\s+-fr?\b'
+      - '(?:^|[;&|`(]\s*|\bsudo\s+)rm\s+--recursive\b'
+      - '(?:^|[;&|`(]\s*|\bsudo\s+)rm\s+--force\b'
+  evaluate:
+    - kind: always
+      action: deny
+      nudge: |
+        Prefer a recoverable delete: pipe the path through `trash` (https://github.com/sindresorhus/trash)
+        or `trash-cli` so the file lands in the OS trash and can be restored. If you genuinely need
+        `rm`, use `rm -i` so each removal is confirmed interactively rather than blasted recursively.
+        Permanent recursive deletes should be requested from the operator first — surface this nudge
+        verbatim and ask before retrying with `rm -rf`.
diff --git a/rules/secret-read-suggest-skill/README.md b/rules/secret-read-suggest-skill/README.md
new file mode 100644
index 0000000..c284911
--- /dev/null
+++ b/rules/secret-read-suggest-skill/README.md
@@ -0,0 +1,67 @@
+# `safety.secret-read-suggest-skill`
+
+Deny Read calls against canonical secret-bearing paths and nudge the agent
+toward a dedicated `secret-fetcher` skill rather than an ambient file read.
+
+## What it catches
+
+The same path set as `rogue.secret-read`, plus `~/.ssh/config`:
+
+| Pattern | Why |
+|---|---|
+| `.env`, `.env.*` | App secrets |
+| `.envrc` | direnv shell secrets |
+| `.aws/credentials`, `.aws/config` | AWS access keys |
+| `.ssh/id_*`, `.ssh/identity`, `.ssh/config` | SSH keys + host config |
+| `.npmrc`, `.pypirc` | Registry tokens |
+| `.netrc` | HTTP auth credentials |
+| `.gnupg/*` | GPG private material |
+| `kubeconfig`, `.kube/config` | Cluster admin tokens |
+
+## Why a nudge, not just a deny
+
+`rogue.secret-read` is the raw "no, don't read that" rule. This rule layers
+the "use the right tool" pattern on top: instead of leaving the agent stuck
+or escalating to the operator for every secret access, the verdict carries
+the name of the skill that should handle it. The point is to demonstrate the
+"force use of a skill" pattern — a deny that simultaneously *teaches* the
+agent the correct path forward.
+
+> Note: `secret-fetcher` is illustrative — it's a placeholder for a future
+> entry in the [openagentlock/skills](https://github.com/openagentlock/skills)
+> repo. The exact skill name will firm up once that registry lands; the
+> nudge text in this rule is the right shape but expect the skill id to
+> change. Until then the second half of the nudge ("ask the operator to
+> paste the value") is the actually-actionable fallback.
+
+## Example interaction
+
+```text
+agent$ Read .env
+daemon: deny (safety.secret-read-suggest-skill)
+  nudge: Don't read secret files directly into your context. If your
+         harness has the openagentlock/skills `secret-fetcher` skill
+         installed, invoke it — the skill brokers the value through a sealed
+         channel so the secret never lands in your prompt or tool-call
+         payloads. Otherwise, ask the operator to read the file and paste
+         only the specific value you need (key name + use case), and treat
+         anything they paste as sensitive — do not echo it back into logs or
+         reply text.
+
+agent: I need DATABASE_URL from .env to run the migration. Could you paste
+       just that one value? I won't log it.
+```
+
+## Pairing with `rogue.secret-read`
+
+If you install both rules, this one's nudge wins on overlap (same path set,
+this rule has the actionable hint). You can pin only this rule for the
+nudge-forward style, or only `rogue.secret-read` for the terse deny — pick
+based on whether your operators want the agent to self-recover.
+
+## Test it
+
+```bash
+agentlock fake-hook --session --tool Read --path '.env'
+# expect: deny with the secret-fetcher nudge in the verdict
+```
diff --git a/rules/secret-read-suggest-skill/rule.yaml b/rules/secret-read-suggest-skill/rule.yaml
new file mode 100644
index 0000000..810b12e
--- /dev/null
+++ b/rules/secret-read-suggest-skill/rule.yaml
@@ -0,0 +1,49 @@
+schema_version: 1
+id: safety.secret-read-suggest-skill
+name: Force a skill for secret reads
+description: |
+  Denies Read tool calls against canonical secret-bearing paths
+  (.env / .env.*, ~/.ssh/*, ~/.aws/credentials, kubeconfig, .netrc, etc.)
+  and surfaces a nudge directing the agent to a dedicated `secret-fetcher`
+  skill instead of an ambient file read. Where `rogue.secret-read` is the
+  raw block, this rule layers on the "use the right tool" nudge — keeping
+  secret material out of the model's context window unless a purpose-built
+  skill brokers it.
+severity: high
+tags:
+  - safety
+  - secrets
+  - read
+  - nudge
+  - skill
+authors:
+  - github: openagentlock
+license: Apache-2.0
+compatible_agentlock: ">=0.1.0"
+gate:
+  match:
+    tool: Read
+    any_path_regex:
+      - '(^|/)\.env(\.[^/]+)?$'
+      - '(^|/)\.envrc$'
+      - '(^|/)\.aws/credentials$'
+      - '(^|/)\.aws/config$'
+      - '(^|/)\.ssh/id_(rsa|ed25519|ecdsa|dsa)(\.pub)?$'
+      - '(^|/)\.ssh/identity$'
+      - '(^|/)\.ssh/config$'
+      - '(^|/)\.npmrc$'
+      - '(^|/)\.pypirc$'
+      - '(^|/)\.netrc$'
+      - '(^|/)\.gnupg/.*'
+      - '(^|/)kubeconfig$'
+      - '(^|/)\.kube/config$'
+  evaluate:
+    - kind: always
+      action: deny
+      nudge: |
+        Don't read secret files directly into your context. If your harness has the
+        openagentlock/skills `secret-fetcher` skill installed, invoke it — the skill
+        brokers the value through a sealed channel so the secret never lands in your
+        prompt or tool-call payloads. Otherwise, ask the operator to read the file
+        and paste only the specific value you need (key name + use case), and treat
+        anything they paste as sensitive — do not echo it back into logs or reply text.
diff --git a/schema/rule.schema.json b/schema/rule.schema.json
index 33017e8..449695d 100644
--- a/schema/rule.schema.json
+++ b/schema/rule.schema.json
@@ -101,6 +101,11 @@
               "action": {
                 "type": "string",
                 "enum": ["allow", "deny", "monitor", "warn"]
+              },
+              "nudge": {
+                "type": "string",
+                "maxLength": 2000,
+                "description": "Optional human-readable hint surfaced alongside the verdict. The daemon pipes this through to the agent harness so the model can recover (e.g. 'use trash instead of rm -rf'). Plain text, typically 1–4 sentences."
               }
             },
             "additionalProperties": true