feat: content security — 4-layer prompt injection defense for pair-agent by garrytan · Pull Request #815 · garrytan/gstack

garrytan · 2026-04-05T06:22:32Z

Summary

Four-layer defense-in-depth for prompt injection attacks when remote AI agents browse untrusted web pages via /pair-agent.

Content Security (Phase 2)

Content envelope wrapping with ZWSP marker escaping
Hidden element stripping (7 CSS techniques + ARIA injection detection)
Datamarking (session-scoped text watermarking)
Content filter hooks (extensible pipeline, URL blocklist, warn/block modes)
Snapshot split format (trusted @refs above untrusted content)
SECURITY section in pair-agent instruction block
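
The envelope wrapping with ZWSP marker escaping listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the marker strings and the function names `escapeMarkers` and `wrapUntrusted` are assumptions, since the PR text does not show the real ones.

```typescript
// Hypothetical marker strings; the real markers are not shown in the PR text.
const BEGIN = "<<<UNTRUSTED_CONTENT>>>";
const END = "<<<END_UNTRUSTED_CONTENT>>>";
const ZWSP = "\u200B"; // zero-width space

// Break up any marker look-alike inside page text by inserting a ZWSP
// after its first character, so it can no longer match a real marker.
function escapeMarkers(text: string): string {
  let out = text;
  for (const marker of [BEGIN, END]) {
    out = out.split(marker).join(marker.slice(0, 1) + ZWSP + marker.slice(1));
  }
  return out;
}

// Wrap page content in the envelope after neutralizing embedded markers.
function wrapUntrusted(text: string): string {
  return `${BEGIN}\n${escapeMarkers(text)}\n${END}`;
}
```

With this shape, a page that embeds the closing marker in its own text cannot end the envelope early, because the only intact closing marker is the one the wrapper appends.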

Infrastructure

handleCommandInternal refactor: chain subcommands get full security pipeline
Centralized content wrapping (was 6 call sites, now 1)
attrs added to PAGE_CONTENT_COMMANDS

Test Compliance

47 new content security tests
4 injection test fixture HTML pages
Fixed all 16 pre-existing test failures (pair-agent compliance, golden files, host smoke tests, relink timeouts)

Test Coverage

47 new tests in browse/test/content-security.test.ts covering all 4 defense layers
325 total browse tests pass (0 fail)
Full bun test suite: 0 in-branch failures

E2E Results

31/33 E2E tests passed ($8.90 total cost):

Core: 3/3 PASS
Browse: 7/7 PASS
Plan: 11/12 (1 transient API error on plan-ceo-review-selective)
Review: 8/8 PASS
Gemini: 2/3 (1 worktree env failure, pre-existing)

Pre-Landing Review

Eng Review: CLEAR (4 runs, most recent 2026-04-05 via /autoplan + standalone)
CEO Review: CLEAR (via /autoplan, SELECTIVE EXPANSION mode)

Test plan

All browse tests pass (325 tests, 0 failures)
Full bun test suite passes (0 in-branch failures)
Content security tests cover all 4 defense layers
E2E tests pass (31/33, 2 pre-existing/transient failures)
Hidden element stripping false positive check passes
Envelope marker escaping prevents boundary escape

🤖 Generated with Claude Code

Per-agent scoped tokens with read/write/admin/meta command categories, domain glob restrictions, rate limiting, expiry, and revocation. Setup key exchange for the /pair-agent ceremony (5-min one-time key → 24h session token). Idempotent exchange handles tunnel drops. 39 tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
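
A minimal sketch of how a scoped token check along these lines might look. The `ScopedToken` shape, the `authorize` function, and the glob semantics (`*` matches any run of characters) are assumptions for illustration; rate limiting and revocation are left out.

```typescript
type Scope = "read" | "write" | "admin" | "meta";

// Hypothetical token shape; the real registry may store more fields.
interface ScopedToken {
  clientId: string;
  scopes: Scope[];
  domains: string[]; // glob patterns such as "*.example.com"; empty means unrestricted
  expiresAt: number; // epoch milliseconds
}

// Convert a simple glob ("*" matches any run of characters) to an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`, "i");
}

function domainAllowed(token: ScopedToken, url: string): boolean {
  if (token.domains.length === 0) return true;
  const host = new URL(url).hostname;
  return token.domains.some((glob) => globToRegExp(glob).test(host));
}

// Returns a rejection reason, or null when the command may proceed.
function authorize(token: ScopedToken, scope: Scope, url: string | null, now: number): string | null {
  if (now >= token.expiresAt) return "token expired";
  if (!token.scopes.includes(scope)) return `missing scope: ${scope}`;
  if (url !== null && !domainAllowed(token, url)) return "domain not allowed";
  return null;
}
```

Anchoring the pattern at both ends matters here: without the trailing `$`, a host like `shop.example.com.evil.test` would pass a `*.example.com` restriction.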

Server changes for multi-agent browser access:
- /connect endpoint: setup key exchange for /pair-agent ceremony
- /token endpoint: root-only minting of scoped sub-tokens
- /token/:clientId DELETE: revoke agent tokens
- /agents endpoint: list connected agents (root-only)
- /health: strips root token when tunnel is active (P0 security fix)
- /command: scope/rate/domain checks via token registry before dispatch
- Idle timer skips shutdown when tunnel is active
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

BROWSE_TUNNEL=1 env var starts an ngrok tunnel after Bun.serve(). Reads NGROK_AUTHTOKEN from env or ~/.gstack/ngrok.env. Reads NGROK_DOMAIN for dedicated domain (stable URL). Updates state file with tunnel URL. Feasibility spike confirmed: SDK works in compiled Bun binary. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add per-tab ownership tracking to BrowserManager. Scoped agents must create their own tab via newtab before writing. Unowned tabs (pre-existing, user-opened) are root-only for writes. Read access always allowed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Server-side tab ownership check blocks scoped agents from writing to unowned tabs. Special-case newtab records ownership for scoped tokens. POST /pair endpoint creates setup keys for the pairing ceremony. Activity events now include clientId for attribution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

One command to pair a remote agent: $B pair-agent. Creates a setup key via POST /pair, prints a copy-pasteable instruction block with curl commands. Smart tunnel fallback (tunnel URL > auto-start > localhost). Flags: --for HOST, --local HOST, --admin, --client NAME. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

14 tests covering tab ownership lifecycle (access checks, unowned tabs, transferTab) and instruction block generator (scopes, URLs, admin flag, troubleshooting section). Fix server-auth test that used fragile sliceBetween boundaries broken by new endpoints. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1. Remove root token from /health endpoint entirely (CSO #1 CRITICAL). Origin header is spoofable. Extension reads from ~/.gstack/.auth.json.
2. Add domain check for newtab URL (CSO #5). Previously only goto was checked, allowing domain-restricted agents to bypass via newtab.
3. Validate scope values, rateLimit, expiresSeconds in createToken() (CSO #4). Rejects invalid scopes and negative values.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
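
The input validation described for createToken() could look something like the sketch below. The `TokenOptions` shape and the helper name `validateTokenOptions` are hypothetical; only the rules themselves (reject unknown scopes and negative values) come from the commit message.

```typescript
const VALID_SCOPES: readonly string[] = ["read", "write", "admin", "meta"];

// Hypothetical options shape; field names follow the commit message.
interface TokenOptions {
  scopes: string[];
  rateLimit?: number;
  expiresSeconds?: number;
}

// Returns a list of problems; an empty list means the options are acceptable.
function validateTokenOptions(opts: TokenOptions): string[] {
  const errors: string[] = [];
  for (const scope of opts.scopes) {
    if (!VALID_SCOPES.includes(scope)) errors.push(`invalid scope: ${scope}`);
  }
  if (opts.rateLimit !== undefined && (!Number.isFinite(opts.rateLimit) || opts.rateLimit < 0)) {
    errors.push("rateLimit must be a non-negative number");
  }
  if (opts.expiresSeconds !== undefined && (!Number.isFinite(opts.expiresSeconds) || opts.expiresSeconds < 0)) {
    errors.push("expiresSeconds must be a non-negative number");
  }
  return errors;
}
```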

Users remember /pair-agent, not $B pair-agent. The skill walks through agent selection (OpenClaw, Hermes, Codex, Cursor, generic), local vs remote setup, tunnel configuration, and includes platform-specific notes for each agent type. Wraps the CLI command with context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Full API reference, snapshot→@ref pattern, scopes, tab isolation, error codes, ngrok setup, and same-machine shortcuts. The instruction block points here for deeper reading. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The paste-into-agent instruction block now teaches the snapshot→@ref workflow (the most powerful browsing pattern), shows the server URL prominently, and uses clearer formatting. Tests updated to match. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The pair-agent command now checks ngrok's native config (not just ~/.gstack/ngrok.env) and auto-starts the tunnel when ngrok is available. The skill template walks users through ngrok install and auth if not set up, instead of just printing a dead localhost URL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

pair-agent now auto-starts the ngrok tunnel without restarting the server. New POST /tunnel/start endpoint reads authtoken from env, ~/.gstack/ngrok.env, or ngrok's native config. CLI detects ngrok availability and calls the endpoint automatically. Zero manual steps when ngrok is installed and authed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Added CRITICAL instruction: the agent MUST output the full instruction block so the user can copy it. Previously the agent could summarize the block instead of printing it, leaving the user with nothing to paste. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The blanket validateAuth() gate (root-only) sat above the /command endpoint, rejecting all scoped tokens with 401 before they reached getTokenInfo(). Moved /command above the gate so both root and scoped tokens are accepted. This was the bug Wintermute hit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When pair-agent detects headless mode, it auto-switches to headed (visible Chromium window) so the user can watch what the remote agent does. Use --headless to skip this. Fixed compiled binary path resolution (process.execPath, not process.argv[1] which is virtual /$bunfs/ in Bun compiled binaries). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

16 new tests covering:
- /command sits above blanket auth gate (Wintermute bug)
- /command uses getTokenInfo not validateAuth
- /tunnel/start requires root, checks native ngrok config, returns already_active
- /pair creates setup keys not session tokens
- Tab ownership checked before command dispatch
- Activity events include clientId
- Instruction block teaches snapshot→@ref pattern
- pair-agent auto-headed mode, process.execPath, --headless skip
- isNgrokAvailable checks all 3 sources (gstack env, env var, native config)
- handlePairAgent calls /tunnel/start not server restart
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1. Chain command now pre-validates ALL subcommand scopes before executing any. A read+meta token can no longer escalate to admin via chain (eval, js, cookies were dispatched without scope checks). tokenInfo flows through handleMetaCommand into the chain handler. Rejects entire chain if any subcommand fails.
2. /health strips sensitive fields (currentUrl, agent.currentMessage, session) when tunnel is active. Only operational metadata (status, mode, uptime, tabs) exposed to the internet. Previously anyone reaching the ngrok URL could surveil browsing activity.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
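
The all-or-nothing chain check can be illustrated with a short sketch. The command-to-scope table is an assumption (only eval, js, and cookies are named in the commit as admin-level), and `prevalidateChain` and `runChain` are hypothetical names.

```typescript
type Scope = "read" | "write" | "admin" | "meta";

// Hypothetical command-to-scope table; the real mapping lives in the server.
const REQUIRED_SCOPE: Record<string, Scope> = {
  text: "read",
  snapshot: "read",
  click: "write",
  fill: "write",
  eval: "admin",
  js: "admin",
  cookies: "admin",
};

interface ChainStep {
  command: string;
  args: string[];
}

// Check every step before running any; return the first violation, or null.
function prevalidateChain(steps: ChainStep[], granted: Scope[]): string | null {
  for (const step of steps) {
    const needed = REQUIRED_SCOPE[step.command];
    if (needed === undefined) return `unknown command: ${step.command}`;
    if (!granted.includes(needed)) return `chain rejected: ${step.command} requires ${needed}`;
  }
  return null;
}

// Run the chain only if every step passes; no step executes on rejection.
function runChain(steps: ChainStep[], granted: Scope[], execute: (step: ChainStep) => string): string[] {
  const violation = prevalidateChain(steps, granted);
  if (violation !== null) throw new Error(violation);
  return steps.map(execute);
}
```

Validating up front rather than per step means a chain such as `text` followed by `eval` produces no side effects at all for a read+meta token, instead of running `text` and failing midway.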

Lead with what it does for the user: type /pair-agent, paste into your other agent, done. First time AI agents from different companies can coordinate through a shared browser with real security boundaries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Each skill gets a real narrative paragraph explaining the workflow, not just a table cell. design-shotgun: visual exploration with taste memory. design-html: production HTML with Pretext computed layout. pair-agent: cross-vendor AI agent coordination through shared browser. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Chain subcommands now route through handleCommandInternal for full security enforcement (scope, domain, tab ownership, rate limiting, content wrapping). Adds recursion guard for nested chains, rate-limit exemption for chain subcommands, and activity event suppression (1 event per chain, not per sub). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…hooks
Four-layer prompt injection defense for pair-agent browser sharing:
- Datamarking: session-scoped watermark for text exfiltration detection
- Content envelope: trust boundary wrapping with ZWSP marker escaping
- Content filter hooks: extensible filter pipeline with warn/block modes
- Built-in URL blocklist: requestbin, pipedream, webhook.site, etc.
BROWSE_CONTENT_FILTER env var controls mode: off|warn|block (default: warn)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
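
A rough sketch of a filter pipeline with a URL blocklist and off/warn/block modes, as described above. The `ContentFilter` type, `applyFilters`, and `parseMode` are hypothetical names, the blocklist contains only the three entries named in the commit, and substring matching is a simplification.

```typescript
type FilterMode = "off" | "warn" | "block";

interface FilterResult {
  warnings: string[];
}

// A filter inspects content and reports zero or more warnings.
type ContentFilter = (content: string) => FilterResult;

// Entries named in the commit message; the real list is longer ("etc.").
const BLOCKLIST = ["requestbin", "pipedream", "webhook.site"];

const urlBlocklistFilter: ContentFilter = (content) => {
  const lower = content.toLowerCase();
  const hits = BLOCKLIST.filter((entry) => lower.includes(entry));
  return { warnings: hits.map((entry) => `blocklisted URL pattern: ${entry}`) };
};

// Map the BROWSE_CONTENT_FILTER value to a mode; anything unrecognized falls back to warn.
function parseMode(raw: string | undefined): FilterMode {
  return raw === "off" || raw === "block" ? raw : "warn";
}

function applyFilters(
  content: string,
  filters: ContentFilter[],
  mode: FilterMode,
): { content: string; warnings: string[]; blocked: boolean } {
  if (mode === "off") return { content, warnings: [], blocked: false };
  const warnings: string[] = [];
  for (const filter of filters) warnings.push(...filter(content).warnings);
  const blocked = mode === "block" && warnings.length > 0;
  // In block mode the content is withheld; in warn mode it passes through with warnings attached.
  return { content: blocked ? "" : content, warnings, blocked };
}
```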

Single wrapping location replaces fragmented per-handler wrapping:
- Scoped tokens: content filters + datamarking + enhanced envelope
- Root tokens: existing basic wrapping (backward compat)
- Chain subcommands exempt from top-level wrapping (wrapped individually)
- Adds 'attrs' to PAGE_CONTENT_COMMANDS (ARIA value exposure defense)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Detects CSS-hidden elements (opacity, font-size, off-screen, same-color, clip-path) and ARIA label injection patterns. Marks elements with data-gstack-hidden, extracts text from a clean clone (no DOM mutation), then removes markers. Only active for scoped tokens on text command. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Scoped tokens get a split snapshot: trusted @refs section (for click/fill) separated from untrusted web content in an envelope. Ref names truncated to 50 chars in trusted section. Root tokens unchanged (backward compat). Resume command also uses split format for scoped tokens. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
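
One way the split snapshot might be assembled is sketched below. The `Ref` shape, the section header strings, and `formatSplitSnapshot` are assumptions; the 50-character truncation of ref names is the only detail taken from the commit message.

```typescript
// Hypothetical ref shape for illustration.
interface Ref {
  id: string;
  role: string;
  name: string;
}

const MAX_REF_NAME = 50; // truncation limit stated in the commit message

function truncate(text: string, max: number): string {
  return text.length > max ? text.slice(0, max) : text;
}

// Trusted refs come first; page-derived text follows under a separate heading.
function formatSplitSnapshot(refs: Ref[], pageText: string): string {
  const trusted = refs.map((r) => `@${r.id} [${r.role}] "${truncate(r.name, MAX_REF_NAME)}"`);
  return [
    "INTERACTIVE ELEMENTS (trusted)",
    ...trusted,
    "",
    "PAGE CONTENT (untrusted)",
    pageText,
  ].join("\n");
}
```

Truncating names in the trusted section limits how much attacker-controlled text (for example, a long button label) can appear where the agent is told to trust what it reads.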

Instructs remote agents to treat content inside untrusted envelopes as potentially malicious. Lists common injection phrases to watch for. Directs agents to only use @refs from the trusted INTERACTIVE ELEMENTS section, not from page content. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- injection-visible.html: visible injection in product review text
- injection-hidden.html: 7 CSS hiding techniques + ARIA injection + false positive
- injection-social.html: social engineering in legitimate-looking content
- injection-combined.html: all attack types + envelope escape attempt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Covers all 4 defense layers:
- Datamarking: marker format, session consistency, text-only application
- Content envelope: wrapping, ZWSP marker escaping, filter warnings
- Content filter hooks: URL blocklist, custom filters, warn/block modes
- Instruction block: SECURITY section content, ordering, generation
- Centralized wrapping: source-level verification of integration
- Chain security: recursion guard, rate-limit exemption, activity suppression
- Hidden element stripping: 7 CSS techniques, ARIA injection, false positives
- Snapshot split format: scoped vs root output, resume integration
Also fixes: visibility:hidden detection, case-insensitive ARIA pattern matching.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Root cause: pair-agent was added without completing the gen-skill-docs compliance checklist. All 16 failures traced back to this.
Fixes:
- Sync package.json version to VERSION (0.15.9.0)
- Add "(gstack)" to pair-agent description for discoverability
- Add pair-agent to Codex path exception (legitimately documents ~/.codex/)
- Add CLI_COMMANDS (status, pair-agent, tunnel) to skill parser allowlist
- Regenerate SKILL.md for all hosts (claude, codex, factory, kiro, etc.)
- Update golden file baselines for ship skill
- Fix relink tests: pass GSTACK_INSTALL_DIR to auto-relink calls so they use the fast mock install instead of scanning real ~/.claude/skills/gstack
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…wser-ctrl # Conflicts: # CHANGELOG.md # VERSION

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Two fixes for E2E test reliability:
1. session-runner.ts: error_max_turns was misclassified as error_api because is_error flag was checked before subtype. Now known subtypes like error_max_turns are preserved even when is_error is set. The is_error override only applies when subtype=success (API failure).
2. worktree.ts: pruneStale() now skips worktrees < 1 hour old to avoid deleting worktrees from concurrent test runs still in progress. Previously any second test execution would kill the first's worktrees.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
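
Both fixes come down to small ordering and threshold rules, sketched here under assumptions. The `ResultEvent` and `Worktree` shapes and the function names `classify` and `oldEnoughToPrune` are hypothetical, and the sketch passes every non-success subtype through unchanged, which may be broader than the "known subtypes" the commit refers to.

```typescript
// Hypothetical result event shape.
interface ResultEvent {
  subtype: string;
  is_error: boolean;
}

// Check subtype first: is_error only overrides a "success" subtype.
function classify(ev: ResultEvent): string {
  if (ev.subtype === "success") return ev.is_error ? "error_api" : "success";
  return ev.subtype; // e.g. "error_max_turns" is preserved even when is_error is set
}

const MIN_PRUNE_AGE_MS = 60 * 60 * 1000; // one hour, per the commit message

// Hypothetical worktree record.
interface Worktree {
  path: string;
  createdAtMs: number;
}

// Only worktrees at least one hour old are candidates for pruning,
// so a concurrent test run's fresh worktrees are left alone.
function oldEnoughToPrune(worktrees: Worktree[], nowMs: number): Worktree[] {
  return worktrees.filter((w) => nowMs - w.createdAtMs >= MIN_PRUNE_AGE_MS);
}
```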

The CSO security fix stripped the token from /health to prevent leaking when tunneled. But the extension needs it to authenticate on localhost. Now returns token only when not tunneled (safe: localhost-only path). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…nnel
Updated tests to match the restored token behavior:
- Test 1: token assignment exists AND is inside the !tunnelActive guard
- Test 1b: tunnel branch (else block) does not contain AUTH_TOKEN
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Explains why this is an accepted risk (no escalation over file-based token access), CORS protection, and tunnel guard. Prevents future CSO scans from stripping it without providing an alternative auth path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Root cause: when ngrok dies externally (pkill, crash, timeout), the server still reports tunnelActive=true with a dead URL. pair-agent prints an instruction block pointing at a dead tunnel. The remote agent gets "endpoint offline" and the user has to manually restart everything.
Three-layer fix:
- Server /pair endpoint: probes tunnel URL before returning it. If dead, resets tunnelActive/tunnelUrl and returns null (triggers CLI restart).
- Server /tunnel/start: probes cached tunnel before returning already_active. If dead, falls through to restart ngrok automatically.
- CLI pair-agent: double-checks tunnel URL from server before printing instruction block. Falls through to auto-start on failure.
4 regression tests verify all three probe points + CLI verification.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…wser-ctrl # Conflicts: # CHANGELOG.md

Remote agents controlling GStack Browser through a tunnel pay 2-5s of latency per HTTP round-trip. A typical "navigate and read" takes 4 sequential commands = 10-20 seconds. The /batch endpoint collapses N commands into a single HTTP round-trip, cutting a 20-tab crawl from ~60s to ~5s. Sequential execution through the full security pipeline (scope, domain, tab ownership, content wrapping). Rate limiting counts the batch as 1 request. Activity events emitted at batch level, not per-command. Max 50 commands per batch. Nested batches rejected. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
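
The request-level limits described for the batch endpoint could be checked along these lines. This is a sketch only: `validateBatch` and its error strings are hypothetical, and the real endpoint also runs each command through the security pipeline, which is not shown.

```typescript
const MAX_BATCH_COMMANDS = 50; // limit stated in the commit message

// Returns an error message, or null when the batch body is acceptable.
function validateBatch(commands: unknown): string | null {
  if (!Array.isArray(commands)) return "commands must be an array";
  if (commands.length > MAX_BATCH_COMMANDS) {
    return `too many commands: ${commands.length} (max ${MAX_BATCH_COMMANDS})`;
  }
  for (let i = 0; i < commands.length; i++) {
    const entry = commands[i] as { command?: unknown } | null;
    if (entry === null || typeof entry !== "object" || typeof entry.command !== "string") {
      return `commands[${i}]: missing command field`;
    }
    // A batch inside a batch is rejected outright.
    if (entry.command === "batch") return `commands[${i}]: nested batches are not allowed`;
  }
  return null;
}
```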

8 tests verifying: auth gate placement, scoped token support, max command limit, nested batch rejection, rate limiting bypass, batch-level activity events, command field validation, and tabId passthrough. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Hermes doesn't have a host-specific config — it uses the same generic curl instructions as any other agent. Removing the dedicated option simplifies the menu and eliminates a misleading distinction. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…wser-ctrl Resolved CHANGELOG.md conflict: kept main's 0.15.13.0 Team Mode entry on top, preserved branch's richer 0.15.12.0 Content Security entry below.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…wser-ctrl Resolved CHANGELOG.md conflict: main landed 0.15.14.0, bumped our branch to 0.15.15.0 with the batch endpoint entry on top. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Vendoring deprecation section from main's template wasn't reflected in the generated file. Fixes check-freshness CI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Refactors checkTabAccess(tabId, clientId, isWrite) to use an options object { isWrite?, ownOnly? }. Adds tabPolicy === 'own-only' support in the server command dispatch — scoped tokens with this policy are restricted to their own tabs for all commands, not just writes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
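
A minimal sketch of the options-object form of checkTabAccess, combining it with the ownership rules stated earlier (reads allowed on any tab, writes only on owned tabs). The `TabOwnership` class, the `recordOwner` method, and treating the client ID "root" as unrestricted are assumptions for illustration.

```typescript
interface AccessOptions {
  isWrite?: boolean;
  ownOnly?: boolean; // set when the token's tabPolicy is 'own-only'
}

class TabOwnership {
  private owners = new Map<number, string>(); // tabId -> clientId

  // Called when a scoped agent opens a tab via newtab.
  recordOwner(tabId: number, clientId: string): void {
    this.owners.set(tabId, clientId);
  }

  checkTabAccess(tabId: number, clientId: string, opts: AccessOptions = {}): boolean {
    if (clientId === "root") return true; // assumption: root is identified by this ID
    const owner = this.owners.get(tabId);
    // Writes, and every command under own-only, require ownership.
    // Unowned tabs have no owner entry, so they fail this check for scoped agents.
    if (opts.isWrite || opts.ownOnly) return owner === clientId;
    return true; // plain reads are allowed on any tab
  }
}
```

An options object keeps call sites readable as policies accumulate: `checkTabAccess(3, "agent-1", { ownOnly: true })` says more than a second positional boolean would.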

Allows passing --domain to pair-agent to restrict the remote agent's navigation to specific domains (comma-separated). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-06T07:35:15Z

E2E Evals: ✅ PASS

61/61 tests passed | $6.19 total cost | 12 parallel runners

Suite	Result	Status	Cost
e2e-browse	7/7	✅	$0.27
e2e-deploy	6/6	✅	$1.1
e2e-design	3/3	✅	$0.4
e2e-plan	7/7	✅	$1.15
e2e-qa-workflow	3/3	✅	$1.15
e2e-review	6/6	✅	$1.13
e2e-workflow	4/4	✅	$0.49
llm-judge	25/25	✅	$0.5

12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

The batch endpoint work belongs on the browser-batch-multitab branch (port-louis), not this branch. Reverting VERSION to 0.15.14.0. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…wser-ctrl
Resolved conflicts:
- meta-commands.ts: kept our security pipeline (scope pre-validation + executeCommand callback) and integrated main's watch-mode blocking for chain write commands
- server.ts: kept our !tunnelActive guard with security documentation over main's headed-mode detection approach
- package.json: took main's version
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Our merge kept the old !tunnelActive guard which conflicted with main's security-audit-r2 tests that require no currentUrl/currentMessage in /health. Adopts main's approach: serve token conditionally based on headed mode or chrome-extension origin. Updates server-auth tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Adds $B placeholder explanation, explicit syntax line, and detailed flag behavior (-d depth values, -s CSS selector syntax, -D unified diff format and baseline persistence, -a screenshot vs text output relationship). Fixes the snapshot flags reference, which the LLM eval scored below 4 on completeness. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

garrytan and others added 30 commits April 4, 2026 16:47

merge: resolve conflicts with main (adopt chrome-extension origin gat…

28b7301

…ing)

chore: bump version and changelog (v0.15.9.0)

bda0cfd

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

garrytan and others added 3 commits April 5, 2026 12:22

Merge remote-tracking branch 'origin/main' into garrytan/openclaw-bro…

e8ef9a5

…wser-ctrl # Conflicts: # CHANGELOG.md # VERSION

chore: bump version and changelog (v0.15.12.0)

8801a62

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

garrytan changed the title ~~feat: multi-agent browser platform — token registry, tab isolation, pair-agent (v0.15.9.0)~~ feat: content security — 4-layer prompt injection defense for pair-agent Apr 5, 2026

garrytan and others added 14 commits April 5, 2026 15:45

Merge remote-tracking branch 'origin/main' into garrytan/openclaw-bro…

d384b09

…wser-ctrl # Conflicts: # CHANGELOG.md

fix: correct CHANGELOG date from 2026-04-06 to 2026-04-05

21f2a44

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'origin/main' into garrytan/openclaw-bro…

11d3928

chore: bump VERSION to 0.15.14.0, add CHANGELOG entry for batch endpoint

170be8d

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'origin/main' into garrytan/openclaw-bro…

8fd73ec

chore: regenerate pair-agent/SKILL.md after main merge

7cf7f6e

Repository owner deleted a comment from github-actions bot Apr 6, 2026

garrytan and others added 2 commits April 6, 2026 00:34

feat: add --domain flag to pair-agent CLI for domain restrictions

100c406

garrytan and others added 4 commits April 6, 2026 00:41

revert: remove batch commands CHANGELOG entry and VERSION bump

3acbd4a

garrytan wants to merge 53 commits into main from garrytan/openclaw-browser-ctrl
