diff --git a/.claude/agents/security-reviewer.md b/.claude/agents/security-reviewer.md index a56250453..04d4b1380 100644 --- a/.claude/agents/security-reviewer.md +++ b/.claude/agents/security-reviewer.md @@ -1,10 +1,16 @@ +--- +name: security-reviewer +description: Reviews findings from AgentShield + zizmor against the project's CLAUDE.md security rules and grades the result A-F. Spawned by the security-scan skill after the static scans run. +tools: Read, Grep, Glob, Bash(git:*), Bash(rg:*), Bash(grep:*), Bash(find:*), Bash(ls:*), Bash(pnpm exec agentshield:*), Bash(zizmor:*), Bash(command -v:*), Bash(cat:*), Bash(head:*), Bash(tail:*) +--- + You are a security reviewer for Socket Security Node.js repositories. Apply these rules from CLAUDE.md exactly: **Safe File Operations**: Use safeDelete()/safeDeleteSync() from @socketsecurity/lib/fs. NEVER fs.rm(), fs.rmSync(), or rm -rf. Use os.tmpdir() + fs.mkdtemp() for temp dirs. NEVER use fetch() — use httpJson/httpText/httpRequest from @socketsecurity/lib/http-request. -**Absolute Rules**: NEVER use npx, pnpm dlx, or yarn dlx. Use pnpm exec or pnpm run with pinned devDeps. +**Absolute Rules**: NEVER use npx, pnpm dlx, or yarn dlx. Use pnpm exec or pnpm run with pinned devDeps. # zizmor: documentation-prohibition **Work Safeguards**: Scripts modifying multiple files must have backup/rollback. Git operations that rewrite history require explicit confirmation. @@ -12,7 +18,7 @@ Apply these rules from CLAUDE.md exactly: 1. **Secrets**: Hardcoded API keys, passwords, tokens, private keys in code or config 2. **Injection**: Command injection via shell: true or string interpolation in spawn/exec. Path traversal in file operations. -3. **Dependencies**: npx/dlx usage. Unpinned versions (^ or ~). Missing minimumReleaseAge bypass justification. +3. **Dependencies**: npx/dlx usage. Unpinned versions (^ or ~). Missing soak-window bypass justification (pnpm-workspace.yaml `minimumReleaseAgeExclude`). 
# zizmor: documentation-checklist 4. **File operations**: fs.rm without safeDelete. process.chdir usage. fetch() usage (must use lib's httpRequest). 5. **GitHub Actions**: Unpinned action versions (must use full SHA). Secrets outside env blocks. Template injection from untrusted inputs. 6. **Error handling**: Sensitive data in error messages. Stack traces exposed to users. diff --git a/.claude/hooks/auth-rotation-reminder/README.md b/.claude/hooks/auth-rotation-reminder/README.md new file mode 100644 index 000000000..fb2788422 --- /dev/null +++ b/.claude/hooks/auth-rotation-reminder/README.md @@ -0,0 +1,131 @@ +# auth-rotation-reminder + +Claude Code `Stop` hook that periodically logs you out of authenticated +CLIs (npm, pnpm, gcloud, vault, aws sso, docker, socket, …) so stale +long-lived tokens don't sit in your dotfiles or keychain for days. + +## Why + +Long-lived auth tokens live in well-known locations: `~/.npmrc`, +`~/.config/gh/hosts.yml`, `~/.config/gcloud/`, `~/.docker/config.json`. +A compromised dev workstation has a wide blast radius on those files. +Periodic auto-revocation tightens the window and forces explicit +re-authentication, which is itself a small phishing-defense moment +("did I really mean to publish?"). + +## Defaults + +- **Interval**: 1 hour. Set `SOCKET_AUTH_ROTATION_INTERVAL_HOURS=4` to + loosen, `=0` to run on every Stop event. +- **Mode**: auto-logout (the hook *acts*, not just warns). +- **Default skip-list**: `gh` is skipped because Claude Code itself + uses `gh` for `gh pr edit` etc. — auto-revoking it would break the + agent. +- **CI**: hook short-circuits when `CI` env var is set. + +## What's swept + +| id | display name | detect | logout | +| --------- | ----------------- | ----------------- | ------------------------------ | +| npm | npm | `npm whoami` | `npm logout` | +| pnpm | pnpm | `pnpm whoami` | `pnpm logout` | +| yarn | yarn | `yarn --version` | `yarn npm logout` | +| gcloud | gcloud | `gcloud auth list ... 
ACTIVE` | `gcloud auth revoke --all --quiet` | +| aws-sso | aws (sso) | `aws sts get-caller-identity` | `aws sso logout` | +| gh | gh (GitHub CLI) | `gh auth status` | `gh auth logout --hostname github.com` | +| vault | vault | `vault token lookup` | `vault token revoke -self` | +| docker | docker | `docker info \| grep Username:` | `docker logout` | +| socket | socket | `socket whoami` | `socket logout` | + +The hook never reads, prints, or compares any token value. Detection +is exit-code only; logout commands' output is suppressed except for +non-zero exit codes which surface as "logout failed" lines. + +## Snoozing + +Need to keep your auth alive for the next few hours (e.g. mid-publish)? +Drop a `.snooze` file with an ISO 8601 expiry on line 1. + +```bash +# Snooze for 4 hours, project-local +date -ud "+4 hours" +"%Y-%m-%dT%H:%M:%SZ" > .claude/auth-rotation.snooze + +# Snooze globally for 8 hours (applies to every repo) +mkdir -p ~/.claude/hooks/auth-rotation +date -ud "+8 hours" +"%Y-%m-%dT%H:%M:%SZ" > ~/.claude/hooks/auth-rotation/snooze +``` + +The hook **automatically deletes the file** once the timestamp is +reached. No manual cleanup needed. + +Snoozes that are malformed, empty, or unreadable are also auto-deleted +on the next run — fail-safe so a corrupted file can't permanently +disable rotation. + +`.claude/*.snooze` is gitignored; project-local snoozes never leak into +commits. + +## Skip-list + +Permanently skip a service: + +```bash +# Per-user: applies to every repo +mkdir -p ~/.claude/hooks/auth-rotation +echo gcloud >> ~/.claude/hooks/auth-rotation/services-skip + +# Per-repo: applies just to this checkout +echo vault >> .claude/auth-rotation.services-skip +``` + +One id per line. Lines starting with `#` are comments. Service ids +are stable — see the table above. + +## Disable temporarily + +```bash +SOCKET_AUTH_ROTATION_DISABLED=1 # any non-empty value +``` + +For pairing sessions, demos, etc. The hook short-circuits before +doing any work. 
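
The snooze-expiry decision described above can be sketched in a few lines. This is a hypothetical standalone sketch, not the hook's actual code (the real logic lives in `index.mts` and also auto-deletes the file):

```typescript
// Hypothetical sketch of the snooze decision described above.
// Line 1 must parse as an ISO 8601 timestamp; anything else (empty,
// malformed) is treated as expired so a corrupted file can't
// permanently disable rotation.
function snoozeActive(fileContent: string, nowMs: number): boolean {
  const firstLine = fileContent.split('\n')[0]?.trim() ?? ''
  const expiry = Date.parse(firstLine)
  if (Number.isNaN(expiry)) {
    // Empty or malformed content: fail-safe, treat as expired.
    return false
  }
  return nowMs < expiry
}

console.log(snoozeActive('2099-01-01T00:00:00Z\n', Date.now())) // true (until 2099)
console.log(snoozeActive('not-a-timestamp\n', Date.now())) // false
```

The fail-safe direction matters: a parse failure means "rotate", never "stay snoozed".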
+ +## Wiring + +In `.claude/settings.json`: + +```json +{ + "hooks": { + "Stop": [ + { + "hooks": [ + { + "type": "command", + "command": "node .claude/hooks/auth-rotation-reminder/index.mts" + } + ] + } + ] + } +} +``` + +## Tests + +```bash +cd .claude/hooks/auth-rotation-reminder +node --test test/*.test.mts +``` + +## Reusing the snooze convention + +Other hooks can adopt the same `.snooze` pattern. The convention is: + +- Filename: `.claude/<hook-name>.snooze` (project) or + `~/.claude/hooks/<hook-name>/snooze` (global). +- Format: ISO 8601 expiry on line 1. Optional further lines ignored. +- `.gitignore`: `.claude/*.snooze`. +- Cleanup: hook auto-deletes expired files via `safeDelete`. +- The `checkSnoozes` / `tryUnlink` helpers in `index.mts` are easy to + copy into a sibling hook. diff --git a/.claude/hooks/auth-rotation-reminder/index.mts b/.claude/hooks/auth-rotation-reminder/index.mts new file mode 100644 index 000000000..06e053566 --- /dev/null +++ b/.claude/hooks/auth-rotation-reminder/index.mts @@ -0,0 +1,353 @@ +#!/usr/bin/env node +// Claude Code Stop hook — auth-rotation-reminder. +// +// Periodically logs you out of authenticated CLIs (npm, pnpm, gcloud, +// vault, aws sso, docker, socket, …) so stale long-lived tokens don't +// sit in dotfiles or keychains for days. +// +// Behavior on each Stop event: +// +// 1. Drain stdin (Stop hook delivers a JSON payload we don't need). +// 2. Skip if running in CI (CI auth has its own lifecycle). +// 3. Read both global + project-local `.snooze` files. Each carries +// an ISO 8601 expiry on line 1; if past, the file is auto-cleaned +// and the hook proceeds. If unexpired, the hook honors the snooze +// and exits silently. +// 4. Throttle via a state file: if the last successful run was within +// the configured interval (default 1h), exit silently. +// 5. For each service in services.mts: +// a. Skip if the binary is missing and `optional: true`. +// b. Run detectCmd. Skip if not authenticated. +// c. Run logoutCmd. 
Log to stderr via lib's logger. +// 6. Update the state file's mtime. +// +// The hook NEVER reads, prints, or compares any token value. Detection +// is exit-code only; logout commands' output is suppressed except for +// non-zero exit codes which surface as "logout failed" lines. +// +// Snooze file format (ISO 8601 timestamp on line 1): +// +// $ date -ud '+4 hours' +"%Y-%m-%dT%H:%M:%SZ" > .claude/auth-rotation.snooze +// +// Removed automatically once the timestamp is reached. +// +// Configuration env vars (all optional): +// +// SOCKET_AUTH_ROTATION_INTERVAL_HOURS default: 1 +// How long between actual auth-rotation runs (state-file throttle). +// Set to 0 to run on every Stop event (verbose). +// +// SOCKET_AUTH_ROTATION_DISABLED default: unset +// If set to a truthy value, skip the hook entirely. + +import { spawnSync } from 'node:child_process' +import { + existsSync, + mkdirSync, + readFileSync, + statSync, + utimesSync, + writeFileSync, +} from 'node:fs' +import { homedir } from 'node:os' +import path from 'node:path' +import process from 'node:process' + +import { safeDelete } from '@socketsecurity/lib/fs' +import { getDefaultLogger } from '@socketsecurity/lib/logger' + +import { DEFAULT_SKIP_IDS, SERVICES } from './services.mts' +import type { Service } from './services.mts' + +const logger = getDefaultLogger() +const PREFIX = '[auth-rotation-reminder]' + +// ── Paths ─────────────────────────────────────────────────────────── + +const STATE_DIR = path.join(homedir(), '.claude', 'hooks', 'auth-rotation') +const STATE_FILE = path.join(STATE_DIR, 'last-run') +const GLOBAL_SNOOZE = path.join(STATE_DIR, 'snooze') +const GLOBAL_SKIP_LIST = path.join(STATE_DIR, 'services-skip') + +// Project-local files live at the repo root next to .claude/. Claude +// Code spawns Stop hooks with the working directory set to the repo +// root so process.cwd() is reliable here. 
+const PROJECT_SNOOZE = path.join( + process.cwd(), + '.claude', + 'auth-rotation.snooze', +) +const PROJECT_SKIP_LIST = path.join( + process.cwd(), + '.claude', + 'auth-rotation.services-skip', +) + +// ── Snooze handling ───────────────────────────────────────────────── + +interface SnoozeStatus { + active: boolean + cleaned: string[] +} + +async function checkSnoozes(): Promise<SnoozeStatus> { + const status: SnoozeStatus = { active: false, cleaned: [] } + const cleanFile = async (file: string, reason: string): Promise<void> => { + try { + await safeDelete(file) + status.cleaned.push(file) + } catch (e) { + logger.error( + `${PREFIX} safeDelete(${path.basename(file)}) failed (${reason}): ${(e as Error).message}`, + ) + } + } + for (const file of [GLOBAL_SNOOZE, PROJECT_SNOOZE]) { + if (!existsSync(file)) { + continue + } + let content = '' + try { + content = readFileSync(file, 'utf8').trim() + } catch { + await cleanFile(file, 'unreadable') + continue + } + // Empty content = legacy form, no expiry. Treat as expired now. + if (content.length === 0) { + await cleanFile(file, 'legacy (no expiry)') + continue + } + const firstLine = content.split('\n')[0]!.trim() + const expiry = Date.parse(firstLine) + if (Number.isNaN(expiry)) { + await cleanFile(file, 'malformed expiry') + continue + } + if (Date.now() >= expiry) { + await cleanFile(file, 'expired') + continue + } + // Unexpired snooze. Honor it. 
+ status.active = true + return status + } + return status +} + +// ── Skip-list ─────────────────────────────────────────────────────── + +function loadSkipIds(): Set<string> { + const skipIds = new Set<string>(DEFAULT_SKIP_IDS) + for (const file of [GLOBAL_SKIP_LIST, PROJECT_SKIP_LIST]) { + if (!existsSync(file)) { + continue + } + try { + const content = readFileSync(file, 'utf8') + for (const raw of content.split('\n')) { + const trimmed = raw.trim() + if (trimmed && !trimmed.startsWith('#')) { + skipIds.add(trimmed) + } + } + } catch { + // Ignore unreadable skip-list — better to over-rotate than fail closed. + } + } + return skipIds +} + +// ── Throttle ──────────────────────────────────────────────────────── + +function intervalMs(): number { + const raw = process.env['SOCKET_AUTH_ROTATION_INTERVAL_HOURS'] + const hours = raw === undefined ? 1 : Number.parseFloat(raw) + if (!Number.isFinite(hours) || hours < 0) { + return 60 * 60 * 1000 + } + return Math.round(hours * 60 * 60 * 1000) +} + +function withinThrottle(): boolean { + const interval = intervalMs() + if (interval === 0) { + return false + } + if (!existsSync(STATE_FILE)) { + return false + } + try { + const { mtimeMs } = statSync(STATE_FILE) + return Date.now() - mtimeMs < interval + } catch { + return false + } +} + +function touchStateFile(): void { + try { + mkdirSync(STATE_DIR, { recursive: true }) + if (!existsSync(STATE_FILE)) { + writeFileSync(STATE_FILE, '') + } + const now = new Date() + utimesSync(STATE_FILE, now, now) + } catch { + // Throttle is best-effort. Loss = hook runs more often than configured; + // not worth surfacing. + } +} + +// ── Service detection + logout ────────────────────────────────────── + +interface RotationResult { + loggedOut: string[] + failed: Array<{ service: string; reason: string }> + skippedMissing: string[] +} + +function isOnPath(binary: string): boolean { + // `command -v` is portable across sh/bash/zsh and exits 0 if found. 
+ const r = spawnSync('sh', ['-c', `command -v ${binary} >/dev/null 2>&1`], { + stdio: 'ignore', + }) + return r.status === 0 +} + +function isAuthenticated(s: Service): boolean { + const r = spawnSync(s.detectCmd[0]!, s.detectCmd.slice(1) as string[], { + stdio: 'ignore', + timeout: 5000, + }) + return r.status === 0 +} + +function runLogout(s: Service): { ok: boolean; reason?: string } { + const r = spawnSync(s.logoutCmd[0]!, s.logoutCmd.slice(1) as string[], { + stdio: 'ignore', + timeout: 10_000, + }) + if (r.status === 0) { + return { ok: true } + } + if (r.error) { + return { ok: false, reason: r.error.message } + } + return { ok: false, reason: `exit code ${r.status}` } +} + +function rotateAll(skipIds: Set<string>): RotationResult { + const result: RotationResult = { + loggedOut: [], + failed: [], + skippedMissing: [], + } + for (const service of SERVICES) { + if (skipIds.has(service.id)) { + continue + } + if (!isOnPath(service.detectCmd[0]!)) { + if (!service.optional) { + result.skippedMissing.push(service.name) + } + continue + } + if (!isAuthenticated(service)) { + continue + } + const out = runLogout(service) + if (out.ok) { + result.loggedOut.push(service.name) + } else { + result.failed.push({ + service: service.name, + reason: out.reason ?? 
'unknown', + }) + } + } + return result +} + +// ── Output ────────────────────────────────────────────────────────── + +function reportSnoozeCleaned(cleaned: string[]): void { + for (const file of cleaned) { + logger.error(`${PREFIX} cleared expired snooze: ${file}`) + } +} + +function reportRotation(result: RotationResult): void { + const parts: string[] = [] + if (result.loggedOut.length > 0) { + parts.push( + `logged out of ${result.loggedOut.length} CLI(s): ${result.loggedOut.join(', ')}`, + ) + } + if (result.failed.length > 0) { + const failed = result.failed + .map(f => `${f.service} (${f.reason})`) + .join(', ') + parts.push(`logout failed: ${failed}`) + } + if (result.skippedMissing.length > 0) { + parts.push(`expected-but-missing: ${result.skippedMissing.join(', ')}`) + } + if (parts.length === 0) { + return + } + logger.error(`${PREFIX} ${parts.join('; ')}`) + logger.error( + ` Snooze for next 4h: date -ud "+4 hours" +"%Y-%m-%dT%H:%M:%SZ" > .claude/auth-rotation.snooze`, + ) +} + +// ── Main ──────────────────────────────────────────────────────────── + +async function run(): Promise<void> { + if (process.env['CI']) { + return + } + if (process.env['SOCKET_AUTH_ROTATION_DISABLED']) { + return + } + const snooze = await checkSnoozes() + reportSnoozeCleaned(snooze.cleaned) + if (snooze.active) { + return + } + if (withinThrottle()) { + return + } + const skipIds = loadSkipIds() + const result = rotateAll(skipIds) + reportRotation(result) + touchStateFile() +} + +function main(): void { + // Drain stdin so Node doesn't keep us alive waiting on the Stop hook's + // JSON payload (we don't read its contents). 
+ process.stdin.resume() + process.stdin.on('data', () => {}) + process.stdin.on('end', () => { + run() + .catch(e => { + logger.error(`${PREFIX} unexpected error: ${(e as Error).message}`) + }) + .finally(() => { + process.exit(0) + }) + }) + if (process.stdin.readable === false) { + run() + .catch(e => { + logger.error(`${PREFIX} unexpected error: ${(e as Error).message}`) + }) + .finally(() => { + process.exit(0) + }) + } +} + +main() diff --git a/.claude/hooks/auth-rotation-reminder/package.json b/.claude/hooks/auth-rotation-reminder/package.json new file mode 100644 index 000000000..38fa6a496 --- /dev/null +++ b/.claude/hooks/auth-rotation-reminder/package.json @@ -0,0 +1,18 @@ +{ + "name": "hook-auth-rotation-reminder", + "private": true, + "type": "module", + "main": "./index.mts", + "exports": { + ".": "./index.mts" + }, + "dependencies": { + "@socketsecurity/lib": "catalog:" + }, + "devDependencies": { + "@types/node": "catalog:" + }, + "scripts": { + "test": "node --test test/*.test.mts" + } +} diff --git a/.claude/hooks/auth-rotation-reminder/services.mts b/.claude/hooks/auth-rotation-reminder/services.mts new file mode 100644 index 000000000..f0168d091 --- /dev/null +++ b/.claude/hooks/auth-rotation-reminder/services.mts @@ -0,0 +1,142 @@ +// Service catalog for auth-rotation-reminder. +// +// Each entry tells the hook how to detect whether a CLI is currently +// authenticated and how to log it out. `optional: true` means the hook +// silently skips the service if the binary isn't on PATH (most are +// optional — most devs have a subset of these installed). +// +// Detection commands MUST exit 0 when authenticated and non-zero when +// not. Output goes to /dev/null; the hook reads only the exit code. +// +// Logout commands run unconditionally when the hook is in auto-logout +// mode. They should be idempotent — re-running them on an already +// logged-out CLI is fine. 
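
The exit-code-only detection contract above can be demonstrated with a minimal probe. This is an illustrative sketch, not the hook's implementation (the real version in `index.mts` also applies a timeout and reads commands from the service catalog):

```typescript
import { spawnSync } from 'node:child_process'

// Illustrative exit-code-only probe: output is fully discarded and the
// auth state is inferred purely from the exit status, so no token or
// account name ever reaches the hook's logs.
function probe(cmd: readonly string[]): boolean {
  const r = spawnSync(cmd[0]!, cmd.slice(1), { stdio: 'ignore' })
  return r.status === 0
}

// POSIX `true` exits 0 and `false` exits 1, standing in here for an
// authenticated vs. unauthenticated CLI.
console.log(probe(['true'])) // true
console.log(probe(['false'])) // false
```

This is why detection commands MUST follow the "exit 0 when authenticated" rule: the probe has no other signal to work with.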
+ +export interface Service { + // Stable id used in skip-list files and error messages. Never rename + // without a deprecation cycle — devs encode these in their personal + // `.skip` lists. + id: string + // Display name for output. + name: string + // Command + args that exit 0 if logged in, non-zero otherwise. + detectCmd: readonly string[] + // Command + args that performs the logout. Must be idempotent. + logoutCmd: readonly string[] + // Skip silently when the binary isn't on PATH. False means the + // hook reports "binary missing" as a finding (rare — only for + // first-class fleet CLIs we expect every dev to have). + optional: boolean + // Optional human-readable doc URL surfaced when the hook reports the + // logout. Empty when no canonical doc page exists. + docUrl?: string +} + +// Default skip-list seeds. Devs can extend via the per-user +// `~/.claude/hooks/auth-rotation/services-skip` (one id per line) +// or per-repo `.claude/auth-rotation.services-skip` files. +// +// `gh` is seeded because Claude Code itself uses `gh` for `gh pr edit` +// etc. — auto-revoking it mid-session would break the agent. +export const DEFAULT_SKIP_IDS = ['gh'] as const + +export const SERVICES: readonly Service[] = [ + { + id: 'npm', + name: 'npm', + detectCmd: ['npm', 'whoami'], + logoutCmd: ['npm', 'logout'], + optional: true, + docUrl: 'https://docs.npmjs.com/cli/v11/commands/npm-logout', + }, + { + id: 'pnpm', + name: 'pnpm', + detectCmd: ['pnpm', 'whoami'], + logoutCmd: ['pnpm', 'logout'], + optional: false, + docUrl: 'https://pnpm.io/id/11.x/cli/logout', + }, + { + id: 'yarn', + name: 'yarn', + // Yarn Berry's logout lives under `npm` namespace; Yarn Classic's + // is bare. We try Berry first (the modern default), fall back to + // Classic. Detection is the same: `npm whoami` from inside a + // yarn-managed registry. 
In practice, though, Yarn exposes no + // portable whoami (grepping `~/.yarnrc.yml` for an auth token proved + // too fragile to ship), so detection is presence-only and we rely on + // the idempotent logout (it clears nothing if nothing's there). + detectCmd: ['yarn', '--version'], + logoutCmd: ['yarn', 'npm', 'logout'], + optional: true, + }, + { + id: 'gcloud', + name: 'gcloud', + // `gcloud auth list` always exits 0; we check whether any non-empty + // active account is reported. Wrapped in sh -c so the pipe to grep works. + detectCmd: [ + 'sh', + '-c', + 'gcloud auth list --filter=status:ACTIVE --format="value(account)" 2>/dev/null | grep -q .', + ], + logoutCmd: ['gcloud', 'auth', 'revoke', '--all', '--quiet'], + optional: true, + docUrl: 'https://cloud.google.com/sdk/gcloud/reference/auth/revoke', + }, + { + id: 'aws-sso', + name: 'aws (sso)', + // `aws sts get-caller-identity` succeeds when authenticated. + // sts is the universal probe across all AWS auth flavors. + detectCmd: ['aws', 'sts', 'get-caller-identity'], + // `aws sso logout` only clears SSO cache. For non-SSO creds, the + // dev would have to remove `~/.aws/credentials` themselves; we + // don't touch that file because it might hold long-lived keys + // intentionally. SSO-only is the conservative default. + logoutCmd: ['aws', 'sso', 'logout'], + optional: true, + }, + { + id: 'gh', + name: 'gh (GitHub CLI)', + detectCmd: ['gh', 'auth', 'status'], + logoutCmd: ['gh', 'auth', 'logout', '--hostname', 'github.com'], + optional: true, + docUrl: 'https://cli.github.com/manual/gh_auth_logout', + }, + { + id: 'vault', + name: 'vault', + detectCmd: ['vault', 'token', 'lookup'], + // `token revoke -self` revokes the active token; safe to re-run, + // and re-auth is a `vault login` away next session. + logoutCmd: ['vault', 'token', 'revoke', '-self'], + optional: true, + }, + { + id: 'docker', + name: 'docker', + // No portable "am I logged in" probe; approximate by grepping + // `docker info` output for a `Username:` line. 
+ detectCmd: [ + 'sh', + '-c', + 'docker info 2>/dev/null | grep -q "^ Username:"', + ], + // Without a registry arg, `docker logout` clears the default index. + logoutCmd: ['docker', 'logout'], + optional: true, + }, + { + id: 'socket', + name: 'socket', + // `socket whoami` (when present in the cli) is the canonical probe. + // The cli emits exit 0 when authenticated. + detectCmd: ['socket', 'whoami'], + // `socket logout` clears the local API token from settings. + logoutCmd: ['socket', 'logout'], + optional: true, + }, +] as const diff --git a/.claude/hooks/auth-rotation-reminder/test/auth-rotation-reminder.test.mts b/.claude/hooks/auth-rotation-reminder/test/auth-rotation-reminder.test.mts new file mode 100644 index 000000000..4582a1f9a --- /dev/null +++ b/.claude/hooks/auth-rotation-reminder/test/auth-rotation-reminder.test.mts @@ -0,0 +1,162 @@ +import { spawn } from 'node:child_process' +import { existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync } from 'node:fs' +import { tmpdir } from 'node:os' +import path from 'node:path' +import { fileURLToPath } from 'node:url' +import { test } from 'node:test' +import assert from 'node:assert/strict' + +const __dirname = path.dirname(fileURLToPath(import.meta.url)) +const HOOK = path.resolve(__dirname, '..', 'index.mts') + +interface Env { + [key: string]: string +} + +function runHook(opts: { + cwd?: string + env?: Env +} = {}): Promise<{ code: number; stderr: string }> { + return new Promise((resolve, reject) => { + const child = spawn(process.execPath, [HOOK], { + cwd: opts.cwd ?? process.cwd(), + stdio: ['pipe', 'ignore', 'pipe'], + env: { + // Inherit the parent env; each test overrides CI and the + // disable flags explicitly to select the code path under + // test, so most runs take the early-exit path. 
+ ...process.env, + ...opts.env, + }, + }) + let stderr = '' + child.stderr.on('data', d => { + stderr += d.toString() + }) + child.on('error', reject) + child.on('exit', code => { + resolve({ code: code ?? -1, stderr }) + }) + child.stdin.end('{}\n') + }) +} + +function makeRepo(): string { + const dir = mkdtempSync(path.join(tmpdir(), 'auth-rotation-test-')) + mkdirSync(path.join(dir, '.claude'), { recursive: true }) + return dir +} + +test('exits 0 silently when CI env var is set', async () => { + const repo = makeRepo() + try { + const { code, stderr } = await runHook({ + cwd: repo, + env: { CI: '1' }, + }) + assert.equal(code, 0) + assert.equal(stderr, '', `expected no output in CI; got: ${stderr}`) + } finally { + rmSync(repo, { recursive: true, force: true }) + } +}) + +test('exits 0 silently when SOCKET_AUTH_ROTATION_DISABLED is set', async () => { + const repo = makeRepo() + try { + const { code, stderr } = await runHook({ + cwd: repo, + env: { + CI: '', + SOCKET_AUTH_ROTATION_DISABLED: '1', + }, + }) + assert.equal(code, 0) + assert.equal(stderr, '') + } finally { + rmSync(repo, { recursive: true, force: true }) + } +}) + +test('honors a project-local snooze with future expiry', async () => { + const repo = makeRepo() + try { + const expiry = new Date(Date.now() + 60 * 60 * 1000).toISOString() + writeFileSync(path.join(repo, '.claude', 'auth-rotation.snooze'), expiry) + const { code, stderr } = await runHook({ + cwd: repo, + env: { CI: '' }, + }) + assert.equal(code, 0) + // Hook should NOT report cleanup of an unexpired snooze. 
+ assert.ok( + !stderr.includes('cleared expired snooze'), + `hook cleared a fresh snooze: ${stderr}`, + ) + } finally { + rmSync(repo, { recursive: true, force: true }) + } +}) + +test('auto-cleans expired project-local snooze and proceeds', async () => { + const repo = makeRepo() + const snoozeFile = path.join(repo, '.claude', 'auth-rotation.snooze') + try { + const expiry = new Date(Date.now() - 60 * 60 * 1000).toISOString() + writeFileSync(snoozeFile, expiry) + const { code, stderr } = await runHook({ + cwd: repo, + // Clear CI so the hook gets past the CI guard and reaches + // snooze handling (which is what we're testing). + env: { CI: '' }, + }) + assert.equal(code, 0) + // We can't easily assert on snooze cleanup messaging without + // also forcing the hook to do real auth detection. The strong + // assertion is that the file is gone afterward. + assert.ok( + !existsSync(snoozeFile), + 'expired snooze file should have been deleted', + ) + } finally { + rmSync(repo, { recursive: true, force: true }) + } +}) + +test('auto-cleans malformed snooze content', async () => { + const repo = makeRepo() + const snoozeFile = path.join(repo, '.claude', 'auth-rotation.snooze') + try { + writeFileSync(snoozeFile, 'not-an-iso-timestamp\n') + const { code } = await runHook({ + cwd: repo, + env: { CI: '' }, + }) + assert.equal(code, 0) + assert.ok( + !existsSync(snoozeFile), + 'malformed snooze file should have been deleted', + ) + } finally { + rmSync(repo, { recursive: true, force: true }) + } +}) + +test('auto-cleans empty (legacy) snooze file', async () => { + const repo = makeRepo() + const snoozeFile = path.join(repo, '.claude', 'auth-rotation.snooze') + try { + writeFileSync(snoozeFile, '') + const { code } = await runHook({ + cwd: repo, + env: { CI: '' }, + }) + assert.equal(code, 0) + assert.ok( + !existsSync(snoozeFile), + 'empty (legacy) snooze file should have been deleted', + ) + } finally { + rmSync(repo, { recursive: true, force: true }) + } +}) diff --git 
a/.claude/hooks/token-hygiene/tsconfig.json b/.claude/hooks/auth-rotation-reminder/tsconfig.json similarity index 100% rename from .claude/hooks/token-hygiene/tsconfig.json rename to .claude/hooks/auth-rotation-reminder/tsconfig.json diff --git a/.claude/hooks/check-new-deps/README.md b/.claude/hooks/check-new-deps/README.md index 25fb11285..5be7f3a68 100644 --- a/.claude/hooks/check-new-deps/README.md +++ b/.claude/hooks/check-new-deps/README.md @@ -8,9 +8,10 @@ When Claude edits a file like `package.json`, `requirements.txt`, `Cargo.toml`, 1. **Detects the file type** and extracts dependency names from the content 2. **Diffs against the old content** (for edits) so only *newly added* deps are checked -3. **Queries the Socket.dev API** to check for malware -4. **Blocks the edit** (exit code 2) if malware is detected -5. **Allows** (exit code 0) if everything is clean or the file isn't a manifest +3. **Queries the Socket.dev API** to check for malware and critical security alerts +4. **Blocks the edit** (exit code 2) if malware or critical alerts are found +5. **Warns** (but allows) if a package has a low quality score +6. 
**Allows** (exit code 0) if everything is clean or the file isn't a manifest ## How it works @@ -29,8 +30,11 @@ Build Package URLs (PURLs) for each dep │ ▼ Call sdk.checkMalware(components) + - ≤5 deps: parallel firewall API (fast, full data) + - >5 deps: batch PURL API (efficient) │ - ├── Malware detected → EXIT 2 (blocked) + ├── Malware/critical alert → EXIT 2 (blocked) + ├── Low score → warn, EXIT 0 (allowed) └── Clean → EXIT 0 (allowed) ``` diff --git a/.claude/hooks/check-new-deps/index.mts b/.claude/hooks/check-new-deps/index.mts index b143ecbb1..af0c5c3d6 100644 --- a/.claude/hooks/check-new-deps/index.mts +++ b/.claude/hooks/check-new-deps/index.mts @@ -28,6 +28,7 @@ import type { PackageURL } from '@socketregistry/packageurl-js' import { SOCKET_PUBLIC_API_TOKEN, } from '@socketsecurity/lib/constants/socket' +import { errorMessage } from '@socketsecurity/lib/errors' import { getDefaultLogger } from '@socketsecurity/lib/logger' import { normalizePath, @@ -35,12 +36,6 @@ import { import { SocketSdk } from '@socketsecurity/sdk' import type { MalwareCheckPackage } from '@socketsecurity/sdk' -// Hook runs standalone with only @socketsecurity/* deps, so this -// one-liner lives here instead of importing a shared helper. -function errorMessage(error: unknown): string { - return error instanceof Error ? error.message : String(error) -} - const logger = getDefaultLogger() // Per-request timeout (ms) to avoid blocking the hook on slow responses. @@ -159,23 +154,46 @@ const extractors: Record = { (m): Dep => ({ type: 'cargo', name: m[1] }) ), 'Cargo.toml': (content: string): Dep[] => { - // Rust: only extract from [dependencies], [dev-dependencies], [build-dependencies] sections. - // Skip [package], [lib], [bin], [workspace], [profile] metadata sections. + // Rust: extract crate names from dep lines. 
+ // + // Two-mode strategy because the hook receives either a full + // Cargo.toml (Write) or a fragment (Edit's new_string, often just + // the added line with no section header): + // + // Full file — scan only [dependencies] / [dev-dependencies] / + // [build-dependencies] (incl. target-specific + // [target.*.dependencies] via the `.` suffix) + // and skip [package], [features], [profile], etc. + // Fragment — no section headers at all → treat the whole + // content as an implicit [dependencies] body and + // match any `name = "..."` or `name = { version = "..." }`. + // + // The lineRe requires the value to look like a version spec + // (string or table with a `version` key), so `[features]`-style + // `key = ["derive"]` array values don't match even in fragment mode. const deps: Dep[] = [] - const depSectionRe = /^\[(?:(?:dev-|build-)?dependencies(?:\.[^\]]+)?)\]\s*$/gm + const depSectionRe = /^\[(?:(?:dev-|build-)?dependencies(?:\.[^\]]+)?|target\.[^\]]+\.(?:dev-|build-)?dependencies(?:\.[^\]]+)?)\]\s*$/gm const anySectionRe = /^\[/gm + const lineRe = /^(\w[\w-]*)\s*=\s*(?:\{[^}]*version\s*=\s*"[^"]*"|\s*"[^"]*")/gm + const push = (section: string) => { + let m + while ((m = lineRe.exec(section)) !== null) { + deps.push({ type: 'cargo', name: m[1] }) + } + lineRe.lastIndex = 0 + } + const hasAnySection = /^\[/m.test(content) + if (!hasAnySection) { + push(content) + return deps + } let sectionMatch while ((sectionMatch = depSectionRe.exec(content)) !== null) { const sectionStart = sectionMatch.index + sectionMatch[0].length anySectionRe.lastIndex = sectionStart const nextSection = anySectionRe.exec(content) const sectionEnd = nextSection ? 
nextSection.index : content.length - const sectionText = content.slice(sectionStart, sectionEnd) - const lineRe = /^(\w[\w-]*)\s*=\s*(?:\{[^}]*version\s*=\s*"[^"]*"|\s*"[^"]*")/gm - let m - while ((m = lineRe.exec(sectionText)) !== null) { - deps.push({ type: 'cargo', name: m[1] }) - } + push(content.slice(sectionStart, sectionEnd)) } return deps }, @@ -280,21 +298,6 @@ const extractors: Record = { 'yarn.lock': extractNpmLockfile, } -// --- main (only when executed directly, not imported) --- - -if (fileURLToPath(import.meta.url) === path.resolve(process.argv[1])) { - // Read the full JSON blob from stdin (piped by Claude Code). - let input = '' - for await (const chunk of process.stdin) input += chunk - const hook: HookInput = JSON.parse(input) - - if (hook.tool_name !== 'Edit' && hook.tool_name !== 'Write') { - process.exitCode = 0 - } else { - process.exitCode = await check(hook) - } -} - // --- core --- // Orchestrates the full check: extract deps, diff against old, query API. @@ -728,3 +731,26 @@ export { extractTerraform, findExtractor, } + +// --- main (only when executed directly, not imported) --- +// +// Kept at the bottom because the module uses top-level await +// (`for await (const chunk of process.stdin)`) to read the hook payload. +// Top-level await suspends module evaluation at the suspension point, so +// any `const` declared AFTER the suspending block is still in the TDZ +// when the awaited work calls back into the module (e.g. extractNpm → +// PACKAGE_JSON_METADATA_KEYS). Placing main last guarantees every +// module-level declaration is initialized before main runs. + +if (fileURLToPath(import.meta.url) === path.resolve(process.argv[1])) { + // Read the full JSON blob from stdin (piped by Claude Code). 
+ let input = '' + for await (const chunk of process.stdin) input += chunk + const hook: HookInput = JSON.parse(input) + + if (hook.tool_name !== 'Edit' && hook.tool_name !== 'Write') { + process.exitCode = 0 + } else { + process.exitCode = await check(hook) + } +} diff --git a/.claude/hooks/check-new-deps/package.json b/.claude/hooks/check-new-deps/package.json index 96d04649b..11f3f9a79 100644 --- a/.claude/hooks/check-new-deps/package.json +++ b/.claude/hooks/check-new-deps/package.json @@ -1,5 +1,5 @@ { - "name": "@socketsecurity/hook-check-new-deps", + "name": "hook-check-new-deps", "private": true, "type": "module", "main": "./index.mts", diff --git a/.claude/hooks/logger-guard/README.md b/.claude/hooks/logger-guard/README.md new file mode 100644 index 000000000..55ba3ece7 --- /dev/null +++ b/.claude/hooks/logger-guard/README.md @@ -0,0 +1,54 @@ +# logger-guard + +Claude Code `PreToolUse` hook that blocks `Edit`/`Write` tool calls +introducing direct stream writes (`process.stderr.write`, +`process.stdout.write`, `console.log/error/warn/info/debug`) into +source files. + +## Why + +Source code uses `getDefaultLogger()` from `@socketsecurity/lib/logger` +for all output. Direct stream writes bypass: + +- Color/theme handling +- Indentation tracking +- Stream redirection in tests +- Counter increments used by spinners and progress bars + +so they produce inconsistent output that breaks layout-sensitive +workflows (spinner clears, footer rendering). + +## Scope + +- Only fires on `Edit` / `Write` tools. +- Only inspects files matching `*.{ts,mts,tsx,cts}` under repo + source. Hooks (`.claude/hooks/`), git-hooks (`.git-hooks/`), build + scripts (`scripts/`), tests, fixtures, and external/vendored code + are exempt. +- Lines containing `# socket-hook: allow logger` are exempt + (canonical opt-out). The bare `# socket-hook: allow` form also + works. 
+- Lines that look like documentation (`*` / `//` / `#` comments, + JSDoc tags, fully-backticked code spans) are exempt. + +## Suggested replacements + +| Direct call | Logger equivalent | +| --- | --- | +| `process.stderr.write(s)` | `logger.error(s)` | +| `process.stdout.write(s)` | `logger.info(s)` | +| `console.error(...)` | `logger.error(...)` | +| `console.warn(...)` | `logger.warn(...)` | +| `console.info(...)` | `logger.info(...)` | +| `console.debug(...)` | `logger.debug(...)` | +| `console.log(...)` | `logger.info(...)` | + +The hook surfaces the rewrite as a `Fix:` line per hit so the agent +can apply it directly. + +## Tests + +```bash +cd .claude/hooks/logger-guard +node --test test/*.test.mts +``` diff --git a/.claude/hooks/logger-guard/index.mts b/.claude/hooks/logger-guard/index.mts new file mode 100644 index 000000000..f7ad0f853 --- /dev/null +++ b/.claude/hooks/logger-guard/index.mts @@ -0,0 +1,261 @@ +#!/usr/bin/env node +// Claude Code PreToolUse hook — logger-guard. +// +// Blocks Edit/Write tool calls that would introduce direct calls to +// `process.stderr.write`, `process.stdout.write`, `console.log`, +// `console.error`, `console.warn`, `console.info`, or `console.debug` +// in source files. Exit code 2 makes Claude Code refuse the tool call +// so the diff never lands. The model sees the rejection reason on +// stderr and retries using the lib's logger. +// +// Why this rule: +// +// The fleet's source code uses `getDefaultLogger()` from +// `@socketsecurity/lib/logger` for every output. Direct stream writes +// bypass: +// - Color/theme handling +// - Indentation tracking +// - Stream redirection in tests +// - Counter increments used by spinners +// so they produce inconsistent output that breaks layout-sensitive +// workflows (spinner clears, footer rendering). +// +// Scope: +// +// - Fires only on `Edit` and `Write` tool calls. +// - Only inspects files under `src/` with .ts/.mts/.tsx/.cts +// extensions. 
Hooks (.claude/hooks/), git-hooks (.git-hooks/), +// scripts (scripts/), tests, fixtures, and external/ vendored code +// are exempt — see EXEMPT_PATH_PATTERNS. +// - Lines marked `# socket-hook: allow logger` are exempt (canonical +// opt-out marker, same as path-guard / token-guard / npx-guard). +// - Lines that look like documentation (comment lines, JSDoc tags, +// fully backticked code spans) are exempt — handled by the shared +// `looksLikeDocumentation` heuristic in `_helpers.mts`. +// +// The hook fails OPEN on its own bugs (exit 0 + stderr log) so a bad +// hook deploy can't brick the session. + +import process from 'node:process' + +// Files exempt from the rule. Comments explain why each is excluded. +const EXEMPT_PATH_PATTERNS: RegExp[] = [ + // Hook code itself runs early in the lifecycle and may need to log + // to stderr before the lib is fully resolvable. Treat hooks as + // "system code" with their own conventions. + /\.claude\/hooks\//, + // Git hooks (.git-hooks/_helpers.mts, pre-commit, etc.) run before + // workspace deps are guaranteed to be installed. + /\.git-hooks\//, + // Build scripts often produce direct stdout for human-readable + // build output (progress, summary). Migrate these case-by-case + // outside of this hook's scope. + /(^|\/)scripts\//, + // Test files commonly use console.* to capture / assert output. + /\.(test|spec)\.(m?[jt]s|tsx?|cts|mts)$/, + /(^|\/)tests?\//, + /(^|\/)fixtures\//, + // Vendored upstream sources — never modified for local conventions. + /(^|\/)external\//, + /(^|\/)vendor\//, + /(^|\/)upstream\//, + // The hook itself. 
+ /\.claude\/hooks\/logger-guard\//, +] + +const LOGGER_LEAK_RE = + /\b(process\.std(?:err|out)\.write|console\.(?:log|error|warn|info|debug))\s*\(/ + +const COMMENT_LINE_RE = /^\s*(\*|\/\/|#)/ +const JSDOC_TAG_RE = /@(example|param|returns?|see|link)\b/ +const SOCKET_HOOK_MARKER_RE = /#\s*socket-hook:\s*allow(?:\s+([\w-]+))?/ + +function isMarkerSuppressed(line: string): boolean { + const m = line.match(SOCKET_HOOK_MARKER_RE) + if (!m) { + return false + } + // No specific rule named → blanket allow. Targeted form must name + // 'logger' to suppress this scanner. + return !m[1] || m[1] === 'logger' +} + +function isInsideBackticks(line: string): boolean { + // Find every backtick-delimited span on the line and test if every + // logger-leak match sits within one. Conservative: any match outside + // a backtick span fails the check. + const spans: Array<[number, number]> = [] + for (let i = 0; i < line.length; i += 1) { + if (line[i] === '`') { + const end = line.indexOf('`', i + 1) + if (end < 0) { + break + } + spans.push([i, end]) + i = end + } + } + if (spans.length === 0) { + return false + } + const re = new RegExp(LOGGER_LEAK_RE.source, 'g') + let m: RegExpExecArray | null + while ((m = re.exec(line)) !== null) { + const start = m.index + const end = start + m[0].length + const inside = spans.some(([s, e]) => start > s && end <= e) + if (!inside) { + return false + } + } + return true +} + +function looksLikeDocumentation(line: string): boolean { + if (isMarkerSuppressed(line)) { + return true + } + if (COMMENT_LINE_RE.test(line)) { + return true + } + if (JSDOC_TAG_RE.test(line)) { + return true + } + if (isInsideBackticks(line)) { + return true + } + return false +} + +function suggestReplacement(line: string): string { + return line + .replace(/\bprocess\.stderr\.write\s*\(/g, 'logger.error(') + .replace(/\bprocess\.stdout\.write\s*\(/g, 'logger.info(') + .replace(/\bconsole\.error\s*\(/g, 'logger.error(') + .replace(/\bconsole\.warn\s*\(/g, 
'logger.warn(') + .replace(/\bconsole\.info\s*\(/g, 'logger.info(') + .replace(/\bconsole\.debug\s*\(/g, 'logger.debug(') + .replace(/\bconsole\.log\s*\(/g, 'logger.info(') +} + +interface Hit { + lineNumber: number + line: string + suggested: string +} + +function scan(source: string): Hit[] { + const hits: Hit[] = [] + const lines = source.split('\n') + for (let i = 0; i < lines.length; i += 1) { + const line = lines[i]! + if (!LOGGER_LEAK_RE.test(line)) { + continue + } + if (looksLikeDocumentation(line)) { + continue + } + hits.push({ + lineNumber: i + 1, + line, + suggested: suggestReplacement(line), + }) + } + return hits +} + +function isInScope(filePath: string): boolean { + if (!filePath) { + return false + } + if (!/\.(m?ts|tsx|cts)$/.test(filePath)) { + return false + } + for (const re of EXEMPT_PATH_PATTERNS) { + if (re.test(filePath)) { + return false + } + } + return true +} + +function readStdin(): Promise { + return new Promise(resolve => { + let buf = '' + process.stdin.setEncoding('utf8') + process.stdin.on('data', chunk => (buf += chunk)) + process.stdin.on('end', () => resolve(buf)) + }) +} + +interface ToolInput { + tool_name?: string + tool_input?: { + file_path?: string + new_string?: string + content?: string + } +} + +function emitBlock(filePath: string, hits: Hit[]): void { + // Hook itself logs to stderr (no lib import at module load — keep + // hooks self-contained for fast startup). The rule only applies to + // source code; this output is informational for the agent. 
+ const out: string[] = [] + out.push('') + out.push('[logger-guard] Blocked: direct stream write found') + out.push( + ' Use `getDefaultLogger()` from `@socketsecurity/lib/logger` instead.', + ) + out.push(` File: ${filePath}`) + for (const h of hits.slice(0, 3)) { + out.push(` Line ${h.lineNumber}: ${h.line.trim()}`) + out.push(` Fix: ${h.suggested.trim()}`) + } + if (hits.length > 3) { + out.push(` …and ${hits.length - 3} more.`) + } + out.push( + ' Opt-out for one line (rare): append `// # socket-hook: allow logger`.', + ) + out.push('') + process.stderr.write(out.join('\n')) +} + +async function main(): Promise { + const raw = await readStdin() + if (!raw) { + return + } + let payload: ToolInput + try { + payload = JSON.parse(raw) as ToolInput + } catch { + return + } + if (payload.tool_name !== 'Edit' && payload.tool_name !== 'Write') { + return + } + const filePath = payload.tool_input?.file_path ?? '' + if (!isInScope(filePath)) { + return + } + const source = + payload.tool_input?.new_string ?? payload.tool_input?.content ?? '' + if (!source) { + return + } + const hits = scan(source) + if (hits.length === 0) { + return + } + emitBlock(filePath, hits) + process.exitCode = 2 +} + +main().catch(e => { + // Fail open on hook bugs. 
+ process.stderr.write( + `[logger-guard] hook error (continuing): ${(e as Error).message}\n`, + ) +}) diff --git a/.claude/hooks/token-hygiene/package.json b/.claude/hooks/logger-guard/package.json similarity index 63% rename from .claude/hooks/token-hygiene/package.json rename to .claude/hooks/logger-guard/package.json index 7f282cd63..7bc467808 100644 --- a/.claude/hooks/token-hygiene/package.json +++ b/.claude/hooks/logger-guard/package.json @@ -1,16 +1,15 @@ { - "name": "@socketsecurity/hook-token-hygiene", + "name": "hook-logger-guard", "private": true, "type": "module", "main": "./index.mts", "exports": { ".": "./index.mts" }, + "devDependencies": { + "@types/node": "catalog:" + }, "scripts": { "test": "node --test test/*.test.mts" - }, - "devDependencies": { - "@socketsecurity/lib": "catalog:", - "@types/node": "24.9.2" } } diff --git a/.claude/hooks/logger-guard/test/logger-guard.test.mts b/.claude/hooks/logger-guard/test/logger-guard.test.mts new file mode 100644 index 000000000..a509b365f --- /dev/null +++ b/.claude/hooks/logger-guard/test/logger-guard.test.mts @@ -0,0 +1,171 @@ +import { spawn } from 'node:child_process' +import path from 'node:path' +import { fileURLToPath } from 'node:url' +import { test } from 'node:test' +import assert from 'node:assert/strict' + +const __dirname = path.dirname(fileURLToPath(import.meta.url)) +const HOOK = path.resolve(__dirname, '..', 'index.mts') + +interface Payload { + tool_name: 'Edit' | 'Write' | string + tool_input: { + file_path?: string + new_string?: string + content?: string + } +} + +function runHook(payload: Payload): Promise<{ code: number; stderr: string }> { + return new Promise((resolve, reject) => { + const child = spawn(process.execPath, [HOOK], { + stdio: ['pipe', 'ignore', 'pipe'], + }) + let stderr = '' + child.stderr.on('data', d => { + stderr += d.toString() + }) + child.on('error', reject) + child.on('exit', code => { + resolve({ code: code ?? 
-1, stderr }) + }) + child.stdin.end(JSON.stringify(payload)) + }) +} + +test('blocks console.log in src/ .ts files', async () => { + const { code, stderr } = await runHook({ + tool_name: 'Write', + tool_input: { + file_path: 'src/foo.ts', + content: 'export function foo() { console.log("hi") }', + }, + }) + assert.equal(code, 2, `expected exit 2; got ${code}; stderr=${stderr}`) + assert.ok(stderr.includes('logger-guard')) + assert.ok(stderr.includes('Fix:')) + assert.ok(stderr.includes('logger.info')) +}) + +test('blocks process.stderr.write in src/ .mts files', async () => { + const { code, stderr } = await runHook({ + tool_name: 'Edit', + tool_input: { + file_path: 'src/utils/output.mts', + new_string: 'process.stderr.write("oops\\n")', + }, + }) + assert.equal(code, 2) + assert.ok(stderr.includes('logger.error(')) +}) + +test('allows hooks themselves to use process.stderr.write', async () => { + const { code, stderr } = await runHook({ + tool_name: 'Edit', + tool_input: { + file_path: '.claude/hooks/some-hook/index.mts', + new_string: 'process.stderr.write("ok\\n")', + }, + }) + assert.equal(code, 0, `expected exit 0; got ${code}; stderr=${stderr}`) +}) + +test('allows scripts/ to use console.log', async () => { + const { code } = await runHook({ + tool_name: 'Edit', + tool_input: { + file_path: 'scripts/build.mts', + new_string: 'console.log("build complete")', + }, + }) + assert.equal(code, 0) +}) + +test('allows tests to use console.log', async () => { + const { code } = await runHook({ + tool_name: 'Edit', + tool_input: { + file_path: 'src/utils/foo.test.mts', + new_string: 'console.log("debug")', + }, + }) + assert.equal(code, 0) +}) + +test('respects # socket-hook: allow logger marker', async () => { + const { code } = await runHook({ + tool_name: 'Edit', + tool_input: { + file_path: 'src/foo.ts', + new_string: + 'const x = 1; console.error("legacy") // # socket-hook: allow logger', + }, + }) + assert.equal(code, 0) +}) + +test('respects bare # 
socket-hook: allow marker', async () => { + const { code } = await runHook({ + tool_name: 'Edit', + tool_input: { + file_path: 'src/foo.ts', + new_string: 'console.warn("a") // # socket-hook: allow', + }, + }) + assert.equal(code, 0) +}) + +test('does not flag JSDoc examples', async () => { + const { code } = await runHook({ + tool_name: 'Write', + tool_input: { + file_path: 'src/foo.ts', + content: + '/**\n * @example\n * console.log("usage")\n */\nexport const foo = 1', + }, + }) + assert.equal(code, 0) +}) + +test('does not flag comment lines', async () => { + const { code } = await runHook({ + tool_name: 'Edit', + tool_input: { + file_path: 'src/foo.ts', + new_string: '// previously: console.log("debug")', + }, + }) + assert.equal(code, 0) +}) + +test('does not flag content fully inside a single backtick span', async () => { + const { code } = await runHook({ + tool_name: 'Edit', + tool_input: { + file_path: 'src/foo.ts', + // Single-line markdown-style backtick span — the inner content + // is documentation, not real code. 
+ new_string: 'const note = `use logger.info() not console.log()`', + }, + }) + assert.equal(code, 0) +}) + +test('does not run on non-Edit/Write tools', async () => { + const { code } = await runHook({ + tool_name: 'Bash', + tool_input: { content: 'console.log("nope")' }, + }) + assert.equal(code, 0) +}) + +test('does not run on .js files (out of scope)', async () => { + const { code } = await runHook({ + tool_name: 'Edit', + tool_input: { + file_path: 'src/foo.js', + new_string: 'console.log("legacy")', + }, + }) + assert.equal(code, 0) +}) diff --git a/.claude/hooks/logger-guard/tsconfig.json b/.claude/hooks/logger-guard/tsconfig.json new file mode 100644 index 000000000..53c5c8475 --- /dev/null +++ b/.claude/hooks/logger-guard/tsconfig.json @@ -0,0 +1,15 @@ +{ + "compilerOptions": { + "declarationMap": false, + "erasableSyntaxOnly": true, + "module": "nodenext", + "moduleResolution": "nodenext", + "noEmit": true, + "rewriteRelativeImportExtensions": true, + "skipLibCheck": true, + "sourceMap": false, + "strict": true, + "target": "esnext", + "verbatimModuleSyntax": true + } +} diff --git a/.claude/hooks/path-guard/README.md b/.claude/hooks/path-guard/README.md new file mode 100644 index 000000000..2dee9c196 --- /dev/null +++ b/.claude/hooks/path-guard/README.md @@ -0,0 +1,66 @@ +# path-guard + +Claude Code `PreToolUse` hook that refuses `Edit`/`Write` tool calls that would *construct* a multi-segment build/output path inline in a `.mts` or `.cts` file. Mandatory across the Socket fleet — every repo ships this file byte-for-byte via `scripts/sync-scaffolding.mts`. + +**Mantra: 1 path, 1 reference.** + +Construct a path *once* in the canonical `paths.mts` (or a build-infra helper); reference the computed value everywhere else. 
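A minimal sketch of the pattern the mantra prescribes (the helper name `finalBinaryDir` and its signature are hypothetical, not the fleet's real `paths.mts` exports):

```typescript
import path from 'node:path'

// Hypothetical: the kind of helper a package's scripts/paths.mts
// would export. The multi-segment construction happens exactly once.
export function finalBinaryDir(packageRoot: string, mode: 'dev' | 'prod'): string {
  return path.join(packageRoot, 'build', mode, 'out', 'Final')
}
```

Every consumer then writes `import { finalBinaryDir } from './paths.mts'` and uses the computed value, instead of re-joining the segments inline (which is exactly what the hook blocks).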
+ +## What it blocks + +| Rule | Example | Fix | +|------|---------|-----| +| **A** — Multi-stage path constructed inline | `path.join(PKG, 'build', mode, 'out', 'Final', name)` | Construct in the package's `scripts/paths.mts` (or use `getFinalBinaryPath` from `build-infra/lib/paths`); import the computed value here | +| **B** — Cross-package path traversal | `path.join(PKG, '..', 'lief-builder', 'build', ...)` | Add `lief-builder: workspace:*` as a dep; import its `paths.mts` via the workspace `exports` field | + +The hook fires on `Edit` and `Write` tool calls when the target path ends in `.mts` or `.cts`. Other extensions (`.ts`, `.mjs`, `.js`, `.yml`, `.json`, `.md`) pass through — TS path code lives in `.mts` per CLAUDE.md, and other file types are covered by the `scripts/check-paths.mts` gate at commit time. + +## What it allows + +- Edits to a `paths.mts` (canonical constructor — every package's source of truth). +- Edits to `scripts/check-paths.mts` (the gate, which legitimately enumerates patterns). +- Edits to this hook's own files (the test suite has to enumerate the same patterns). +- Edits to `scripts/check-consistency.mts` (existing path-scanning gate). +- `path.join` calls with a single stage segment (e.g. `path.join(packageRoot, 'build', 'temp')`) — that's a one-off helper path, not a multi-stage build output. +- `path.join` calls with no stage segments at all (most general-purpose joins). +- Any string concatenation that doesn't go through `path.join` — the hook is regex-based and intentionally narrow; the gate runs a deeper scan at commit time. + +## Stage segments the hook recognizes + +These come from `build-infra/lib/constants.mts` `BUILD_STAGES` plus the lowercase directory-name siblings used by some builders: + +`Final`, `Release`, `Stripped`, `Compressed`, `Optimized`, `Synced`, `wasm`, `downloaded` + +Two or more in the same `path.join` call (or one stage + one of `'build'`/`'out'` + one mode `'dev'`/`'prod'`) triggers Rule A. 
+ +## Known sibling packages (for Rule B) + +The hook recognizes Rule B traversals only when the next segment after `..` is a known fleet package name: + +`binflate`, `binject`, `binpress`, `bin-infra`, `build-infra`, `codet5-models-builder`, `curl-builder`, `iocraft-builder`, `ink-builder`, `libpq-builder`, `lief-builder`, `minilm-builder`, `models`, `napi-go`, `node-smol-builder`, `onnxruntime-builder`, `opentui-builder`, `stubs-builder`, `ultraviolet-builder`, `yoga-layout-builder` + +When a new package joins the workspace, add it here. + +## Control flow + +The hook reads the tool-use payload from stdin, type-checks `tool_name === 'Edit'` or `'Write'`, filters to `.mts`/`.cts` files, and runs `check(source)`. Any rule violation `throw`s a typed `BlockError`; a single top-level `try/catch` in `main()` writes the block message to stderr and sets `process.exitCode = 2`. + +Hook bugs fail **open** — a crash in the hook writes a log line and returns exit 0 so legitimate work isn't blocked on a bad deploy. The companion `scripts/check-paths.mts` gate runs a thorough whole-repo scan at `pnpm check` time, catching anything the hook misses. + +## Testing + +```bash +pnpm --filter hook-path-guard test +``` + +Adding a new detection pattern: update `STAGE_SEGMENTS` (or `KNOWN_SIBLING_PACKAGES`) in `index.mts`, add a positive and negative test in `test/path-guard.test.mts`. + +## Updating across the fleet + +This file is in `IDENTICAL_FILES` in `scripts/sync-scaffolding.mts` (in `socket-repo-template`). After editing, run from `socket-repo-template`: + +```bash +node scripts/sync-scaffolding.mts --all --fix +``` + +to propagate the change to every fleet repo. diff --git a/.claude/hooks/path-guard/index.mts b/.claude/hooks/path-guard/index.mts new file mode 100644 index 000000000..ced9fcfc1 --- /dev/null +++ b/.claude/hooks/path-guard/index.mts @@ -0,0 +1,339 @@ +#!/usr/bin/env node +// Claude Code PreToolUse hook — path-guard firewall. +// +// Mantra: 1 path, 1 reference. 
+//
+// Blocks Edit/Write tool calls that would *construct* a multi-segment
+// build/output path inline in a `.mts` or `.cts` file, instead of
+// importing the constructed value from the canonical `paths.mts` (or a
+// build-infra helper). This fires BEFORE the write lands; exit code 2
+// makes Claude Code refuse the tool call so the diff never touches the
+// repo. The model sees the rejection reason on stderr and retries with
+// an import-based approach.
+//
+// What the hook checks (subset of the gate's rules — diff-local only):
+//
+// Rule A — Multi-stage path construction: a `path.join(...)` call or
+// string-template that stitches together two or more "stage" segments
+// like `'Final'`, `'Release'`, `'Stripped'`, `'Compressed'`,
+// `'Optimized'`, `'Synced'`, `'wasm'`, `'downloaded'` together with
+// `'build'` / `'out'` / a mode (`'dev'`/`'prod'`) or platform-arch.
+// Outside a `paths.mts` file, this is always a violation: the
+// construction belongs in a helper, every consumer imports the
+// computed value.
+//
+// Rule B — Cross-package traversal: `path.join(*, '..', '<sibling>',
+// 'build', ...)` reaches into a sibling's build output
+// without going through its `exports`. Forces consumers to declare a
+// workspace dep and import the sibling's `paths.mts`. The R28 yoga/
+// ink bug — ink hand-building yoga's wasm path and missing the
+// `wasm/` segment — is exactly the failure mode this prevents.
+//
+// What the hook does NOT check (the gate handles repo-wide concerns):
+//
+// Rule C — workflow YAML repetition (gate scans .yml files).
+// Rule D — comment-encoded paths (gate scans comments + JSDoc).
+// Rule F — same path reconstructed in multiple files (needs whole-
+// repo state).
+// Rule G — Makefile / Dockerfile / shell-script paths (different
+// tool, gate covers).
+//
+// Scope:
+//
+// - Fires only on `Edit` and `Write` tool calls.
+// - Skips files NOT ending in `.mts` or `.cts`.
TS path code lives +// there; .ts/.mjs/.js sources in `additions/` have different +// constraints per CLAUDE.md. +// - Skips when the target itself is a `paths.mts` (canonical +// constructor), the gate (`scripts/check-paths.mts`), or this hook +// — those files legitimately enumerate stage segments. +// +// Control flow uses a `BlockError` thrown from check helpers so every +// short-circuit path goes through a single `process.exitCode = 2` drop +// at the top-level catch — no scattered `process.exit(2)` that can race +// with buffered stderr. The hook fails OPEN on its own bugs (exit 0 + +// log) so a bad deploy of the hook can't brick the session. + +import process from 'node:process' + +import { + BUILD_ROOT_SEGMENTS, + KNOWN_SIBLING_PACKAGES, + MODE_SEGMENTS, + STAGE_SEGMENTS, +} from './segments.mts' + +// File-path patterns that are exempt from the hook entirely. Edits to +// these files legitimately need to enumerate path segments. +const EXEMPT_FILE_PATTERNS: RegExp[] = [ + // Any paths.mts is the canonical constructor. + /(^|\/)paths\.(mts|cts)$/, + // The gate itself and this hook — both enumerate the patterns to + // detect them. + /scripts\/check-paths\.mts$/, + /\.claude\/hooks\/path-guard\/index\.(mts|cts)$/, + /\.claude\/hooks\/path-guard\/test\//, + // Existing path-scanning gates that intentionally enumerate. + /scripts\/check-consistency\.mts$/, +] + +class BlockError extends Error { + public readonly rule: string + public readonly suggestion: string + public readonly snippet: string + constructor(rule: string, suggestion: string, snippet: string) { + super(rule) + this.name = 'BlockError' + this.rule = rule + this.suggestion = suggestion + this.snippet = snippet.slice(0, 240) + (snippet.length > 240 ? 
'…' : '') + } +} + +const stdin = (): Promise => + new Promise(resolve => { + let buf = '' + process.stdin.setEncoding('utf8') + process.stdin.on('data', chunk => (buf += chunk)) + process.stdin.on('end', () => resolve(buf)) + }) + +type ToolInput = { + tool_name?: string + tool_input?: { + file_path?: string + new_string?: string + content?: string + } +} + +const isInScope = (filePath: string): boolean => { + if (!filePath) { + return false + } + // Only inspect TypeScript-Module / CommonJS-Module sources. Per + // the user's directive, allowlist by extension. + if (!filePath.endsWith('.mts') && !filePath.endsWith('.cts')) { + return false + } + return !EXEMPT_FILE_PATTERNS.some(re => re.test(filePath)) +} + +// Extract every `path.join(...)` and `path.resolve(...)` call from +// the diff and return its argument substring. Uses paren-balancing so +// deeply nested arguments like `path.join(getDir(child(x)), 'Final')` +// are captured correctly — a regex-only approach silently missed any +// argument with 2+ levels of nested parentheses. +const extractPathCalls = ( + source: string, +): Array<{ snippet: string; literals: string[] }> => { + const calls: Array<{ snippet: string; literals: string[] }> = [] + const callRe = /\bpath\.(?:join|resolve)\s*\(/g + let m: RegExpExecArray | null + while ((m = callRe.exec(source)) !== null) { + const callStart = m.index + const argsStart = callRe.lastIndex + let depth = 1 + let i = argsStart + let inString: '"' | "'" | '`' | null = null + while (i < source.length && depth > 0) { + const ch = source[i]! 
+ if (inString) { + if (ch === '\\') { + i += 2 + continue + } + if (ch === inString) { + inString = null + } + } else { + if (ch === '"' || ch === "'" || ch === '`') { + inString = ch + } else if (ch === '(') { + depth += 1 + } else if (ch === ')') { + depth -= 1 + if (depth === 0) { + break + } + } + } + i += 1 + } + if (depth !== 0) { + continue + } + const args = source.slice(argsStart, i) + const litRe = /(['"])((?:\\.|(?!\1)[^\\])*)\1/g + const literals: string[] = [] + let lit: RegExpExecArray | null + while ((lit = litRe.exec(args)) !== null) { + const value = lit[2] + if (value !== undefined) { + literals.push(value) + } + } + calls.push({ snippet: source.slice(callStart, i + 1), literals }) + callRe.lastIndex = i + 1 + } + return calls +} + +const checkRuleA = (calls: ReturnType): void => { + for (const call of calls) { + const stages = call.literals.filter(l => STAGE_SEGMENTS.has(l)) + const buildRoots = call.literals.filter(l => BUILD_ROOT_SEGMENTS.has(l)) + const modes = call.literals.filter(l => MODE_SEGMENTS.has(l)) + // Trigger if: 2+ stage segments OR (1 stage + 1 build-root + 1 mode). + // Both shapes indicate a hand-built build-output path. + const twoStages = stages.length >= 2 + const stagePlusContext = + stages.length >= 1 && buildRoots.length >= 1 && modes.length >= 1 + if (twoStages || stagePlusContext) { + throw new BlockError( + 'A — multi-stage path constructed inline', + 'Construct this path in the owning `paths.mts` (or a build-infra helper like `getFinalBinaryPath`) and import the computed value here. 1 path, 1 reference.', + call.snippet, + ) + } + } +} + +const checkRuleB = (calls: ReturnType): void => { + for (const call of calls) { + // A sibling package name *immediately after* a `..` literal (no + // path segment in between) plus build context elsewhere in the + // call indicates cross-package traversal. The previous "sticky + // sawDotDot" form fired falsely when '..' 
appeared early and an + // unrelated sibling-named segment appeared much later. + const hasBuildContext = call.literals.some( + l => BUILD_ROOT_SEGMENTS.has(l) || STAGE_SEGMENTS.has(l), + ) + if (!hasBuildContext) { + continue + } + for (let i = 0; i < call.literals.length - 1; i++) { + if ( + call.literals[i] === '..' && + KNOWN_SIBLING_PACKAGES.has(call.literals[i + 1]!) + ) { + const sibling = call.literals[i + 1]! + throw new BlockError( + 'B — cross-package path traversal', + `Don't reach into '${sibling}'s build output via \`..\`. Add \`${sibling}: workspace:*\` as a dep and import its \`paths.mts\` via the \`exports\` field. 1 path, 1 reference.`, + call.snippet, + ) + } + } + } +} + +// Backtick template-literal detection. Path construction via +// `${buildDir}/out/Final/${binary}` follows the same shape as +// path.join() and constitutes the same Rule A violation. Placeholders +// (${...}) are stripped to a sentinel that won't match any segment +// set, so segments composed entirely of interpolation contribute +// nothing to the trigger. +const TEMPLATE_LITERAL_RE = /`((?:\\.|(?:\$\{(?:[^{}]|\{[^{}]*\})*\})|(?!`)[^\\])*)`/g + +const checkRuleATemplate = (source: string): void => { + TEMPLATE_LITERAL_RE.lastIndex = 0 + let m: RegExpExecArray | null + while ((m = TEMPLATE_LITERAL_RE.exec(source)) !== null) { + const body = m[1] ?? '' + if (!body.includes('/')) { + continue + } + const stripped = body.replace(/\$\{(?:[^{}]|\{[^{}]*\})*\}/g, '\x00') + const segments = stripped + .split('/') + .filter(s => s.length > 0 && s !== '\x00') + const stages = segments.filter(s => STAGE_SEGMENTS.has(s)) + const buildRoots = segments.filter(s => BUILD_ROOT_SEGMENTS.has(s)) + const modes = segments.filter(s => MODE_SEGMENTS.has(s)) + // Template literal trigger is tighter than path.join() because + // backtick strings often appear in patch fixtures, error messages, + // and other multi-line content that incidentally contains stage + // tokens like `wasm`. 
Require the canonical build-output shape. + const hasBuildAndOut = + buildRoots.includes('build') && buildRoots.includes('out') + const hasOut = buildRoots.includes('out') + const hasBuild = buildRoots.includes('build') + const triggers = + (hasBuildAndOut && stages.length >= 1) || + (stages.length >= 2 && hasOut) || + (hasBuild && stages.length >= 1 && modes.length >= 1) + if (triggers) { + throw new BlockError( + 'A — multi-stage path constructed inline via template literal', + 'Construct this path in the owning `paths.mts` (or a build-infra helper) and import the computed value here. 1 path, 1 reference.', + m[0], + ) + } + } +} + +const check = (source: string): void => { + const calls = extractPathCalls(source) + if (calls.length > 0) { + checkRuleA(calls) + checkRuleB(calls) + } + checkRuleATemplate(source) +} + +const emitBlock = (filePath: string, err: BlockError): void => { + process.stderr.write( + `\n[path-guard] Blocked: ${err.rule}\n` + + ` Mantra: 1 path, 1 reference\n` + + ` File: ${filePath}\n` + + ` Snippet: ${err.snippet}\n` + + ` Fix: ${err.suggestion}\n\n`, + ) +} + +const main = async (): Promise => { + const raw = await stdin() + if (!raw) { + return + } + let payload: ToolInput + try { + payload = JSON.parse(raw) as ToolInput + } catch { + return + } + if (payload.tool_name !== 'Edit' && payload.tool_name !== 'Write') { + return + } + const filePath = payload.tool_input?.file_path ?? '' + if (!isInScope(filePath)) { + return + } + // Edit tool sends `new_string` (the replacement); Write sends + // `content` (the full file). Either is the text we'd be putting on + // disk. + const source = + payload.tool_input?.new_string ?? payload.tool_input?.content ?? '' + if (!source) { + return + } + + try { + check(source) + } catch (e) { + if (e instanceof BlockError) { + emitBlock(filePath, e) + process.exitCode = 2 + return + } + throw e + } +} + +main().catch(e => { + // Never block a tool call due to a bug in the hook itself. 
Log it + // so we notice, but fail open. + process.stderr.write(`[path-guard] hook error (allowing): ${e}\n`) + process.exitCode = 0 +}) diff --git a/.claude/hooks/path-guard/package.json b/.claude/hooks/path-guard/package.json new file mode 100644 index 000000000..a7cb5039a --- /dev/null +++ b/.claude/hooks/path-guard/package.json @@ -0,0 +1,12 @@ +{ + "name": "hook-path-guard", + "private": true, + "type": "module", + "main": "./index.mts", + "exports": { + ".": "./index.mts" + }, + "scripts": { + "test": "node --test test/*.test.mts" + } +} diff --git a/.claude/hooks/path-guard/segments.mts b/.claude/hooks/path-guard/segments.mts new file mode 100644 index 000000000..e2e4f4b88 --- /dev/null +++ b/.claude/hooks/path-guard/segments.mts @@ -0,0 +1,80 @@ +// Canonical path-segment vocabulary shared by the path-guard hook +// (.claude/hooks/path-guard/index.mts) and gate (scripts/check-paths.mts). +// +// Mantra: 1 path, 1 reference. This module is the *one* place stage, +// build-root, mode, and sibling-package vocabulary is defined. Both +// consumers import from here so they can never drift apart. +// +// Synced byte-identically across the Socket fleet via +// socket-repo-template/scripts/sync-scaffolding.mts (IDENTICAL_FILES). +// When adding a new stage/build-root/mode/sibling, edit this file in +// the template and re-sync. + +// "Stage" segments — Rule A core. Two of these spread via `path.join` +// or interpolated into a template literal is a finding outside a +// canonical `paths.mts`. Sourced from build-infra/lib/constants.mts +// `BUILD_STAGES` plus their lowercase directory-name siblings used by +// some builders. +export const STAGE_SEGMENTS = new Set([ + 'Compressed', + 'downloaded', + 'Final', + 'Optimized', + 'Release', + 'Stripped', + 'Synced', + 'wasm', +]) + +// "Build-root" segments — at least one must be present together with +// a stage segment to confirm we're constructing a build output path +// rather than something coincidental. 
Example: a join that yields
+// `<pkg>/<stage>/<stage>` doesn't fire if no build-root segment is
+// present; `<pkg>/build/<stage>/out/<file>` does.
+export const BUILD_ROOT_SEGMENTS = new Set(['build', 'out'])
+
+// Build-mode segments — a stage segment plus one of these is also a
+// finding (`build/<mode>/<arch>/out/<stage>` is the canonical shape).
+export const MODE_SEGMENTS = new Set(['dev', 'prod', 'shared'])
+
+// Sibling fleet packages (Rule B). Union of all packages across the
+// Socket fleet — the gate is byte-identical via sync-scaffolding, so
+// listing every fleet package keeps Rule B firing in any repo. When a
+// new package joins the workspace, add it here and propagate via
+// `node scripts/sync-scaffolding.mts --all --fix` from
+// socket-repo-template.
+export const KNOWN_SIBLING_PACKAGES = new Set([
+  // socket-btm
+  'bin-infra',
+  'binflate',
+  'binject',
+  'binpress',
+  'build-infra',
+  'codet5-models-builder',
+  'curl-builder',
+  'ink-builder',
+  'iocraft-builder',
+  'libpq-builder',
+  'lief-builder',
+  'minilm-builder',
+  'models',
+  'napi-go',
+  'node-smol-builder',
+  'onnxruntime-builder',
+  'opentui-builder',
+  'stubs-builder',
+  'ultraviolet-builder',
+  'yoga-layout-builder',
+  // socket-cli
+  'cli',
+  'package-builder',
+  // socket-tui
+  'core',
+  'react',
+  'renderer',
+  'ultraviolet',
+  'yoga',
+  // socket-registry / ultrathink
+  'acorn',
+  'npm',
+])
diff --git a/.claude/hooks/path-guard/test/path-guard.test.mts b/.claude/hooks/path-guard/test/path-guard.test.mts
new file mode 100644
index 000000000..a3c3a1324
--- /dev/null
+++ b/.claude/hooks/path-guard/test/path-guard.test.mts
@@ -0,0 +1,378 @@
+// Tests for the path-guard hook. Each `node:test` block writes a
+// mock PreToolUse payload to the hook's stdin and asserts on its exit
+// code + stderr. Exit 2 = blocked; exit 0 = allowed.
+// +// Run: pnpm --filter hook-path-guard test +// (or directly: node --test test/*.test.mts) + +import { spawnSync } from 'node:child_process' +import path from 'node:path' +import process from 'node:process' +import { fileURLToPath } from 'node:url' + +import { describe, it } from 'node:test' +import assert from 'node:assert/strict' + +const __filename = fileURLToPath(import.meta.url) +const __dirname = path.dirname(__filename) +const HOOK = path.resolve(__dirname, '..', 'index.mts') + +const runHook = ( + toolName: string, + filePath: string, + source: string, +): { code: number; stderr: string } => { + const payload = JSON.stringify({ + tool_name: toolName, + tool_input: + toolName === 'Edit' + ? { file_path: filePath, new_string: source } + : { file_path: filePath, content: source }, + }) + const result = spawnSync(process.execPath, [HOOK], { + encoding: 'utf8', + input: payload, + }) + return { + code: result.status ?? -1, + stderr: result.stderr, + } +} + +describe('path-guard — Rule A (multi-stage construction)', () => { + it('blocks two stage segments in path.join', () => { + const source = ` + const p = path.join(PACKAGE_ROOT, 'wasm', 'out', 'Final', 'bin') + ` + const { code, stderr } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 2) + assert.match(stderr, /Blocked: A/) + assert.match(stderr, /1 path, 1 reference/) + }) + + it('blocks build + mode + stage', () => { + const source = ` + const p = path.join(PKG, 'build', 'dev', 'out', 'Final', 'binary') + ` + const { code } = runHook( + 'Edit', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 2) + }) + + it('blocks Release + Stripped together', () => { + const source = ` + const p = path.join(buildDir, 'Release', 'Stripped') + ` + const { code } = runHook( + 'Write', + 'packages/foo/scripts/release.mts', + source, + ) + assert.equal(code, 2) + }) + + it('allows single stage segment with one build root', () => { + // 'build' + 'temp' → no 
stage segment at all → pass + const source = ` + const tmp = path.join(packageRoot, 'build', 'temp') + ` + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 0) + }) + + it('allows path.join with no stage segments', () => { + const source = ` + const cfg = path.join(packageRoot, 'config', 'settings.json') + ` + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 0) + }) +}) + +describe('path-guard — Rule B (cross-package traversal)', () => { + it('blocks .. + sibling package + build context', () => { + const source = ` + const lief = path.join(PKG, '..', 'lief-builder', 'build', 'Final') + ` + const { code, stderr } = runHook( + 'Write', + 'packages/binject/scripts/build.mts', + source, + ) + assert.equal(code, 2) + assert.match(stderr, /Blocked: B/) + assert.match(stderr, /lief-builder/) + }) + + it('allows .. + sibling without build context', () => { + // Reaching into a sibling for a non-build asset is allowed; the + // gate may still flag it but the hook is scoped to build paths. + const source = ` + const cfg = path.join(PKG, '..', 'lief-builder', 'config.json') + ` + const { code } = runHook( + 'Write', + 'packages/binject/scripts/build.mts', + source, + ) + assert.equal(code, 0) + }) + + it('does not fire on traversal to unknown directory', () => { + const source = ` + const x = path.join(PKG, '..', 'fixtures', 'build', 'Final') + ` + const { code } = runHook( + 'Write', + 'packages/foo/test/test.mts', + source, + ) + assert.equal(code, 0) + }) + + it('does not fire when .. and sibling are non-adjacent (regression)', () => { + // Earlier regex ran with sticky sawDotDot — once it saw `..` it + // would flag any later sibling-named segment. The fix requires + // the sibling to appear *immediately* after `..`. 
+ const source = ` + const x = path.join(PKG, '..', 'cache', 'lief-builder', 'config.json') + ` + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 0) + }) +}) + +describe('path-guard — paren-balance correctness', () => { + it('detects A through nested function-call args (regression)', () => { + // Old regex used \\([^()]*\\) which only handled one nesting + // level — `path.join(getDir(child(x)), 'build', 'dev', 'Final')` + // silently slipped through. The paren-balancing scanner catches it. + const source = ` + const p = path.join(getDir(child(x)), 'build', 'dev', 'out', 'Final') + ` + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 2) + }) + + it('detects A in path.resolve() too', () => { + const source = ` + const p = path.resolve(PKG, 'build', 'dev', 'out', 'Final', 'bin') + ` + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 2) + }) +}) + +describe('path-guard — template literals', () => { + it('detects A in fully-literal template path', () => { + const source = '\n const p = `build/dev/out/Final/binary`\n ' + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 2) + }) + + it('detects A in template with placeholders', () => { + const source = + '\n const p = `${PKG}/build/${mode}/${arch}/out/Final/${name}`\n ' + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 2) + }) + + it('allows template with single non-stage segment', () => { + const source = '\n const url = `https://example.com/path`\n ' + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 0) + }) + + it('allows template with no stage segments', () => { + const source = '\n const tmp = `${packageRoot}/build/temp/cache`\n ' + const { code } = runHook( + 
'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 0) + }) + + it('allows template that is purely interpolation', () => { + // `${a}/${b}/${c}` has no literal stage segments. + const source = '\n const p = `${a}/${b}/${c}`\n ' + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 0) + }) +}) + +describe('path-guard — file-type filter', () => { + it('skips .ts files', () => { + const source = ` + const p = path.join(PKG, 'build', 'dev', 'out', 'Final', 'bin') + ` + const { code } = runHook('Write', 'packages/foo/src/index.ts', source) + assert.equal(code, 0) + }) + + it('skips .mjs files', () => { + const source = ` + const p = path.join(PKG, 'build', 'dev', 'out', 'Final', 'bin') + ` + const { code } = runHook('Write', 'additions/foo.mjs', source) + assert.equal(code, 0) + }) + + it('skips .yml files', () => { + const source = ` + run: | + FINAL="build/\${MODE}/\${ARCH}/out/Final" + ` + const { code } = runHook( + 'Write', + '.github/workflows/foo.yml', + source, + ) + assert.equal(code, 0) + }) + + it('inspects .mts files', () => { + const source = ` + const p = path.join(PKG, 'build', 'dev', 'out', 'Final', 'bin') + ` + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.mts', + source, + ) + assert.equal(code, 2) + }) + + it('inspects .cts files', () => { + const source = ` + const p = path.join(PKG, 'build', 'dev', 'out', 'Final', 'bin') + ` + const { code } = runHook( + 'Write', + 'packages/foo/scripts/build.cts', + source, + ) + assert.equal(code, 2) + }) +}) + +describe('path-guard — exempt files', () => { + it('allows edits to paths.mts', () => { + const source = ` + export const FINAL_DIR = path.join(PKG, 'build', 'dev', 'out', 'Final') + ` + const { code } = runHook( + 'Write', + 'packages/foo/scripts/paths.mts', + source, + ) + assert.equal(code, 0) + }) + + it('allows edits to check-paths.mts (the gate)', () => { + const source = ` + const PATTERNS = 
[path.join('build', 'Final', 'wasm')] + ` + const { code } = runHook('Write', 'scripts/check-paths.mts', source) + assert.equal(code, 0) + }) + + it('allows edits to the path-guard hook itself', () => { + const source = ` + const STAGES = ['Final', 'Release', 'Stripped'] + ` + const { code } = runHook( + 'Write', + '.claude/hooks/path-guard/index.mts', + source, + ) + assert.equal(code, 0) + }) + + it('allows edits to path-guard tests', () => { + const source = ` + const fixture = path.join('build', 'dev', 'out', 'Final') + ` + const { code } = runHook( + 'Write', + '.claude/hooks/path-guard/test/path-guard.test.mts', + source, + ) + assert.equal(code, 0) + }) +}) + +describe('path-guard — tool-name filter', () => { + it('skips Bash', () => { + const source = `path.join(PKG, 'build', 'dev', 'out', 'Final', 'bin')` + const { code } = runHook('Bash', '', source) + assert.equal(code, 0) + }) + + it('skips Read', () => { + const source = '' + const { code } = runHook('Read', 'packages/foo/scripts/build.mts', source) + assert.equal(code, 0) + }) +}) + +describe('path-guard — bug-tolerance (fails open)', () => { + it('passes through invalid JSON payload', () => { + const result = spawnSync(process.execPath, [HOOK], { + encoding: 'utf8', + input: 'not json at all', + }) + assert.equal(result.status, 0) + }) + + it('passes through empty stdin', () => { + const result = spawnSync(process.execPath, [HOOK], { + encoding: 'utf8', + input: '', + }) + assert.equal(result.status, 0) + }) +}) diff --git a/.claude/hooks/path-guard/tsconfig.json b/.claude/hooks/path-guard/tsconfig.json new file mode 100644 index 000000000..53c5c8475 --- /dev/null +++ b/.claude/hooks/path-guard/tsconfig.json @@ -0,0 +1,15 @@ +{ + "compilerOptions": { + "declarationMap": false, + "erasableSyntaxOnly": true, + "module": "nodenext", + "moduleResolution": "nodenext", + "noEmit": true, + "rewriteRelativeImportExtensions": true, + "skipLibCheck": true, + "sourceMap": false, + "strict": true, + "target": 
"esnext", + "verbatimModuleSyntax": true + } +} diff --git a/.claude/hooks/private-name-guard/README.md b/.claude/hooks/private-name-guard/README.md new file mode 100644 index 000000000..a15d4412e --- /dev/null +++ b/.claude/hooks/private-name-guard/README.md @@ -0,0 +1,59 @@ +# private-name-guard + +`PreToolUse` hook that **never blocks**. On every `Bash` command that +would publish text to a public Git/GitHub surface, writes a short +reminder to stderr so the model re-reads the command with the rule +freshly in mind: + +> No private repos or internal project names in public surfaces. Omit +> the reference entirely — don't substitute a placeholder. The +> placeholder itself is a tell. + +Attention priming, not enforcement. The model is responsible for +applying the rule — the hook just ensures the rule is in the active +context at the moment the command is about to fire. + +Sibling to `public-surface-reminder`, which covers customer/company +names and internal work-item IDs. The two hooks compose: both fire on +the same public-surface commands, each priming a distinct slice of the +rule set. + +## What counts as "public surface" + +- `git commit` (including `--amend`) +- `git push` +- `gh pr (create|edit|comment|review)` +- `gh issue (create|edit|comment)` +- `gh api -X POST|PATCH|PUT` +- `gh release (create|edit)` + +Any other `Bash` command passes through silently. + +## Why no denylist + +Because a denylist is itself a leak. A file named `private-projects.txt` +that enumerates "these are our internal repos" is worse than no list at +all — anyone who finds it gets the org's full internal map for free. +Recognition happens at write time, every time, by the model reading the +text it's about to send. The hook just makes sure that read happens. 
+ +## Wiring + +`.claude/settings.json`: + +```json +{ + "hooks": { + "PreToolUse": [ + { + "matcher": "Bash", + "hooks": [{ "type": "command", "command": "node .claude/hooks/private-name-guard/index.mts" }] + } + ] + } +} +``` + +## Exit code + +Always `0`. The hook never blocks; it only prints to stderr. diff --git a/.claude/hooks/private-name-guard/index.mts b/.claude/hooks/private-name-guard/index.mts new file mode 100644 index 000000000..fa8ea3239 --- /dev/null +++ b/.claude/hooks/private-name-guard/index.mts @@ -0,0 +1,89 @@ +#!/usr/bin/env node +// Claude Code PreToolUse hook — private-name guard. +// +// Never blocks. On every Bash command that would publish text to a public +// Git/GitHub surface (git commit, git push, gh pr/issue/api/release write), +// writes a short reminder to stderr so the model re-reads the command with +// the rule freshly in mind: +// +// No private repos or internal project names in public surfaces. +// Omit the reference entirely — don't substitute a placeholder. +// +// Exit code is always 0. This is attention priming, not enforcement. The +// model is responsible for applying the rule — the hook just makes sure +// the rule is in the active context at the moment the command is about +// to fire. +// +// Deliberately carries no enumerated denylist. Recognition and replacement +// happen at write time, not via a list of names. A denylist is itself a +// leak — a file named `private-projects.txt` would be the very thing it +// tries to prevent. +// +// Reads a Claude Code PreToolUse JSON payload from stdin: +// { "tool_name": "Bash", "tool_input": { "command": "..." } } + +import { readFileSync } from 'node:fs' + +type ToolInput = { + tool_name?: string + tool_input?: { + command?: string + } +} + +// Commands that can publish content outside the local machine. +// Keep broad — better to remind on an extra read than miss a write. 
+const PUBLIC_SURFACE_PATTERNS: RegExp[] = [ + /\bgit\s+commit\b/, + /\bgit\s+push\b/, + /\bgh\s+pr\s+(create|edit|comment|review)\b/, + /\bgh\s+issue\s+(create|edit|comment)\b/, + /\bgh\s+api\b[^|]*-X\s*(POST|PATCH|PUT)\b/i, + /\bgh\s+release\s+(create|edit)\b/, +] + +function isPublicSurface(command: string): boolean { + const normalized = command.replace(/\s+/g, ' ') + return PUBLIC_SURFACE_PATTERNS.some(re => re.test(normalized)) +} + +function main(): void { + let raw = '' + try { + raw = readFileSync(0, 'utf8') + } catch { + return + } + + let input: ToolInput + try { + input = JSON.parse(raw) + } catch { + return + } + + if (input.tool_name !== 'Bash') { + return + } + const command = input.tool_input?.command + if (!command || typeof command !== 'string') { + return + } + if (!isPublicSurface(command)) { + return + } + + const lines = [ + '[private-name-guard] This command writes to a public Git/GitHub surface.', + ' • Re-read the commit message / PR body / comment BEFORE it sends.', + ' • No private repo names. No internal project codenames. No unreleased', + ' product names. No internal-only tooling repos absent from the public', + ' org page. No customer/partner names.', + ' • Omit the reference entirely. 
Do not substitute a placeholder — the', + ' placeholder itself is a tell.', + ' • If you spot one, cancel and rewrite the text first.', + ] + process.stderr.write(lines.join('\n') + '\n') +} + +main() diff --git a/.claude/hooks/private-name-guard/package.json b/.claude/hooks/private-name-guard/package.json new file mode 100644 index 000000000..64c972ff1 --- /dev/null +++ b/.claude/hooks/private-name-guard/package.json @@ -0,0 +1,12 @@ +{ + "name": "hook-private-name-guard", + "private": true, + "type": "module", + "main": "./index.mts", + "exports": { + ".": "./index.mts" + }, + "devDependencies": { + "@types/node": "24.9.2" + } +} diff --git a/.claude/hooks/public-surface-reminder/package.json b/.claude/hooks/public-surface-reminder/package.json index cee721e2c..09cf63ac9 100644 --- a/.claude/hooks/public-surface-reminder/package.json +++ b/.claude/hooks/public-surface-reminder/package.json @@ -1,5 +1,5 @@ { - "name": "@socketsecurity/hook-public-surface-reminder", + "name": "hook-public-surface-reminder", "private": true, "type": "module", "main": "./index.mts", diff --git a/.claude/hooks/release-workflow-guard/README.md b/.claude/hooks/release-workflow-guard/README.md new file mode 100644 index 000000000..6f274ae55 --- /dev/null +++ b/.claude/hooks/release-workflow-guard/README.md @@ -0,0 +1,68 @@ +# release-workflow-guard + +`PreToolUse` hook that **blocks** every Bash command that would +dispatch a GitHub Actions workflow. Exit code `2`; the model never +gets to fire the command. + +> Workflow dispatches are irrevocable. Publish workflows push npm +> versions (unpublishable after 24h). Build/Release workflows pin +> GitHub releases by SHA. Container workflows push immutable image +> tags. Even build workflows with a `dry_run` input still treat the +> dispatch itself as the prod trigger — the user runs them +> manually, never Claude. 
+
+## What gets blocked
+
+- `gh workflow run <workflow-id>`
+- `gh workflow dispatch <workflow-id>` (alias of `run`)
+- `gh api .../actions/workflows/<id>/dispatches` POST/PUT
+
+Any other `Bash` command passes through silently.
+
+## Why no per-workflow allowlist
+
+Because allowlists drift. A "benign" CI dispatch today becomes a
+prod-touching dispatch tomorrow when someone wires a publish step
+behind it; the allowlist hasn't updated. The cost of an extra
+block is one re-prompt (the user runs the command in their own
+terminal). The cost of a missed prod dispatch is irreversible.
+Block all dispatches; let the user judge.
+
+## Override
+
+There is no opt-out. If a real workflow id needs dispatching during
+a Claude session, the user runs it themselves — either in a plain
+shell, via the GitHub Actions UI, or by typing `! gh workflow run
+...` outside of a Claude prompt where the hook doesn't fire.
+
+## Wiring
+
+`.claude/settings.json`:
+
+```json
+{
+  "hooks": {
+    "PreToolUse": [
+      {
+        "matcher": "Bash",
+        "hooks": [{ "type": "command", "command": "node .claude/hooks/release-workflow-guard/index.mts" }]
+      }
+    ]
+  }
+}
+```
+
+## Exit code
+
+- `0` — command is not a workflow dispatch; pass through
+- `2` — command is a workflow dispatch; block + write reason to stderr
+
+## Sibling hooks
+
+- `private-name-guard` — primes the model on private repo / project names
+- `public-surface-reminder` — primes on customer / company names
+- `token-guard` — blocks token-leaking shell shapes
+
+`release-workflow-guard` is the third hook that **blocks** rather
+than primes (alongside `token-guard` and `path-guard`). The shared
+rule: block when the harm of a wrong fire is irreversible.
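For a feel of what trips the guard, a simplified standalone sketch of the two detection shapes (the real hook also skips interleaved `--repo`/`--ref`/`-f` flags before capturing the workflow id; `CLI_RE`, `API_RE`, and `dispatchTarget` are illustrative names for this sketch):

```typescript
// Simplified dispatch detection: either the CLI shape or the REST
// shape counts; the captured group is the workflow id that the hook
// reports back in its block message.
const CLI_RE = /\bgh\s+workflow\s+(?:run|dispatch)\s+(\S+)/
const API_RE = /\bgh\s+api\b[^|]*?\/actions\/workflows\/([^/\s]+)\/dispatches\b/

const dispatchTarget = (command: string): string | undefined => {
  const normalized = command.replace(/\s+/g, ' ')
  const m = CLI_RE.exec(normalized) ?? API_RE.exec(normalized)
  return m?.[1]
}
```

`gh workflow run publish.yml` and `gh api repos/o/r/actions/workflows/publish.yml/dispatches -X POST` both yield a target and get blocked; `gh workflow list` yields `undefined` and passes through.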
diff --git a/.claude/hooks/release-workflow-guard/index.mts b/.claude/hooks/release-workflow-guard/index.mts
new file mode 100644
index 000000000..b824c3f5f
--- /dev/null
+++ b/.claude/hooks/release-workflow-guard/index.mts
@@ -0,0 +1,137 @@
+#!/usr/bin/env node
+// Claude Code PreToolUse hook — release-workflow-guard.
+//
+// BLOCKS every Bash command that would dispatch a GitHub Actions
+// workflow. The user runs workflow_dispatch jobs manually after
+// reviewing the release commit and waiting for CI to pass —
+// auto-triggering is irrevocable in the short term:
+//
+// - Publish workflows push npm versions (unpublishable after 24h).
+// - Build/Release workflows publish GitHub releases pinned by SHA.
+// - Container workflows push immutable image tags.
+//
+// Even nominally-CI workflow_dispatches often carry prod side
+// effects (the socket-btm binary builders gate prod releases on a
+// `dry_run` input, but the dispatch itself is the trigger). The
+// safe default is "block all dispatches and ask the user to run
+// them themselves." Cost of an extra block: one re-prompt. Cost
+// of a missed prod publish: irreversible.
+//
+// Exit code 2 with a clear stderr message stops the tool call. The
+// model never gets to fire the command. The user re-runs it from
+// their own terminal (or via the GitHub Actions UI) when ready.
+//
+// Blocked patterns:
+// - `gh workflow run <workflow-id>`
+// - `gh workflow dispatch <workflow-id>` (alias of `run`)
+// - `gh api ... actions/workflows/<id>/dispatches` POST/PUT
+//
+// This hook is the enforcement layer paired with the CLAUDE.md
+// rule. The rule documents the policy; the hook makes it
+// mechanical so the model can't accidentally dispatch a workflow
+// even when reasoning about urgent release work.
+//
+// Reads a Claude Code PreToolUse JSON payload from stdin:
+//   { "tool_name": "Bash", "tool_input": { "command": "..."
} }
+
+import { readFileSync } from 'node:fs'
+import process from 'node:process'
+
+type ToolInput = {
+  tool_name?: string
+  tool_input?: {
+    command?: string
+  }
+}
+
+// `gh workflow run <workflow>` / `gh workflow dispatch <workflow>`.
+// The captured workflow argument is reported back so the user can
+// see what was blocked.
+const GH_WORKFLOW_DISPATCH_RE =
+  /\bgh\s+workflow\s+(?:run|dispatch)\b(?:\s+(?:--repo|--ref|-f|--field)\s+\S+)*\s+(['"]?)([^\s'"]+)\1/
+
+// `gh api .../actions/workflows/<id>/dispatches` (POST/PUT).
+// The path component implies dispatch — no need to also match -X.
+const GH_API_WORKFLOW_DISPATCH_RE =
+  /\bgh\s+api\b[^|]*?\/actions\/workflows\/([^/\s]+)\/dispatches\b/
+
+function detectDispatch(command: string): {
+  blocked: boolean
+  workflow?: string
+  shape?: string
+} {
+  const normalized = command.replace(/\s+/g, ' ')
+
+  const cliMatch = GH_WORKFLOW_DISPATCH_RE.exec(normalized)
+  if (cliMatch) {
+    return {
+      blocked: true,
+      workflow: cliMatch[2],
+      shape: 'gh workflow run/dispatch',
+    }
+  }
+
+  const apiMatch = GH_API_WORKFLOW_DISPATCH_RE.exec(normalized)
+  if (apiMatch) {
+    return {
+      blocked: true,
+      workflow: apiMatch[1],
+      shape: 'gh api .../dispatches',
+    }
+  }
+
+  return { blocked: false }
+}
+
+function main(): void {
+  let raw = ''
+  try {
+    raw = readFileSync(0, 'utf8')
+  } catch {
+    return
+  }
+
+  let input: ToolInput
+  try {
+    input = JSON.parse(raw)
+  } catch {
+    return
+  }
+
+  if (input.tool_name !== 'Bash') {
+    return
+  }
+  const command = input.tool_input?.command
+  if (!command || typeof command !== 'string') {
+    return
+  }
+
+  const { blocked, workflow, shape } = detectDispatch(command)
+  if (!blocked) {
+    return
+  }
+
+  const lines = [
+    '[release-workflow-guard] BLOCKED: this command would dispatch a',
+    `  GitHub Actions workflow (${shape}, target: ${workflow ??
''}).`, + '', + ' Workflow dispatches often have irreversible prod side effects:', + ' - Publish workflows push npm versions (unpublishable after 24h).', + ' - Build/Release workflows create GitHub releases pinned by SHA.', + ' - Container workflows push immutable image tags.', + " - Even build workflows with a 'dry_run' input still treat the", + ' dispatch itself as the prod trigger.', + '', + ' The user runs workflow_dispatch jobs manually — never Claude.', + ' Tell the user to run the command in their own terminal (or', + ' via the GitHub Actions UI), then resume.', + '', + ' This hook has no opt-out. If you genuinely need to run a', + ' benign dispatch (e.g. a debug-only utility workflow), ask', + " the user to invoke it themselves; don't seek a bypass here.", + ] + process.stderr.write(lines.join('\n') + '\n') + process.exitCode = 2 +} + +main() diff --git a/.claude/hooks/release-workflow-guard/package.json b/.claude/hooks/release-workflow-guard/package.json new file mode 100644 index 000000000..7e19aa139 --- /dev/null +++ b/.claude/hooks/release-workflow-guard/package.json @@ -0,0 +1,12 @@ +{ + "name": "hook-release-workflow-guard", + "private": true, + "type": "module", + "main": "./index.mts", + "exports": { + ".": "./index.mts" + }, + "devDependencies": { + "@types/node": "24.9.2" + } +} diff --git a/.claude/hooks/setup-security-tools/README.md b/.claude/hooks/setup-security-tools/README.md index 02fde4a7c..ed20c5e75 100644 --- a/.claude/hooks/setup-security-tools/README.md +++ b/.claude/hooks/setup-security-tools/README.md @@ -5,21 +5,24 @@ Sets up all three Socket security tools for local development in one command. ## Tools ### 1. AgentShield + Scans your Claude Code configuration (`.claude/` directory) for security issues like prompt injection, leaked secrets, and overly permissive tool permissions. -**How it's installed**: Already a devDependency (`ecc-agentshield`). The setup script just verifies it's available — if not, run `pnpm install`. 
+**How it's installed**: npm package downloaded via the dlx system (pinned version + integrity hash from `external-tools.json`), cached at `~/.socket/_dlx/`. Subsequent runs reuse the cache. No `devDependencies` entry required in the consumer repo. ### 2. Zizmor + Static analysis tool for GitHub Actions workflows. Catches unpinned actions, secret exposure, template injection, and permission issues. **How it's installed**: Binary downloaded from [GitHub releases](https://github.com/zizmorcore/zizmor/releases), SHA-256 verified, cached via the dlx system at `~/.socket/_dlx/`. If you already have it via `brew install zizmor`, the download is skipped. ### 3. SFW (Socket Firewall) + Intercepts package manager commands (`npm install`, `pnpm add`, etc.) and scans packages against Socket.dev's malware database before installation. **How it's installed**: Binary downloaded from GitHub, SHA-256 verified, cached via the dlx system at `~/.socket/_dlx/`. Small wrapper scripts ("shims") are created at `~/.socket/sfw/shims/` that transparently route commands through the firewall. -**Free vs Enterprise**: If you have a `SOCKET_API_KEY` (in env, `.env`, or `.env.local`), enterprise mode is used with additional ecosystem support (gem, bundler, nuget, go). Otherwise, free mode covers npm, yarn, pnpm, pip, uv, and cargo. +**Free vs Enterprise**: If you have a `SOCKET_API_KEY` (in env, `.env`, or `.env.local`), enterprise mode is used with additional ecosystem support (gem, bundler, nuget, go). Otherwise, free mode covers npm, yarn, pnpm, pip, pip3, uv, and cargo. ## How to use @@ -31,16 +34,17 @@ Claude will ask if you have an API key, then run the setup script. ## What gets installed where -| Tool | Location | Persists across repos? 
| -|------|----------|----------------------| -| AgentShield | `node_modules/.bin/agentshield` | No (per-repo devDep) | -| Zizmor | `~/.socket/_dlx//zizmor` | Yes | -| SFW binary | `~/.socket/_dlx//sfw` | Yes | -| SFW shims | `~/.socket/sfw/shims/npm`, etc. | Yes | +| Tool | Location | Persists across repos? | +| ----------- | ----------------------------- | ---------------------- | +| AgentShield | `~/.socket/_dlx//agentshield` | Yes | +| Zizmor | `~/.socket/_dlx//zizmor` | Yes | +| SFW binary | `~/.socket/_dlx//sfw` | Yes | +| SFW shims | `~/.socket/sfw/shims/npm`, etc. | Yes | ## Pre-push integration The `.git-hooks/pre-push` hook automatically runs: + - **AgentShield scan** (blocks push on failure) - **Zizmor scan** (blocks push on failure) @@ -49,7 +53,8 @@ This means every push is checked — you don't have to remember to run `/securit ## Re-running Safe to run multiple times: -- AgentShield: just re-checks availability + +- AgentShield: skips download if cached binary matches the pinned version - Zizmor: skips download if cached binary matches expected version - SFW: skips download if cached, only rewrites shims if content changed @@ -58,16 +63,16 @@ Safe to run multiple times: Self-contained. To add to another Socket repo: 1. Copy `.claude/hooks/setup-security-tools/` and `.claude/commands/setup-security-tools.md` -2. Run `cd .claude/hooks/setup-security-tools && npm install` +2. Ensure the consumer repo has `@socketsecurity/lib`, `@socketregistry/packageurl-js`, and `@sinclair/typebox` available (via workspace catalog or direct deps) 3. Ensure `.claude/hooks/` is not gitignored (add `!/.claude/hooks/` to `.gitignore`) -4. Ensure `ecc-agentshield` is a devDep in the target repo +4. Run `pnpm install` in the consumer repo so the hook's workspace deps resolve ## Troubleshooting -**"AgentShield not found"** — Run `pnpm install`. It's the `ecc-agentshield` devDependency. +**"AgentShield install failed"** — Check network access to npm registry. 
The dlx system caches at `~/.socket/_dlx/`; clear the cache (`rm -rf ~/.socket/_dlx/`) to force a fresh download. **"zizmor found but wrong version"** — The script downloads the expected version via the dlx cache. Your system version (e.g. from brew) will be ignored in favor of the correct version. **"No supported package managers found"** — SFW only creates shims for package managers found on your PATH. Install npm/pnpm/etc. first. -**SFW shims not intercepting** — Make sure `~/.socket/sfw/shims` is at the *front* of PATH. Run `which npm` — it should point to the shim, not the real binary. +**SFW shims not intercepting** — Make sure `~/.socket/sfw/shims` is at the _front_ of PATH. Run `which npm` — it should point to the shim, not the real binary. diff --git a/.claude/hooks/setup-security-tools/external-tools.json b/.claude/hooks/setup-security-tools/external-tools.json index f8f6d2a2f..3e2381dbe 100644 --- a/.claude/hooks/setup-security-tools/external-tools.json +++ b/.claude/hooks/setup-security-tools/external-tools.json @@ -1,6 +1,11 @@ { "description": "Security tools for Claude Code hooks (self-contained, no external deps)", "tools": { + "agentshield": { + "description": "Claude AI config security scanner (prompt injection, secrets)", + "purl": "pkg:npm/ecc-agentshield@1.4.0", + "integrity": "sha512-R98OO1Ujyk2lezDLb+iQmMhF6FwTJCHajy3G4FCB6x7wkSTqR9f8+eAelC5KDzYDsGSbc0sOZvjXOOPRBtMpDg==" + }, "zizmor": { "description": "GitHub Actions security scanner", "version": "1.23.1", @@ -56,7 +61,7 @@ "sha256": "c953e62ad7928d4d8f2302f5737884ea1a757babc26bed6a42b9b6b68a5d54af" } }, - "ecosystems": ["npm", "yarn", "pnpm", "pip", "uv", "cargo"] + "ecosystems": ["npm", "yarn", "pnpm", "pip", "pip3", "uv", "cargo"] }, "sfw-enterprise": { "description": "Socket Firewall (enterprise tier)", @@ -85,7 +90,7 @@ "sha256": "9a50e1ddaf038138c3f85418dc5df0113bbe6fc884f5abe158beaa9aea18d70a" } }, - "ecosystems": ["npm", "yarn", "pnpm", "pip", "uv", "cargo", "gem", "bundler", 
"nuget"] + "ecosystems": ["npm", "yarn", "pnpm", "pip", "pip3", "uv", "cargo", "gem", "bundler", "nuget"] } } } diff --git a/.claude/hooks/setup-security-tools/index.mts b/.claude/hooks/setup-security-tools/index.mts index ad5e31e6e..2ae06e645 100644 --- a/.claude/hooks/setup-security-tools/index.mts +++ b/.claude/hooks/setup-security-tools/index.mts @@ -3,51 +3,64 @@ // // Configures three tools: // 1. AgentShield — scans Claude AI config for prompt injection / secrets. -// Already a devDep (ecc-agentshield); this script verifies it's installed. +// Downloaded as npm package via dlx (pinned version, cached). // 2. Zizmor — static analysis for GitHub Actions workflows. Downloads the // correct binary, verifies SHA-256, cached via the dlx system. // 3. SFW (Socket Firewall) — intercepts package manager commands to scan // for malware. Downloads binary, verifies SHA-256, creates PATH shims. // Enterprise vs free determined by SOCKET_API_KEY in env / .env / .env.local. -import { existsSync, readFileSync, promises as fs } from 'node:fs' +import { existsSync, promises as fs, readFileSync } from 'node:fs' import { tmpdir } from 'node:os' import path from 'node:path' import process from 'node:process' import { fileURLToPath } from 'node:url' +import { PackageURL } from '@socketregistry/packageurl-js' +import { Type } from '@sinclair/typebox' + import { whichSync } from '@socketsecurity/lib/bin' import { downloadBinary } from '@socketsecurity/lib/dlx/binary' +import { downloadPackage } from '@socketsecurity/lib/dlx/package' +import { errorMessage } from '@socketsecurity/lib/errors' import { safeDelete } from '@socketsecurity/lib/fs' import { getDefaultLogger } from '@socketsecurity/lib/logger' +import { normalizePath } from '@socketsecurity/lib/paths/normalize' import { getSocketHomePath } from '@socketsecurity/lib/paths/socket' -import { spawn, spawnSync } from '@socketsecurity/lib/spawn' -import { z } from 'zod' +import { spawn } from '@socketsecurity/lib/spawn' +import 
{ parseSchema } from '@socketsecurity/lib/schema/parse' const logger = getDefaultLogger() // ── Tool config loaded from external-tools.json (self-contained) ── -const toolSchema = z.object({ - description: z.string().optional(), - version: z.string(), - repository: z.string().optional(), - assets: z.record(z.string(), z.string()).optional(), - platforms: z.record(z.string(), z.string()).optional(), - checksums: z.record(z.string(), z.string()).optional(), - ecosystems: z.array(z.string()).optional(), +const checksumEntrySchema = Type.Object({ + asset: Type.String(), + sha256: Type.String(), +}) + +const toolSchema = Type.Object({ + description: Type.Optional(Type.String()), + version: Type.Optional(Type.String()), + purl: Type.Optional(Type.String()), + integrity: Type.Optional(Type.String()), + repository: Type.Optional(Type.String()), + release: Type.Optional(Type.String()), + checksums: Type.Optional(Type.Record(Type.String(), checksumEntrySchema)), + ecosystems: Type.Optional(Type.Array(Type.String())), }) -const configSchema = z.object({ - description: z.string().optional(), - tools: z.record(z.string(), toolSchema), +const configSchema = Type.Object({ + description: Type.Optional(Type.String()), + tools: Type.Record(Type.String(), toolSchema), }) const __dirname = path.dirname(fileURLToPath(import.meta.url)) const configPath = path.join(__dirname, 'external-tools.json') const rawConfig = JSON.parse(readFileSync(configPath, 'utf8')) -const config = configSchema.parse(rawConfig) +const config = parseSchema(configSchema, rawConfig) +const AGENTSHIELD = config.tools['agentshield']! const ZIZMOR = config.tools['zizmor']! const SFW_FREE = config.tools['sfw-free']! const SFW_ENTERPRISE = config.tools['sfw-enterprise']! 
@@ -79,19 +92,37 @@ function findApiKey(): string | undefined { // ── AgentShield ── -function setupAgentShield(): boolean { +async function setupAgentShield(): Promise<boolean> { logger.log('=== AgentShield ===') - const bin = whichSync('agentshield', { nothrow: true }) - if (bin && typeof bin === 'string') { - const result = spawnSync(bin, ['--version'], { stdio: 'pipe' }) + const purl = PackageURL.fromString(AGENTSHIELD.purl!) + if (purl.type !== 'npm') { + throw new Error(`Unsupported PURL type "${purl.type}" — only npm is supported`) + } + const npmPackage = purl.namespace ? `${purl.namespace}/${purl.name}` : purl.name! + const version = AGENTSHIELD.version ?? purl.version + const packageSpec = version ? `${npmPackage}@${version}` : npmPackage + + logger.log(`Installing ${packageSpec} via dlx...`) + const { binaryPath, installed } = await downloadPackage({ + package: packageSpec, + binaryName: 'agentshield', + }) + + // Verify version matches pinned config. + if (version) { + const result = await spawn(binaryPath, ['--version'], { stdio: 'pipe' }) const ver = typeof result.stdout === 'string' ? result.stdout.trim() : result.stdout.toString().trim() - logger.log(`Found: ${bin} (${ver})`) - return true + if (!ver.includes(version)) { + logger.warn(`Version mismatch: expected ${version}, got ${ver}`) + return false + } + logger.log(installed ? `Installed: ${binaryPath} (${ver})` : `Cached: ${binaryPath} (${ver})`) + } else { + logger.log(installed ? `Installed: ${binaryPath}` : `Cached: ${binaryPath}`) } - logger.warn('Not found. Run "pnpm install" to install ecc-agentshield.') - return false + return true } // ── Zizmor ── @@ -148,8 +179,8 @@ async function setupZizmor(): Promise<boolean> { } const isZip = asset.endsWith('.zip') - const extractDir = path.join(tmpdir(), `zizmor-extract-${Date.now()}`) - await fs.mkdir(extractDir, { recursive: true }) + // mkdtemp is collision-safe, unlike Date.now()-only naming.
+ const extractDir = await fs.mkdtemp(path.join(tmpdir(), 'zizmor-extract-')) try { if (isZip) { await spawn('powershell', ['-NoProfile', '-Command', @@ -195,6 +226,7 @@ async function setupSfw(apiKey: string | undefined): Promise { // Create shims. const isWindows = process.platform === 'win32' + const shimDir = path.join(getSocketHomePath(), 'sfw', 'shims') await fs.mkdir(shimDir, { recursive: true }) const ecosystems = [...(sfwConfig.ecosystems ?? [])] @@ -203,12 +235,14 @@ async function setupSfw(apiKey: string | undefined): Promise { } const cleanPath = (process.env['PATH'] ?? '').split(path.delimiter) .filter(p => p !== shimDir).join(path.delimiter) + const sfwBin = normalizePath(binaryPath) const created: string[] = [] for (const cmd of ecosystems) { - const realBin = whichSync(cmd, { nothrow: true, path: cleanPath }) + let realBin = whichSync(cmd, { nothrow: true, path: cleanPath }) if (!realBin || typeof realBin !== 'string') continue + realBin = normalizePath(realBin) - // Bash shim (macOS/Linux). + // Bash shim (macOS/Linux/Windows Git Bash). 
const bashLines = [ '#!/bin/bash', `export PATH="$(echo "$PATH" | tr ':' '\\n' | grep -vxF '${shimDir}' | paste -sd: -)"`, @@ -227,7 +261,7 @@ async function setupSfw(apiKey: string | undefined): Promise { 'fi', ) } - bashLines.push(`exec "${binaryPath}" "${realBin}" "$@"`) + bashLines.push(`exec "${sfwBin}" "${realBin}" "$@"`) const bashContent = bashLines.join('\n') + '\n' const bashPath = path.join(shimDir, cmd) if (!existsSync(bashPath) || await fs.readFile(bashPath, 'utf8').catch(() => '') !== bashContent) { @@ -257,7 +291,7 @@ async function setupSfw(apiKey: string | undefined): Promise { + `set "PATH=%PATH:;${shimDir};=%"\r\n` + `set "PATH=%PATH:~1,-1%"\r\n` + cmdApiKeyBlock - + `"${binaryPath}" "${realBin}" %*\r\n` + + `"${sfwBin}" "${realBin}" %*\r\n` const cmdPath = path.join(shimDir, `${cmd}.cmd`) if (!existsSync(cmdPath) || await fs.readFile(cmdPath, 'utf8').catch(() => '') !== cmdContent) { await fs.writeFile(cmdPath, cmdContent) @@ -282,7 +316,7 @@ async function main(): Promise { const apiKey = findApiKey() - const agentshieldOk = setupAgentShield() + const agentshieldOk = await setupAgentShield() logger.log('') const zizmorOk = await setupZizmor() logger.log('') @@ -302,6 +336,6 @@ async function main(): Promise { } main().catch((e: unknown) => { - logger.error(e instanceof Error ? 
e.message : String(e)) + logger.error(errorMessage(e)) process.exitCode = 1 }) diff --git a/.claude/hooks/setup-security-tools/package.json b/.claude/hooks/setup-security-tools/package.json index f8ecc76cb..3bc122f66 100644 --- a/.claude/hooks/setup-security-tools/package.json +++ b/.claude/hooks/setup-security-tools/package.json @@ -1,9 +1,11 @@ { - "name": "@socketsecurity/hook-setup-security-tools", + "name": "hook-setup-security-tools", "private": true, "type": "module", "main": "./index.mts", "dependencies": { + "@sinclair/typebox": "catalog:", + "@socketregistry/packageurl-js": "catalog:", "@socketsecurity/lib": "catalog:" } } diff --git a/.claude/hooks/stale-process-sweeper/README.md b/.claude/hooks/stale-process-sweeper/README.md new file mode 100644 index 000000000..38d966748 --- /dev/null +++ b/.claude/hooks/stale-process-sweeper/README.md @@ -0,0 +1,74 @@ +# stale-process-sweeper + +Claude Code `Stop` hook that sweeps stale Node test/build worker +processes at turn-end, before they pile up across turns and exhaust +system memory. + +## Why + +Vitest's `forks` pool spawns one Node worker per CPU. When the parent +runner exits abnormally — `Bash` timeout, `SIGINT` from the user, +pre-commit hook crash — the workers stay alive holding 80–100 MB +each. After a few interrupted runs the host has gigabytes of +abandoned processes. + +The sweeper finds those processes (matched by command-line pattern) +that have lost their parent, and sends them `SIGTERM`. A still-living +parent means the worker is part of a real, in-progress run, and the +sweeper leaves it alone. 
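The orphan test the sweeper relies on can be sketched in a few lines (mirroring the hook's own logic further down in this diff):

```typescript
import process from 'node:process'

// Minimal sketch of the sweeper's orphan check. Signal 0 delivers
// nothing; it only asks the kernel whether the PID exists and we may
// signal it, so it doubles as a cheap liveness probe.
function isAlive(pid: number): boolean {
  // PID 1 is init; orphans get re-parented to it, so a parent PID of
  // 1 (or below) means the original parent is gone.
  if (pid <= 1) {
    return false
  }
  try {
    process.kill(pid, 0)
    return true
  } catch {
    return false
  }
}

// A worker whose parent no longer exists is a sweep candidate.
function isOrphan(ppid: number): boolean {
  return ppid === 1 || !isAlive(ppid)
}
```

A worker still attached to a live runner fails `isOrphan` and is left alone, which is exactly the "in-progress run" guarantee described above.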
+ +## What's swept + +| Pattern | Source | +| --- | --- | +| `vitest/dist/workers/(forks\|threads)` | Vitest worker pool | +| `vitest/dist/(cli\|node).[mc]?js` | Orphaned Vitest parent runners | +| `\btsgo\b` | TypeScript Go-based type checker | +| `type-coverage/bin/type-coverage` | Type coverage tool | +| `esbuild/(bin\|lib)/.*\bservice\b` | esbuild's daemon service | + +## What's not swept + +- Anything spawned by a still-living shell (PPID alive) +- The Claude Code process itself or its parent terminal +- Anything outside the pattern list + +## Wiring + +In `.claude/settings.json`: + +```json +{ + "hooks": { + "Stop": [ + { + "hooks": [ + { + "type": "command", + "command": "node .claude/hooks/stale-process-sweeper/index.mts" + } + ] + } + ] + } +} +``` + +## Output + +Silent on the happy path (no orphans found). When something is reaped: + +``` +[stale-process-sweeper] reaped 14 stale worker(s), ~1120MB freed: +vitest-worker=29240(95MB), vitest-worker=33278(93MB), … +``` + +The line goes to stderr. Stop-hook output is shown to the user, not +the model — useful diagnostic, doesn't pollute Claude's context. + +## Tests + +```bash +cd .claude/hooks/stale-process-sweeper +node --test test/*.test.mts +``` diff --git a/.claude/hooks/stale-process-sweeper/index.mts b/.claude/hooks/stale-process-sweeper/index.mts new file mode 100644 index 000000000..4e9923e50 --- /dev/null +++ b/.claude/hooks/stale-process-sweeper/index.mts @@ -0,0 +1,214 @@ +#!/usr/bin/env node +// Claude Code Stop hook — stale-process-sweeper. +// +// Fires at turn-end. Finds Node test/build worker processes that the +// session left behind (test runner crashed mid-run, hook timed out, +// user interrupted `Bash`, etc.) and kills them so they don't pile up +// across turns and exhaust system memory. 
+// +// What's swept: +// - vitest workers (`vitest/dist/workers/forks` and the threads pool) +// - vitest itself (orphan parent runners that survived a SIGINT) +// - tsgo / tsc type-check daemons +// - type-coverage workers +// - esbuild service processes +// +// What's NOT swept: +// - Anything spawned by a still-living shell (PPID alive) +// - Anything matching the user's editors / IDEs / terminals +// - The Claude Code process itself +// +// The hook is fast (one `ps` call + a few regex matches + a couple of +// `kill -0` probes) and silent on the happy path. It only writes to +// stderr when it actually killed something — that's a useful signal. +// +// Stop hooks receive JSON on stdin (we don't read it; the body +// shape is irrelevant to our work) and exit code is advisory. + +import { spawnSync } from 'node:child_process' +import process from 'node:process' + +// Process-name patterns that indicate a stale test/build worker. +// Must be specific enough that real user processes (a normal `node` +// invocation, an editor's language server) don't match. +const STALE_PATTERNS: Array<{ name: string; rx: RegExp }> = [ + // Vitest worker pools — both `forks` (process-per-worker) and the + // path the threads pool uses when isolation is requested. The + // canonical leak: Vitest spawns N workers, parent crashes/SIGINTs, + // workers stay alive holding 80–100MB each. + { + name: 'vitest-worker', + rx: /vitest\/dist\/workers\/(forks|threads)/, + }, + // Vitest parent runner that survived its own children's exit. + // Matches `node ... vitest/dist/cli ... run` etc. + { + name: 'vitest-runner', + rx: /vitest\/dist\/(cli|node)\.[mc]?js/, + }, + // tsgo / tsc daemons. `tsgo` is the new Go-based type checker; + // `tsc --watch` daemons can also linger. + { + name: 'tsgo', + rx: /\btsgo\b/, + }, + // type-coverage runs as a separate process and sometimes outlives + // its CI step. 
+ { + name: 'type-coverage', + rx: /type-coverage\/bin\/type-coverage/, + }, + // esbuild's daemon service helper. + { + name: 'esbuild-service', + rx: /esbuild\/(bin|lib)\/.*\bservice\b/, + }, +] + +interface ProcRow { + pid: number + ppid: number + rss: number + command: string +} + +function listProcesses(): ProcRow[] { + // -A: all processes, -o: custom format, no truncation. macOS + Linux + // both support this exact form. Windows isn't supported (Stop hook + // is unix-only in practice for socket-* repos). + const result = spawnSync( + 'ps', + ['-A', '-o', 'pid=,ppid=,rss=,command='], + { encoding: 'utf8' }, + ) + if (result.status !== 0 || !result.stdout) { + return [] + } + const rows: ProcRow[] = [] + for (const line of result.stdout.split('\n')) { + if (!line.trim()) { + continue + } + // Split into [pid, ppid, rss, ...command]. `command` may contain + // arbitrary spaces, so re-join after the first three fields. + const parts = line.trim().split(/\s+/) + if (parts.length < 4) { + continue + } + const pid = Number.parseInt(parts[0]!, 10) + const ppid = Number.parseInt(parts[1]!, 10) + const rss = Number.parseInt(parts[2]!, 10) + if (!Number.isFinite(pid) || !Number.isFinite(ppid)) { + continue + } + const command = parts.slice(3).join(' ') + rows.push({ pid, ppid, rss, command }) + } + return rows +} + +function isAlive(pid: number): boolean { + if (pid <= 1) { + // PID 0 / 1 are the kernel / init — if our parent is one of those, + // we're definitely an orphan, but `kill -0 1` would mislead. 
+ return false + } + try { + process.kill(pid, 0) + return true + } catch { + return false + } +} + +function classify(row: ProcRow): string | undefined { + for (const { name, rx } of STALE_PATTERNS) { + if (rx.test(row.command)) { + return name + } + } + return undefined +} + +function sweep(): { killed: Array<{ pid: number; name: string; rssMb: number }>; skipped: number } { + const rows = listProcesses() + const myPid = process.pid + const myPpid = process.ppid + const killed: Array<{ pid: number; name: string; rssMb: number }> = [] + let skipped = 0 + + for (const row of rows) { + // Never touch ourselves or our parent (Claude Code). + if (row.pid === myPid || row.pid === myPpid) { + continue + } + const name = classify(row) + if (!name) { + continue + } + // Only sweep if the parent is gone (true orphan) or is PID 1 + // (re-parented to init after the original parent exited). A live + // parent means the worker is part of a real, in-progress run we + // should not interrupt. + const orphan = row.ppid === 1 || !isAlive(row.ppid) + if (!orphan) { + skipped += 1 + continue + } + try { + // SIGTERM first — give the worker a chance to flush. We don't + // wait for it; the next sweep (next turn) will SIGKILL anything + // that ignored SIGTERM. Keeping the hook fast matters more than + // squeezing every last byte. + process.kill(row.pid, 'SIGTERM') + killed.push({ + pid: row.pid, + name, + rssMb: Math.round(row.rss / 1024), + }) + } catch { + // Already gone, or we lack permission — nothing to do. + } + } + return { killed, skipped } +} + +function main() { + // Drain stdin (Stop hook delivers a JSON payload). We don't need + // the body, but Node will keep the event loop alive if we don't + // consume it. + process.stdin.resume() + process.stdin.on('data', () => {}) + process.stdin.on('end', runSweep) + // If stdin is already closed (some hook runners don't pipe input), + // run immediately. 
+ if (process.stdin.readable === false) { + runSweep() + } +} + +function runSweep() { + let result: { killed: Array<{ pid: number; name: string; rssMb: number }>; skipped: number } + try { + result = sweep() + } catch (e) { + // Hooks must never crash a Claude turn. Log and exit clean. + process.stderr.write( + `[stale-process-sweeper] unexpected error: ${(e as Error).message}\n`, + ) + process.exit(0) + } + if (result.killed.length > 0) { + const totalMb = result.killed.reduce((sum, k) => sum + k.rssMb, 0) + const breakdown = result.killed + .map(k => `${k.name}=${k.pid}(${k.rssMb}MB)`) + .join(', ') + process.stderr.write( + `[stale-process-sweeper] reaped ${result.killed.length} stale ` + + `worker(s), ~${totalMb}MB freed: ${breakdown}\n`, + ) + } + process.exit(0) +} + +main() diff --git a/.claude/hooks/stale-process-sweeper/package.json b/.claude/hooks/stale-process-sweeper/package.json new file mode 100644 index 000000000..1a0f6de11 --- /dev/null +++ b/.claude/hooks/stale-process-sweeper/package.json @@ -0,0 +1,12 @@ +{ + "name": "hook-stale-process-sweeper", + "private": true, + "type": "module", + "main": "./index.mts", + "exports": { + ".": "./index.mts" + }, + "scripts": { + "test": "node --test test/*.test.mts" + } +} diff --git a/.claude/hooks/stale-process-sweeper/test/stale-process-sweeper.test.mts b/.claude/hooks/stale-process-sweeper/test/stale-process-sweeper.test.mts new file mode 100644 index 000000000..56ac35727 --- /dev/null +++ b/.claude/hooks/stale-process-sweeper/test/stale-process-sweeper.test.mts @@ -0,0 +1,84 @@ +import { spawn } from 'node:child_process' +import { fileURLToPath } from 'node:url' +import path from 'node:path' +import { test } from 'node:test' +import assert from 'node:assert/strict' + +const __dirname = path.dirname(fileURLToPath(import.meta.url)) +const HOOK = path.resolve(__dirname, '..', 'index.mts') + +// Run the hook with an empty stdin payload (Stop hook delivers JSON, +// but the body is unused). 
Captures stderr + exit code. +function runHook(): Promise<{ code: number; stderr: string }> { + return new Promise((resolve, reject) => { + const child = spawn(process.execPath, [HOOK], { + stdio: ['pipe', 'ignore', 'pipe'], + }) + let stderr = '' + child.stderr.on('data', d => { + stderr += d.toString() + }) + child.on('error', reject) + child.on('exit', code => { + resolve({ code: code ?? -1, stderr }) + }) + // Stop hooks receive a JSON payload on stdin. Send an empty object + // so the hook's drain logic completes. + child.stdin.end('{}\n') + }) +} + +test('stale-process-sweeper: exits 0 when nothing to sweep', async () => { + const { code, stderr } = await runHook() + assert.equal(code, 0, `hook should exit 0; stderr=${stderr}`) + // On a clean host the hook should be silent. + assert.equal( + stderr, + '', + `hook should be silent when no orphans exist; got: ${stderr}`, + ) +}) + +test('stale-process-sweeper: ignores live-parent test workers', async () => { + // Spawn a fake "vitest worker" whose parent is still alive. The + // sweeper must not touch it. We use a script path that matches the + // worker regex; the actual command runs `node -e 'setTimeout(...)'` + // long enough to outlive the hook invocation. + // + // Note: matching the regex `vitest/dist/workers/forks` requires a + // command line that contains that substring. We can't easily forge + // a real vitest binary, so we approximate by passing the path as an + // argv string — `ps -o command=` reflects argv, and the regex sees + // it. + const fakeWorker = spawn( + process.execPath, + [ + '-e', + 'setTimeout(() => {}, 5000)', + // This dummy arg is what `ps` will report; the sweeper's regex + // picks it up. The worker still has a live parent (this test + // process), so the sweeper should NOT kill it. + '/fake/vitest/dist/workers/forks.js', + ], + { stdio: 'ignore', detached: false }, + ) + // Give the OS a moment to register the child. 
+ await new Promise(r => setTimeout(r, 100)) + try { + const { code, stderr } = await runHook() + assert.equal(code, 0) + // Should NOT have reaped the fake worker — its parent (us) is + // alive. If the hook killed it, the message would mention it. + assert.ok( + !stderr.includes('reaped'), + `hook reaped a live-parent worker: ${stderr}`, + ) + // Verify the worker is still alive. + assert.ok( + !fakeWorker.killed && fakeWorker.exitCode === null, + 'fake worker should still be running', + ) + } finally { + fakeWorker.kill('SIGKILL') + } +}) diff --git a/.claude/hooks/stale-process-sweeper/tsconfig.json b/.claude/hooks/stale-process-sweeper/tsconfig.json new file mode 100644 index 000000000..53c5c8475 --- /dev/null +++ b/.claude/hooks/stale-process-sweeper/tsconfig.json @@ -0,0 +1,15 @@ +{ + "compilerOptions": { + "declarationMap": false, + "erasableSyntaxOnly": true, + "module": "nodenext", + "moduleResolution": "nodenext", + "noEmit": true, + "rewriteRelativeImportExtensions": true, + "skipLibCheck": true, + "sourceMap": false, + "strict": true, + "target": "esnext", + "verbatimModuleSyntax": true + } +} diff --git a/.claude/hooks/token-hygiene/README.md b/.claude/hooks/token-guard/README.md similarity index 92% rename from .claude/hooks/token-hygiene/README.md rename to .claude/hooks/token-guard/README.md index ad3b89c21..3b1ae32af 100644 --- a/.claude/hooks/token-hygiene/README.md +++ b/.claude/hooks/token-guard/README.md @@ -1,6 +1,6 @@ -# token-hygiene +# token-guard -Claude Code `PreToolUse` hook that refuses Bash tool calls that would leak secrets to tool output. Mandatory across the Socket fleet — every repo ships this file byte-for-byte via `scripts/sync-scaffolding.mjs`. +Claude Code `PreToolUse` hook that refuses Bash tool calls that would leak secrets to tool output. Mandatory across the Socket fleet — every repo ships this file byte-for-byte via `scripts/sync-scaffolding.mts`. 
## What it blocks @@ -41,17 +41,17 @@ The hook reads the tool-use payload from stdin, type-checks `tool_name === 'Bash ## Testing ```bash -pnpm --filter @socketsecurity/hook-token-hygiene test +pnpm --filter hook-token-guard test ``` -Adding new token-shape detections: update `LITERAL_TOKEN_PATTERNS` in `index.mts`, add a positive and negative test in `test/token-hygiene.test.mts`. +Adding new token-shape detections: update `LITERAL_TOKEN_PATTERNS` in `index.mts`, add a positive and negative test in `test/token-guard.test.mts`. ## Updating across the fleet -This file is in `IDENTICAL_FILES` in `scripts/sync-scaffolding.mjs`. After editing, run from `socket-repo-template`: +This file is in `IDENTICAL_FILES` in `scripts/sync-scaffolding.mts`. After editing, run from `socket-repo-template`: ```bash -node scripts/sync-scaffolding.mjs --all --fix +node scripts/sync-scaffolding.mts --all --fix ``` to propagate the change to every fleet repo. diff --git a/.claude/hooks/token-hygiene/index.mts b/.claude/hooks/token-guard/index.mts similarity index 83% rename from .claude/hooks/token-hygiene/index.mts rename to .claude/hooks/token-guard/index.mts index f9260547f..8ea2dba2a 100644 --- a/.claude/hooks/token-hygiene/index.mts +++ b/.claude/hooks/token-guard/index.mts @@ -1,5 +1,5 @@ #!/usr/bin/env node -// Claude Code PreToolUse hook — token-hygiene firewall. +// Claude Code PreToolUse hook — token-guard firewall. // // Blocks Bash commands that would echo token-bearing env vars into // tool output. This fires BEFORE the command runs; exit code 2 makes @@ -41,9 +41,15 @@ const SENSITIVE_ENV_NAMES = [ ] // Pipelines that "launder" earlier-stage secrets into safe output. +// The first two patterns match `sed 's/.../redact.../'` and +// `sed 's/.../FOO=*****/'` regardless of which delimiter sed uses +// (`/`, `#`, `|`). 
`[\s\S]*?` reaches across the delimiter between +// the search and replacement parts (the previous `[^/|#]*` couldn't +// cross `/` and so missed the canonical `sed 's/=.*/=/'` +// — the very command the token-guard error message suggests). const REDACTION_MARKERS = [ - /\bsed\b[^|]*s[/|#][^/|#]*=[^/|#]*\s*\/dev\/null/, @@ -120,9 +126,19 @@ type ToolInput = { const hasRedaction = (command: string): boolean => REDACTION_MARKERS.some(re => re.test(command)) +// Word-boundary match so `PASS` doesn't fire on `PATHS-ALLOWLIST` and +// `AUTH` doesn't fire on `AUTHOR`. Env-var-style boundaries treat `_` +// as a separator (so `ACCESS_TOKEN` matches `TOKEN`) but require a +// non-alphanumeric character on each end (so `PATHS` doesn't match +// `PASS`). The pre-fix substring match created false positives +// whenever a path name happened to contain a sensitive keyword as a +// literal substring. +const sensitiveEnvBoundaryRes = SENSITIVE_ENV_NAMES.map( + frag => new RegExp(String.raw`(?:^|[^A-Z0-9])${frag}(?:[^A-Z0-9]|$)`), +) const referencesSensitiveEnv = (command: string): boolean => { const upper = command.toUpperCase() - return SENSITIVE_ENV_NAMES.some(frag => upper.includes(frag)) + return sensitiveEnvBoundaryRes.some(re => re.test(upper)) } const matchesAlwaysDangerous = (command: string): RegExp | null => { @@ -148,9 +164,11 @@ const check = (command: string): void => { } } - // 1. Always-dangerous patterns. + // 1. Always-dangerous patterns. Skip when the command already has a + // redaction pipeline — the suggested fix here is `env | sed ...`, + // which would itself match ALWAYS_DANGEROUS without this guard. const dangerous = matchesAlwaysDangerous(command) - if (dangerous) { + if (dangerous && !hasRedaction(command)) { throw new BlockError( `\`${dangerous.source}\` dumps env to stdout`, 'Pipe through redaction, e.g. `env | sed "s/=.*/=/"` or filter specific keys.', @@ -204,7 +222,7 @@ const emitBlock = (command: string, err: BlockError): void => { ? 
command.slice(0, 200) + (command.length > 200 ? '…' : '') : '' process.stderr.write( - `\n[token-hygiene] Blocked: ${err.rule}\n` + + `\n[token-guard] Blocked: ${err.rule}\n` + ` Command: ${safeCommand}\n` + ` Fix: ${err.suggestion}\n\n`, ) @@ -244,6 +262,6 @@ const main = async (): Promise => { main().catch(e => { // Never block a tool call due to a bug in the hook itself. Log it // so we notice, but fail open. - process.stderr.write(`[token-hygiene] hook error (allowing): ${e}\n`) + process.stderr.write(`[token-guard] hook error (allowing): ${e}\n`) process.exitCode = 0 }) diff --git a/.claude/hooks/token-guard/package.json b/.claude/hooks/token-guard/package.json new file mode 100644 index 000000000..fc68951d8 --- /dev/null +++ b/.claude/hooks/token-guard/package.json @@ -0,0 +1,12 @@ +{ + "name": "hook-token-guard", + "private": true, + "type": "module", + "main": "./index.mts", + "exports": { + ".": "./index.mts" + }, + "scripts": { + "test": "node --test test/*.test.mts" + } +} diff --git a/.claude/hooks/token-hygiene/test/token-hygiene.test.mts b/.claude/hooks/token-guard/test/token-guard.test.mts similarity index 85% rename from .claude/hooks/token-hygiene/test/token-hygiene.test.mts rename to .claude/hooks/token-guard/test/token-guard.test.mts index 27ea25fec..b2ab67147 100644 --- a/.claude/hooks/token-hygiene/test/token-hygiene.test.mts +++ b/.claude/hooks/token-guard/test/token-guard.test.mts @@ -1,5 +1,5 @@ /** - * @fileoverview Tests for the token-hygiene hook. + * @fileoverview Tests for the token-guard hook. * * Runs the hook as a subprocess (node --test), piping a tool-use * payload on stdin and asserting on the exit code + stderr. 
Exit 2 @@ -40,7 +40,7 @@ function runHook(command: string, toolName = 'Bash'): { } } -describe('token-hygiene hook', () => { +describe('token-guard hook', () => { describe('allows safe commands', () => { it('plain echo', () => { assert.equal(runHook('echo hello').code, 0) }) @@ -182,6 +182,25 @@ }) }) + describe('does not false-positive on substring of sensitive name', () => { + // Regression guard: the pre-fix unbounded substring match flagged + // any command whose upper-cased text merely contained a sensitive + // fragment. The word-boundary fix means `PASS` must be a + // standalone token (or sit at a `_`/`-`/`.`/`/` boundary). + it('paths-allowlist.yml does not trip PASS', () => { + assert.equal(runHook('cat .github/paths-allowlist.yml').code, 0) + }) + it('AUTHOR_NAME does not trip AUTH', () => { + // `AUTH` is followed by `O`, an alphanumeric character, which + // is not a token boundary, so `AUTHOR` never registers as + // `AUTH`.
+ assert.equal(runHook('echo $AUTHOR_NAME').code, 0) + }) + it('PASSAGE_TIME does not trip PASS', () => { + assert.equal(runHook('echo $PASSAGE_TIME').code, 0) + }) + }) + describe('fails open on malformed input', () => { it('empty stdin', () => { const r = spawnSync(nodeBin, [hookScript], { diff --git a/.claude/hooks/token-guard/tsconfig.json b/.claude/hooks/token-guard/tsconfig.json new file mode 100644 index 000000000..53c5c8475 --- /dev/null +++ b/.claude/hooks/token-guard/tsconfig.json @@ -0,0 +1,15 @@ +{ + "compilerOptions": { + "declarationMap": false, + "erasableSyntaxOnly": true, + "module": "nodenext", + "moduleResolution": "nodenext", + "noEmit": true, + "rewriteRelativeImportExtensions": true, + "skipLibCheck": true, + "sourceMap": false, + "strict": true, + "target": "esnext", + "verbatimModuleSyntax": true + } +} diff --git a/.claude/settings.json b/.claude/settings.json index 3490c309f..cca698752 100644 --- a/.claude/settings.json +++ b/.claude/settings.json @@ -7,22 +7,65 @@ { "type": "command", "command": "node .claude/hooks/check-new-deps/index.mts" + }, + { + "type": "command", + "command": "node .claude/hooks/logger-guard/index.mts" + }, + { + "type": "command", + "command": "node .claude/hooks/path-guard/index.mts" } ] }, { "matcher": "Bash", "hooks": [ + { + "type": "command", + "command": "node .claude/hooks/private-name-guard/index.mts" + }, { "type": "command", "command": "node .claude/hooks/public-surface-reminder/index.mts" }, { "type": "command", - "command": "node .claude/hooks/token-hygiene/index.mts" + "command": "node .claude/hooks/release-workflow-guard/index.mts" + }, + { + "type": "command", + "command": "node .claude/hooks/token-guard/index.mts" } ] } + ], + "Stop": [ + { + "hooks": [ + { + "type": "command", + "command": "node .claude/hooks/auth-rotation-reminder/index.mts" + }, + { + "type": "command", + "command": "node .claude/hooks/stale-process-sweeper/index.mts" + } + ] + } + ] + }, + "permissions": { + "deny": [ + 
"Bash(gh release create:*)", + "Bash(gh release delete:*)", + "Bash(gh workflow dispatch:*)", + "Bash(gh workflow run:*)", + "Bash(git push --force:*)", + "Bash(git push -f:*)", + "Bash(npm publish:*)", + "Bash(pnpm publish:*)", + "Bash(yarn publish:*)" ] } } diff --git a/.claude/skills/_shared/path-guard-rule.md b/.claude/skills/_shared/path-guard-rule.md new file mode 100644 index 000000000..2447f8b71 --- /dev/null +++ b/.claude/skills/_shared/path-guard-rule.md @@ -0,0 +1,39 @@ + + +## 1 path, 1 reference + +**A path is *constructed* exactly once. Everywhere else *references* the constructed value.** + +Referencing a single computed path many times is fine — that's the whole point of computing it once. What's banned is *re-constructing* the same path in multiple places, because that's where drift is born. Three concrete shapes: + +1. **Within a package** — every script, test, and lib file that needs a build path imports it from the package's `scripts/paths.mts` (or `lib/paths.mts`). No `path.join('build', mode, ...)` outside that module. + +2. **Across packages** — when package B consumes package A's output, B imports A's `paths.mts` via the workspace `exports` field. Never `path.join(PKG, '..', '', 'build', ...)`. The R28 yoga/ink bug — ink hand-building yoga's wasm path and missing the `wasm/` segment — is the canonical failure mode this rule prevents. + +3. **Workflows, Dockerfiles, shell scripts** — they can't `import` TS, so they construct the string once and reference it everywhere downstream. Workflows: a "Compute paths" step exposes `steps.paths.outputs.final_dir`; later steps read `${{ steps.paths.outputs.final_dir }}`. Dockerfiles/shell: assign once to a variable, reference by name thereafter. Each canonical construction carries a comment naming the source-of-truth `paths.mts` so the YAML can't drift from TS without a flagged change. **Re-building** the same path in a second step is the violation, not referring to the constructed value many times. 
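For the workflow case, a minimal sketch of the construct-once pattern; step names, output names, and the path string are illustrative, not this repo's actual layout:

```yaml
jobs:
  build:
    steps:
      # Canonical construction, done exactly once. Keep the string in
      # sync with the source-of-truth paths.mts (named in a comment so
      # drift from TS is a flagged change).
      - id: paths
        name: Compute paths
        run: echo "final_dir=build/out/Final" >> "$GITHUB_OUTPUT"
      # Downstream steps reference the constructed value; none of them
      # re-joins the segments.
      - name: Inspect output
        run: ls "${{ steps.paths.outputs.final_dir }}"
```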
+ +Comments that re-state a full path are forbidden. The import statement IS the comment. Docs and READMEs may describe the structure ("output goes under the Final dir") but should not encode a complete `build///out/Final/binary` string — encoded paths get parsed by tools and silently rot. + +Code execution takes priority over docs: violations in `.mts`/`.cts`, Makefiles, Dockerfiles, workflow YAML, and shell scripts are blocking. README and doc-comment violations are advisory unless they contain a fully-qualified path with no parametric placeholders. + +### Three-level enforcement + +- **Hook** — `.claude/hooks/path-guard/` blocks `Edit`/`Write` calls that would introduce a violation in a `.mts`/`.cts` file. Refusal at edit time stops new duplication from landing. +- **Gate** — `scripts/check-paths.mts` runs in `pnpm check`. Fails the build on any violation that isn't allowlisted. +- **Skill** — `/path-guard` audits the repo and fixes findings; `/path-guard check` reports only; `/path-guard install` drops the gate + hook + rule into a fresh repo. + +The mantra is intentionally short so it sticks: **1 path, 1 reference**. When in doubt, find the canonical owner and import from it. diff --git a/.claude/skills/path-guard/SKILL.md b/.claude/skills/path-guard/SKILL.md new file mode 100644 index 000000000..8ff21c2b8 --- /dev/null +++ b/.claude/skills/path-guard/SKILL.md @@ -0,0 +1,250 @@ +--- +name: path-guard +description: Audit and fix path duplication in this Socket repo. Apply the strict "1 path, 1 reference" rule — every build/test/runtime/config path is constructed exactly once; everywhere else references the constructed value. Default mode finds and fixes; `check` mode reports only; `install` mode drops the gate + hook + rule into a fresh repo. 
+user-invocable: true +allowed-tools: Task, Read, Edit, Write, Grep, Glob, AskUserQuestion, Bash(pnpm run check:*), Bash(node scripts/check-paths:*), Bash(rg:*), Bash(grep:*), Bash(find:*), Bash(git:*) +--- + +# path-guard + +**Mantra: 1 path, 1 reference.** A path is constructed exactly once; everywhere else references the constructed value. Re-constructing the same path twice is the violation, not referencing the constructed value many times. + +## Modes + +- `/path-guard` — full audit-and-fix conversion of the current repo (default). +- `/path-guard check` — read-only audit, report violations, no fixes. +- `/path-guard fix ` — fix a single finding from a prior `check` run, by index. +- `/path-guard install` — drop the gate + hook + rule + allowlist into a fresh repo (for new Socket repos). + +## Three-level enforcement + +The strategy lives in three artifacts that ship together: + +1. **CLAUDE.md rule** — the mantra and detection rules in plain language. Every Socket repo's CLAUDE.md carries `## 1 path, 1 reference`. Synced from `.claude/skills/_shared/path-guard-rule.md`. +2. **Hook** — `.claude/hooks/path-guard/index.mts` runs `PreToolUse` on `Edit`/`Write` of `.mts`/`.cts` files. Blocks new violations at edit time. Mandatory across the fleet. +3. **Gate** — `scripts/check-paths.mts` runs in `pnpm check` (and CI). Whole-repo scan. Fails the build on any unsanctioned violation. + +The hook and gate share their stage / build-root / mode / sibling-package vocabulary via `.claude/hooks/path-guard/segments.mts` — a single canonical source. Adding a new stage segment or fleet package means editing one file; the two consumers can never drift on what counts as a build-output path. + +This skill is the *audit-and-fix workflow* that makes a repo conform initially and validates conformance over time. + +## Detection rules + +The gate enforces six rules. The hook enforces a subset (A and B) since it sees only one diff at a time. 
+ +| Rule | What it catches | Where checked | +|---|---|---| +| **A** | Multi-stage `path.join(...)` constructed inline. Two or more "stage" segments (Final, Release, Stripped, Compressed, Optimized, Synced, wasm, downloaded), or one stage + build-root + mode. | `.mts`/`.cts` files outside a `paths.mts`. Hook + gate. | +| **B** | Cross-package traversal: `path.join(*, '..', '', 'build', ...)` reaching into a sibling's output instead of importing via `exports`. | `.mts`/`.cts` files. Hook + gate. | +| **C** | Workflow YAML constructs the same path string in 2+ steps outside a "Compute paths" step. | `.github/workflows/*.yml`. Gate. | +| **D** | Comment encodes a fully-qualified multi-stage path string (e.g. `# build/dev/darwin-arm64/out/Final/binary`). | `.github/workflows/*.yml`. Gate. | +| **F** | Same path shape constructed in 2+ different files. | All scanned files. Gate. | +| **G** | Hand-built multi-stage path constructed 2+ times in the same Makefile/Dockerfile/shell stage. | `Makefile`, `*.mk`, `*.Dockerfile`, `Dockerfile.*`, `*.sh`. Gate. | + +Comments may describe path *structure* with placeholders (`/` or `${BUILD_MODE}/${PLATFORM_ARCH}`) but should not encode a complete literal path string. Code execution takes priority over docs: violations in `.mts`, Makefiles, Dockerfiles, workflow YAML, shell scripts are blocking. + +## Mode: audit-and-fix (default) + +When invoked as `/path-guard` with no arg: + +1. **Setup** — spawn a worktree off `main` per `CLAUDE.md` parallel-sessions rule: + ```bash + git worktree add -b paths-audit ../-paths-audit main + cd ../-paths-audit + ``` + +2. **Audit** — run the gate to enumerate findings: + ```bash + pnpm run check:paths --json > /tmp/paths-findings.json + pnpm run check:paths --explain # human-readable + ``` + +3. **Fix loop** — for each finding, apply the matching pattern below. After each fix, re-run the gate. Stop iterating when `pnpm run check:paths` exits 0. + +4. 
**Verify** — run the full check suite + zizmor on any modified workflow: + ```bash + pnpm check + for w in .github/workflows/*.yml; do zizmor "$w"; done + ``` + +5. **Commit and push** — group fixes by logical category (workflows, code, Dockerfiles). Push directly to `main` for repos that allow direct push, or open a PR for repos that require it (socket-cli, socket-sdk-js, socket-registry per their CLAUDE.md / memory entries). + +## Fix patterns + +### Rule A — Multi-stage path constructed inline (in `.mts`/`.cts`) + +**Bad**: +```ts +const finalBinary = path.join(PACKAGE_ROOT, 'build', BUILD_MODE, PLATFORM_ARCH, 'out', 'Final', 'binary') +``` + +**Fix**: move the construction into the package's `scripts/paths.mts` (or `lib/paths.mts`), or use a build-infra helper: +```ts +// In packages/foo/scripts/paths.mts: +export function getBuildPaths(mode, platformArch) { + // ... constructs once ... + return { outputFinalBinary: path.join(PACKAGE_ROOT, 'build', mode, platformArch, 'out', 'Final', binaryName) } +} + +// In the consumer: +import { getBuildPaths } from './paths.mts' +const { outputFinalBinary } = getBuildPaths(mode, platformArch) +``` + +For binsuite tools (binpress/binflate/binject) the canonical helper is `getFinalBinaryPath(packageRoot, mode, platformArch, binaryName)` from `build-infra/lib/paths`. For download caches use `getDownloadedDir(packageRoot)`. + +### Rule B — Cross-package traversal + +**Bad**: +```ts +const liefDir = path.join(PACKAGE_ROOT, '..', 'lief-builder', 'build', mode, platformArch, 'out', 'Final', 'lief') +``` + +**Fix**: declare the workspace dep, expose `paths.mts` via the producer's `exports`, import the helper: + +1. In producer's `package.json`: + ```json + "exports": { + "./scripts/paths": "./scripts/paths.mts" + } + ``` +2. In consumer's `package.json` `dependencies`: + ```json + "lief-builder": "workspace:*" + ``` +3. 
In consumer: + ```ts + import { getBuildPaths as getLiefBuildPaths } from 'lief-builder/scripts/paths' + const { outputFinalDir } = getLiefBuildPaths(mode, platformArch) + ``` + +### Rule C — Workflow path repetition + +**Bad** (3 steps each rebuilding the same path): +```yaml +- name: Step A + run: cd packages/foo/build/${BUILD_MODE}/${PLATFORM_ARCH}/out/Final && do-thing-1 +- name: Step B + run: cd packages/foo/build/${BUILD_MODE}/${PLATFORM_ARCH}/out/Final && do-thing-2 +- name: Step C + run: cd packages/foo/build/${BUILD_MODE}/${PLATFORM_ARCH}/out/Final && do-thing-3 +``` + +**Fix**: add a "Compute paths" step early in the job that constructs the path once, expose via `$GITHUB_OUTPUT`, reference downstream: + +```yaml +- name: Compute foo paths + id: paths + env: + BUILD_MODE: ${{ steps.build-mode.outputs.mode }} + PLATFORM_ARCH: ${{ steps.platform-arch.outputs.platform_arch }} + run: | + PACKAGE_DIR="packages/foo" + PLATFORM_BUILD_DIR="${PACKAGE_DIR}/build/${BUILD_MODE}/${PLATFORM_ARCH}" + FINAL_DIR="${PLATFORM_BUILD_DIR}/out/Final" + { + echo "package_dir=${PACKAGE_DIR}" + echo "platform_build_dir=${PLATFORM_BUILD_DIR}" + echo "final_dir=${FINAL_DIR}" + } >> "$GITHUB_OUTPUT" + +- name: Step A + env: + FINAL_DIR: ${{ steps.paths.outputs.final_dir }} + run: cd "$FINAL_DIR" && do-thing-1 +# ... etc +``` + +For paths used inside `working-directory: packages/foo` steps, expose a `_rel` companion (e.g. `final_dir_rel=build/${BUILD_MODE}/${PLATFORM_ARCH}/out/Final`) and reference that. + +### Rule D — Comment-encoded paths + +**Bad**: +```yaml +# Path: packages/foo/build/dev/darwin-arm64/out/Final/binary +COPY --from=builder /build/.../out/Final/binary /out/Final/binary +``` + +**Fix**: cite the canonical `paths.mts` instead of duplicating the string: +```yaml +# Layout owned by packages/foo/scripts/paths.mts:getBuildPaths(). 
+COPY --from=builder /build/packages/foo/build/${BUILD_MODE}/${PLATFORM_ARCH}/out/Final/binary /out/Final/binary +``` + +The comment may describe structure (`/`) but should not be a parsable literal path. + +### Rule G — Dockerfile/Makefile/shell duplicate construction + +**Bad** (Dockerfile reconstructs the path 3 times in the same stage): +```dockerfile +RUN mkdir -p build/${BUILD_MODE}/${PLATFORM_ARCH}/out/Final && \ + cp src build/${BUILD_MODE}/${PLATFORM_ARCH}/out/Final/output && \ + ls build/${BUILD_MODE}/${PLATFORM_ARCH}/out/Final/ +``` + +**Fix**: declare an `ENV` once, reference everywhere: +```dockerfile +# Layout owned by packages/foo/scripts/paths.mts. +ENV FINAL_DIR=build/${BUILD_MODE}/${PLATFORM_ARCH}/out/Final +RUN mkdir -p "$FINAL_DIR" && cp src "$FINAL_DIR/output" && ls "$FINAL_DIR/" +``` + +Each Dockerfile `FROM` stage is its own scope — ENV from the build stage doesn't reach a subsequent `FROM scratch AS export` stage. The gate accounts for this. + +## Mode: check (read-only) + +When invoked as `/path-guard check`: + +```bash +pnpm run check:paths --explain +``` + +Print the gate's findings without making any edits. Exit 0 if clean, 1 if findings present. Useful for CI / pre-merge inspection. + +## Allowlisting a finding + +When a genuine exemption is needed (rare — most "false positives" should be reported as gate bugs), add an entry to `.github/paths-allowlist.yml`. Two ways to pin the entry to a specific site: + +- **`line:`** — exact line number. Strict; a single-line edit above shifts the entry off-target and the finding re-surfaces. +- **`snippet_hash:`** — 12-char SHA-256 prefix of the offending snippet (whitespace-normalized). Drift-resistant: survives reformatting, but any content-changing edit invalidates it. Get the hash: + ```bash + pnpm run check:paths --show-hashes + ``` + +Both may be set — either matching is sufficient. 
Prefer `snippet_hash` over raw `line:` when the exemption is expected to outlive routine reformatting; prefer `line:` when you specifically *want* the entry to fall off after any nearby edit. + +## Mode: install (new repo) + +When invoked as `/path-guard install` on a Socket repo that doesn't yet have the gate: + +1. Copy the gate file from this skill's reference dir: + ```bash + cp .claude/skills/path-guard/reference/check-paths.mts.tmpl scripts/check-paths.mts + ``` +2. Copy the empty allowlist: + ```bash + cp .claude/skills/path-guard/reference/paths-allowlist.yml.tmpl .github/paths-allowlist.yml + ``` +3. Add `"check:paths": "node scripts/check-paths.mts"` to `package.json`. +4. Wire `runPathHygieneCheck()` into `scripts/check.mts` (after the existing checks). +5. Append the rule snippet from `.claude/skills/_shared/path-guard-rule.md` to the repo's `CLAUDE.md` if a `1 path, 1 reference` section is missing. +6. Add the hook entry to `.claude/settings.json` `PreToolUse` matcher `Edit|Write`: + ```json + { "type": "command", "command": "node .claude/hooks/path-guard/index.mts" } + ``` +7. Run the gate against the repo. Triage findings as you would in audit-and-fix mode. + +## Tie-in with quality-scan + +The `/quality-scan` skill should call `pnpm run check:paths --json` as one of its sub-scans and surface findings as part of its A-F graded report. Failures roll into the overall quality grade. The full audit-and-fix workflow lives here; quality-scan just *detects* during periodic scans. + +## Reference patterns + +When converting a repo to the strategy, the patterns I keep reusing: + +- **TS-first packages**: each package owns a `scripts/paths.mts` with `PACKAGE_ROOT`, `BUILD_ROOT`, `getBuildPaths(mode, platformArch)` returning at minimum `outputFinalDir` and `outputFinalBinary`/`outputFinalFile`. +- **Cross-package consumers**: `package.json` `exports` whitelists `./scripts/paths`. Consumer adds `": workspace:*"` and imports. 
+- **Workflows**: each job has a "Compute paths" step (`id: paths`) early in the job. Step outputs include `package_dir`, `platform_build_dir`, `final_dir`, named files. `_rel` companions when `working-directory:` is used. +- **Docker stages**: each `FROM` stage declares `ENV PLATFORM_BUILD_DIR=...` and `ENV FINAL_DIR=...` once. Subsequent RUN steps reference the variables. + +The first repo (socket-btm) is the worked example. Read its `scripts/paths.mts` files and `.github/workflows/*.yml` for canonical patterns when applying the strategy elsewhere. diff --git a/.claude/skills/path-guard/reference/check-paths.mts.tmpl b/.claude/skills/path-guard/reference/check-paths.mts.tmpl new file mode 100644 index 000000000..cbecc71e5 --- /dev/null +++ b/.claude/skills/path-guard/reference/check-paths.mts.tmpl @@ -0,0 +1,947 @@ +#!/usr/bin/env node +/** + * @fileoverview Path-hygiene gate. + * + * Mantra: 1 path, 1 reference. A path is constructed exactly once; + * everywhere else references the constructed value. + * + * Whole-repo scan complementing the per-edit `.claude/hooks/path-guard` + * hook. The hook stops new violations from landing; this gate finds + * the existing ones and blocks merges that introduce more. + * + * Rules enforced: + * + * A — Multi-stage path constructed inline. A `path.join(...)` call + * (or template literal) in a `.mts`/`.cts` file outside a + * `paths.mts` that stitches together two or more "stage" + * segments (Final, Release, Stripped, Compressed, Optimized, + * Synced, wasm, downloaded), or one stage plus a build-root + * (`build`/`out`) plus a mode (`dev`/`prod`/`shared`). The + * construction belongs in the package's `paths.mts` (or a + * build-infra helper); every consumer imports the computed + * value. + * + * B — Cross-package path traversal. A `path.join(*, '..', '', 'build', ...)` reaches into a sibling's build + * output without going through its `exports`. 
The sibling owns + * its layout; consumers declare a workspace dep and import the + * sibling's `paths.mts`. + * + * C — Hand-built workflow path. A `.github/workflows/*.yml` step + * constructs `build/${...}/out//...` inline outside a + * canonical "Compute paths" step. Workflows can carry path + * strings, but the strings are constructed once and exposed via + * step outputs / job env that downstream steps reference. + * + * D — Comment-encoded paths. Comments (in code or YAML) that re-state + * a fully-qualified multi-stage path. Comments may describe the + * structure ("Final dir" or "build//...") but should not + * encode a complete path string that a tool would parse — the + * canonical construction IS the documentation. + * + * F — Same path constructed in multiple places. The same shape of + * multi-stage `path.join(...)` (or workflow `build/${...}/...` + * string template) appearing in two or more files. Construct + * once and import; references of the constructed value are + * unlimited. + * + * G — Hand-built paths in Makefiles, Dockerfiles, and shell scripts. + * Same shape as A, applied to executable artifacts that don't + * run TypeScript. Each canonical construction must carry a + * comment naming the source-of-truth `paths.mts` so the script + * can't drift from TS without a flagged change. + * + * Allowlist: `.github/paths-allowlist.yml`. Each entry needs a + * `reason` so the list stays audit-able. Patterns are deliberately + * narrow — entries should be specific, not blanket. 
+ * + * Usage: + * node scripts/check-paths.mts # default: report + fail + * node scripts/check-paths.mts --explain # long-form explanation + * node scripts/check-paths.mts --json # machine-readable + * node scripts/check-paths.mts --quiet # silent on clean + * + * Exit codes: + * 0 — clean (no findings, or every finding is allowlisted) + * 1 — findings present + * 2 — gate itself crashed + */ + +import { createHash } from 'node:crypto' +import { existsSync, readFileSync, readdirSync } from 'node:fs' +import path from 'node:path' +import process from 'node:process' + +import { fileURLToPath } from 'node:url' + +import { parseArgs } from 'node:util' + +import { + BUILD_ROOT_SEGMENTS, + KNOWN_SIBLING_PACKAGES, + MODE_SEGMENTS, + STAGE_SEGMENTS, +} from '../.claude/hooks/path-guard/segments.mts' + +// Plain stderr/stdout output — no @socketsecurity/lib dependency so +// the gate is self-contained and works in socket-lib itself (which +// would otherwise import itself). +const logger = { + log: (msg: string) => process.stdout.write(msg + '\n'), + error: (msg: string) => process.stderr.write(msg + '\n'), + step: (msg: string) => process.stdout.write(`→ ${msg}\n`), + success: (msg: string) => process.stdout.write(`✔ ${msg}\n`), + substep: (msg: string) => process.stdout.write(` ${msg}\n`), +} + +const __filename = fileURLToPath(import.meta.url) +const __dirname = path.dirname(__filename) +const REPO_ROOT = path.resolve(__dirname, '..') + +// Stage / build-root / mode / sibling-package vocabularies are imported +// from `.claude/hooks/path-guard/segments.mts` (the canonical source). +// Both this gate and the path-guard hook share that single definition +// — Mantra: 1 path, 1 reference. + +// File-path patterns that legitimately enumerate path segments. +const EXEMPT_FILE_PATTERNS: RegExp[] = [ + // Any paths.mts is the canonical constructor. + /(^|\/)paths\.(mts|cts|js)$/, + // Build-infra owns shared helpers that enumerate stages. 
+ /packages\/build-infra\/lib\/paths\.mts$/, + /packages\/build-infra\/lib\/constants\.mts$/, + // Path-scanning gates that intentionally enumerate. + /scripts\/check-paths\.mts$/, + /scripts\/check-consistency\.mts$/, + /\.claude\/hooks\/path-guard\//, + // Allowlist + config files. + /\.github\/paths-allowlist\.yml$/, +] + +type Finding = { + rule: 'A' | 'B' | 'C' | 'D' | 'F' | 'G' + file: string + line: number + snippet: string + message: string + fix: string +} + +const findings: Finding[] = [] + +const args = parseArgs({ + options: { + explain: { type: 'boolean', default: false }, + json: { type: 'boolean', default: false }, + quiet: { type: 'boolean', default: false }, + 'show-hashes': { type: 'boolean', default: false }, + }, + strict: false, +}) + +const isExempt = (filePath: string): boolean => + EXEMPT_FILE_PATTERNS.some(re => re.test(filePath)) + +// ────────────────────────────────────────────────────────────────── +// Allowlist loading +// ────────────────────────────────────────────────────────────────── + +type AllowlistEntry = { + file?: string + pattern?: string + rule?: string + line?: number + snippet_hash?: string + reason: string +} + +const loadAllowlist = (): AllowlistEntry[] => { + const allowlistPath = path.join(REPO_ROOT, '.github', 'paths-allowlist.yml') + if (!existsSync(allowlistPath)) { + return [] + } + const text = readFileSync(allowlistPath, 'utf8') + // Tiny YAML parser — only the shape we need: list of entries with + // `file`, `pattern`, `rule`, `line`, `reason` scalar fields, plus + // YAML 1.2 block-scalar indicators `|` (literal) and `>` (folded) + // for multi-line reasons. Avoids a yaml dep for a gate that has to + // be self-contained. + const entries: AllowlistEntry[] = [] + let current: Partial<AllowlistEntry> | null = null + // When set, subsequent more-indented lines fold into this key as a + // block scalar (literal '|' keeps newlines, folded '>' joins with + // spaces).
+ let blockKey: string | null = null + let blockKind: '|' | '>' | null = null + let blockIndent = 0 + let blockLines: string[] = [] + const flushBlock = () => { + if (current && blockKey) { + const value = + blockKind === '>' + ? blockLines.join(' ').replace(/\s+/g, ' ').trim() + : blockLines.join('\n').replace(/\n+$/, '') + ;(current as any)[blockKey] = value + } + blockKey = null + blockKind = null + blockLines = [] + } + const indentOf = (line: string): number => { + let i = 0 + while (i < line.length && line[i] === ' ') { + i += 1 + } + return i + } + const lines = text.split('\n') + for (let i = 0; i < lines.length; i++) { + const raw = lines[i]! + const line = raw.replace(/\r$/, '') + // Block-scalar accumulation takes precedence over normal parsing. + if (blockKey !== null) { + if (line.trim() === '') { + // Preserve blank lines inside a literal block; folded blocks + // turn them into paragraph breaks (kept as separate joins). + blockLines.push('') + continue + } + const indent = indentOf(line) + if (indent >= blockIndent) { + blockLines.push(line.slice(blockIndent)) + continue + } + flushBlock() + // Fall through and re-process the dedented line as normal. + } + if (!line.trim() || line.trim().startsWith('#')) { + continue + } + const tryAssign = (key: string, value: string) => { + const trimmed = value.trim() + if (current === null) { + return + } + if (trimmed === '|' || trimmed === '>') { + blockKey = key + blockKind = trimmed as '|' | '>' + blockIndent = indentOf(lines[i + 1] ?? '') || indentOf(line) + 2 + blockLines = [] + return + } + ;(current as any)[key] = + key === 'line' ? Number(unquote(trimmed)) : unquote(trimmed) + } + if (line.startsWith('- ')) { + if (current && current.reason) { + entries.push(current as AllowlistEntry) + } + current = {} + const rest = line.slice(2).trim() + if (rest) { + const m = rest.match(/^([\w-]+):\s*(.*)$/) + if (m) { + tryAssign(m[1]!, m[2]!) 
+ } + } + } else if (current) { + const m = line.match(/^\s+([\w-]+):\s*(.*)$/) + if (m) { + tryAssign(m[1]!, m[2]!) + } + } + } + if (blockKey !== null) { + flushBlock() + } + if (current && current.reason) { + entries.push(current as AllowlistEntry) + } + return entries +} + +const unquote = (s: string): string => { + const t = s.trim() + if ( + (t.startsWith('"') && t.endsWith('"')) || + (t.startsWith("'") && t.endsWith("'")) + ) { + return t.slice(1, -1) + } + return t +} + +const ALLOWLIST = loadAllowlist() + +/** + * Stable, normalized snippet hash. Whitespace-insensitive so trivial + * reformatting (indent change, trailing comma, line wrap) doesn't + * invalidate an allowlist entry, but content-changing edits do. The + * hash exposes only the first 12 hex chars (~48 bits) which is plenty + * for collision-resistance within a single repo's finding set and + * keeps the YAML readable. + */ +const snippetHash = (snippet: string): string => { + const normalized = snippet.replace(/\s+/g, ' ').trim() + return createHash('sha256').update(normalized).digest('hex').slice(0, 12) +} + +/** + * Allowlist matching trades off two failure modes: + * + * - Drift via reformatting (a line shift breaks an entry, the + * finding re-surfaces, devs paper over with a new entry). + * - Stealth allowlisting (an entry pinned to "anywhere in this file" + * silently exempts unrelated future violations). + * + * Strategy: exact line match OR `snippet_hash` match (whitespace- + * normalized SHA-256, first 12 hex). Either is sufficient. Lines stay + * exact (was ±2; the slack let reformatting silently slide), and + * `snippet_hash` provides reformatting-tolerant matching that's still + * tied to the literal text — paste-and-edit cheating would change the + * hash. If neither `line` nor `snippet_hash` is provided, the entry + * matches purely by `rule` + `file` + `pattern` (file-level exempt; + * use sparingly and always pair with a precise `pattern`). 
+ */ +const isAllowlisted = (finding: Finding): boolean => + ALLOWLIST.some(entry => { + if (entry.rule && entry.rule !== finding.rule) { + return false + } + if (entry.file && !finding.file.includes(entry.file)) { + return false + } + if (entry.pattern && !finding.snippet.includes(entry.pattern)) { + return false + } + const lineProvided = entry.line !== undefined + const hashProvided = + typeof entry.snippet_hash === 'string' && entry.snippet_hash.length > 0 + if (lineProvided || hashProvided) { + const lineMatches = lineProvided && entry.line === finding.line + const hashMatches = + hashProvided && entry.snippet_hash === snippetHash(finding.snippet) + if (!(lineMatches || hashMatches)) { + return false + } + } + return true + }) + +// ────────────────────────────────────────────────────────────────── +// File walking +// ────────────────────────────────────────────────────────────────── + +const SKIP_DIRS = new Set([ + '.git', + 'node_modules', + 'build', + 'dist', + 'out', + 'target', + '.cache', + 'upstream', +]) + +const walk = function* ( + dir: string, + filter: (relPath: string) => boolean, +): Generator { + let entries + try { + entries = readdirSync(dir, { withFileTypes: true }) + } catch { + return + } + for (const e of entries) { + if (SKIP_DIRS.has(e.name)) { + continue + } + const full = path.join(dir, e.name) + const rel = path.relative(REPO_ROOT, full) + if (e.isDirectory()) { + yield* walk(full, filter) + } else if (e.isFile() && filter(rel)) { + yield rel + } + } +} + +// ────────────────────────────────────────────────────────────────── +// Rule A + B: code scan (.mts / .cts) +// ────────────────────────────────────────────────────────────────── + +// Locate `path.join(` or `path.resolve(` call sites; argument-list +// extraction uses a paren-balancing scanner below to handle arbitrary +// nesting depth (the previous regex-only approach silently missed any +// argument containing 2+ levels of nested function calls). 
+const PATH_CALL_RE = /\bpath\.(?:join|resolve)\s*\(/g +const STRING_LITERAL_RE = /(['"])((?:\\.|(?!\1)[^\\])*)\1/g + +// Template literal scanner. Captures backtick-delimited strings +// (including those with `${...}` placeholders) so Rule A also catches +// path construction via template literals like +// `${buildDir}/out/Final/${binary}` or `build/${mode}/out/Final`. +const TEMPLATE_LITERAL_RE = + /`((?:\\.|(?:\$\{(?:[^{}]|\{[^{}]*\})*\})|(?!`)[^\\])*)`/g + +/** + * Convert a template-literal body into a synthetic forward-slash path + * by replacing `${...}` placeholders with a sentinel and normalizing + * separators. Returns the sequence of path segments split on `/`. The + * sentinel doesn't match any STAGE/BUILD_ROOT/MODE token, so a + * placeholder-only segment (`${binaryName}`) won't match those sets. + */ +const templateLiteralSegments = (body: string): string[] => { + // Strip placeholders so they don't introduce noise in segments. + // Empty result for a placeholder is fine; downstream filters by set + // membership and skips empties. + const stripped = body.replace(/\$\{(?:[^{}]|\{[^{}]*\})*\}/g, '\x00') + return stripped.split('/').filter(seg => seg.length > 0 && seg !== '\x00') +} + +/** + * Extract every `path.join(...)` and `path.resolve(...)` call from the + * source text, returning each call's literal start offset and argument + * substring. Uses paren-balancing so deeply-nested arguments like + * `path.join(getDir(child(x)), 'build', 'Final')` are captured fully. 
+ */ +const extractPathCalls = ( + source: string, +): Array<{ offset: number; args: string }> => { + const calls: Array<{ offset: number; args: string }> = [] + PATH_CALL_RE.lastIndex = 0 + let match: RegExpExecArray | null + while ((match = PATH_CALL_RE.exec(source)) !== null) { + const callStart = match.index + const argsStart = PATH_CALL_RE.lastIndex + let depth = 1 + let i = argsStart + let inString: '"' | "'" | '`' | null = null + while (i < source.length && depth > 0) { + const ch = source[i]! + if (inString) { + if (ch === '\\') { + i += 2 + continue + } + if (ch === inString) { + inString = null + } + } else { + if (ch === '"' || ch === "'" || ch === '`') { + inString = ch + } else if (ch === '(') { + depth += 1 + } else if (ch === ')') { + depth -= 1 + if (depth === 0) { + break + } + } + } + i += 1 + } + if (depth === 0) { + calls.push({ offset: callStart, args: source.slice(argsStart, i) }) + PATH_CALL_RE.lastIndex = i + 1 + } + } + return calls +} + +const extractStringLiterals = (args: string): string[] => { + const literals: string[] = [] + let match: RegExpExecArray | null + STRING_LITERAL_RE.lastIndex = 0 + while ((match = STRING_LITERAL_RE.exec(args)) !== null) { + if (match[2] !== undefined) { + literals.push(match[2]) + } + } + return literals +} + +const scanCodeFile = (relPath: string): void => { + const full = path.join(REPO_ROOT, relPath) + let content: string + try { + content = readFileSync(full, 'utf8') + } catch { + return + } + const lines = content.split('\n') + // Build a line-offset map so we can map regex offsets back to line + // numbers cheaply. + const lineOffsets: number[] = [0] + for (let i = 0; i < content.length; i++) { + if (content[i] === '\n') { + lineOffsets.push(i + 1) + } + } + const offsetToLine = (offset: number): number => { + let lo = 0 + let hi = lineOffsets.length - 1 + while (lo < hi) { + const mid = (lo + hi + 1) >>> 1 + if (lineOffsets[mid]! 
<= offset) { + lo = mid + } else { + hi = mid - 1 + } + } + return lo + 1 + } + + for (const call of extractPathCalls(content)) { + const literals = extractStringLiterals(call.args) + const stages = literals.filter(l => STAGE_SEGMENTS.has(l)) + const buildRoots = literals.filter(l => BUILD_ROOT_SEGMENTS.has(l)) + const modes = literals.filter(l => MODE_SEGMENTS.has(l)) + + // Rule A: 2+ stages OR (1 stage + 1 build-root + 1 mode). + const triggersA = + stages.length >= 2 || + (stages.length >= 1 && buildRoots.length >= 1 && modes.length >= 1) + if (triggersA) { + const line = offsetToLine(call.offset) + const snippet = (lines[line - 1] ?? '').trim() + findings.push({ + rule: 'A', + file: relPath, + line, + snippet, + message: 'Multi-stage path constructed inline (outside paths.mts).', + fix: 'Construct in the owning paths.mts (or use getFinalBinaryPath / getDownloadedDir from build-infra/lib/paths). Import the computed value here.', + }) + } + + // Rule B: each '..' opens a window; the window stays open only + // until the next non-'..' literal. A sibling-package literal + // *immediately after* a '..' (no path segment between them) + // triggers, AND there must be build context elsewhere in the + // call. Resetting per-segment prevents false positives where '..' + // appears earlier and sibling-name appears much later in an + // unrelated position. + const hasBuildContext = literals.some( + l => BUILD_ROOT_SEGMENTS.has(l) || STAGE_SEGMENTS.has(l), + ) + if (hasBuildContext) { + for (let i = 0; i < literals.length - 1; i++) { + if ( + literals[i] === '..' && + KNOWN_SIBLING_PACKAGES.has(literals[i + 1]!) + ) { + const sibling = literals[i + 1]! + const line = offsetToLine(call.offset) + const snippet = (lines[line - 1] ?? '').trim() + findings.push({ + rule: 'B', + file: relPath, + line, + snippet, + message: `Cross-package traversal into '${sibling}' build output.`, + fix: `Add '${sibling}: workspace:*' as a dep, declare an exports entry on '${sibling}' (e.g. 
'./scripts/paths' → './scripts/paths.mts'), and import the path from there.`, + }) + break + } + } + } + } + + // Rule A (template literal variant). Backtick strings like + // `${buildDir}/out/Final/${binary}` or `build/${mode}/${arch}/out/Final` + // construct paths the same way `path.join(...)` does — flag the + // same shapes. Skip raw imports / template tag positions by + // filtering out leading `import.meta.url`-style / tag positions + // implicitly: TEMPLATE_LITERAL_RE matches any backtick string and + // we rely on segment composition to decide if it's a path. + TEMPLATE_LITERAL_RE.lastIndex = 0 + let tmpl: RegExpExecArray | null + while ((tmpl = TEMPLATE_LITERAL_RE.exec(content)) !== null) { + const body = tmpl[1] ?? '' + if (!body.includes('/')) { + continue + } + const segments = templateLiteralSegments(body) + const stages = segments.filter(s => STAGE_SEGMENTS.has(s)) + const buildRoots = segments.filter(s => BUILD_ROOT_SEGMENTS.has(s)) + const modes = segments.filter(s => MODE_SEGMENTS.has(s)) + // Template literal trigger is tighter than path.join() because + // backtick strings often appear in patch fixtures, error messages, + // and other multi-line content that incidentally contains stage + // tokens like `wasm`. Require the canonical build-output shape: + // - 'build' + 'out' + stage (canonical multi-stage layout), OR + // - 2+ stage segments AND 'out' (e.g. `wasm/out/Final`), OR + // - 'build' + stage + literal mode (back-compat with path.join). + const hasBuildAndOut = + buildRoots.includes('build') && buildRoots.includes('out') + const hasOut = buildRoots.includes('out') + const hasBuild = buildRoots.includes('build') + const triggersA = + (hasBuildAndOut && stages.length >= 1) || + (stages.length >= 2 && hasOut) || + (hasBuild && stages.length >= 1 && modes.length >= 1) + if (triggersA) { + const line = offsetToLine(tmpl.index) + const snippet = (lines[line - 1] ?? 
'').trim() + findings.push({ + rule: 'A', + file: relPath, + line, + snippet, + message: + 'Multi-stage path constructed inline via template literal (outside paths.mts).', + fix: 'Construct in the owning paths.mts (or use getFinalBinaryPath / getDownloadedDir from build-infra/lib/paths). Import the computed value here.', + }) + } + } +} + +// ────────────────────────────────────────────────────────────────── +// Rule C + D: workflow YAML scan +// ────────────────────────────────────────────────────────────────── + +const WORKFLOW_PATH_RE = + /build\/\$\{[^}]+\}\/[^"'`\s]*\/out\/(?:Final|Release|Stripped|Compressed|Optimized|Synced)/g +const WORKFLOW_GH_EXPR_PATH_RE = + /build\/\$\{\{\s*[^}]+\}\}\/[^"'`\s]*\/out\/(?:Final|Release|Stripped|Compressed|Optimized|Synced)/g + +const isInsideComputePathsBlock = ( + lines: string[], + lineIdx: number, +): boolean => { + // Walk backwards up to 60 lines looking for the start of the + // current step. If that step is a "Compute paths" step, the line + // is exempt. + for (let i = lineIdx; i >= Math.max(0, lineIdx - 60); i--) { + const l = lines[i] ?? '' + if (/^\s*-\s*name:/i.test(l)) { + // Step boundary — check if THIS step is a Compute paths step. + // The step body may include `id: paths` even if the name is + // something else (e.g. `id: stub-paths`), so look at the next + // ~20 lines for either marker. + for (let j = i; j < Math.min(lines.length, i + 20); j++) { + const m = lines[j] ?? '' + if ( + /^\s*-\s*name:\s*Compute\s+[\w-]+\s+paths/i.test(m) || + /^\s*id:\s*[\w-]*paths\s*$/i.test(m) + ) { + return true + } + if (j > i && /^\s*-\s*name:/i.test(m)) { + // Hit the next step — current step is NOT Compute paths. 
+ return false + } + } + return false + } + } + return false +} + +const scanWorkflowFile = (relPath: string): void => { + const full = path.join(REPO_ROOT, relPath) + let content: string + try { + content = readFileSync(full, 'utf8') + } catch { + return + } + const lines = content.split('\n') + + // First pass: collect every hand-built path occurrence outside a + // "Compute paths" step. Per the mantra, a single reference is fine + // — what's banned is reconstructing the same path 2+ times. + type PathHit = { + line: number + snippet: string + pathStr: string + } + const occurrences = new Map() + + for (let i = 0; i < lines.length; i++) { + const line = lines[i]! + if (/^\s*#/.test(line)) { + // Skip comment lines from C scan; they're under D below. + continue + } + if (isInsideComputePathsBlock(lines, i)) { + // Inside the canonical construction step — exempt. + continue + } + WORKFLOW_PATH_RE.lastIndex = 0 + WORKFLOW_GH_EXPR_PATH_RE.lastIndex = 0 + const matches: string[] = [] + let m: RegExpExecArray | null + while ((m = WORKFLOW_PATH_RE.exec(line)) !== null) { + matches.push(m[0]) + } + while ((m = WORKFLOW_GH_EXPR_PATH_RE.exec(line)) !== null) { + matches.push(m[0]) + } + for (const pathStr of matches) { + const list = occurrences.get(pathStr) ?? [] + list.push({ line: i + 1, snippet: line.trim(), pathStr }) + occurrences.set(pathStr, list) + } + } + + // Flag every occurrence of a shape that appears 2+ times. + for (const [pathStr, hits] of occurrences) { + if (hits.length < 2) { + continue + } + for (const hit of hits) { + findings.push({ + rule: 'C', + file: relPath, + line: hit.line, + snippet: hit.snippet, + message: `Workflow constructs the same path ${hits.length} times: ${pathStr}`, + fix: 'Add a "Compute paths" step (id: paths) early in the job that computes this path ONCE and exposes it via $GITHUB_OUTPUT. Reference as ${{ steps.paths.outputs.<name> }} in subsequent steps.
References of the constructed value are unlimited; reconstructing is the violation.', + }) + } + } + + // Rule D: comments encoding a fully-qualified multi-stage path + // (separate scan since it has different semantics). + for (let i = 0; i < lines.length; i++) { + const line = lines[i]! + if (!/^\s*#/.test(line)) { + continue + } + const literalShape = + /build\/(?:dev|prod|shared)\/[a-z0-9-]+\/(?:wasm\/)?out\/(?:Final|Release|Stripped|Compressed|Optimized|Synced)/i + if (literalShape.test(line)) { + findings.push({ + rule: 'D', + file: relPath, + line: i + 1, + snippet: line.trim(), + message: 'Comment encodes a fully-qualified path string.', + fix: 'Cite the canonical paths.mts (e.g. "see packages/<pkg>/scripts/paths.mts:getBuildPaths()") instead of duplicating the path string. Comments may describe structure with placeholders ("<mode>/<arch>") but should not be a parsable path.', + }) + } + } +} + +// ────────────────────────────────────────────────────────────────── +// Rule G: Makefile / Dockerfile / shell scan +// ────────────────────────────────────────────────────────────────── + +const SCRIPT_HAND_BUILT_RE = + /build\/\$?\{?(?:BUILD_MODE|MODE|prod|dev)\}?\/[\w${}.-]*\/out\/(?:Final|Release|Stripped|Compressed|Optimized|Synced)/g + +const scanScriptFile = (relPath: string): void => { + const full = path.join(REPO_ROOT, relPath) + let content: string + try { + content = readFileSync(full, 'utf8') + } catch { + return + } + const lines = content.split('\n') + const isDockerfile = + /Dockerfile/i.test(relPath) || /\.glibc$|\.musl$/.test(relPath) + + // First pass: collect every multi-stage path occurrence in this file, + // scoped per Dockerfile stage (each `FROM ... AS ...` starts a new + // scope where ENV/ARG don't propagate). + type Hit = { line: number; text: string; pathStr: string; stage: number } + const hits: Hit[] = [] + let stage = 0 + for (let i = 0; i < lines.length; i++) { + const line = lines[i]!
+ if (/^\s*#/.test(line)) { + // Skip comments — documentation, not construction. + continue + } + if (isDockerfile && /^FROM\s+/i.test(line)) { + stage += 1 + continue + } + SCRIPT_HAND_BUILT_RE.lastIndex = 0 + let m: RegExpExecArray | null + while ((m = SCRIPT_HAND_BUILT_RE.exec(line)) !== null) { + hits.push({ + line: i + 1, + text: line.trim(), + pathStr: m[0], + stage, + }) + } + } + + // Group by (stage, pathStr) — only flag when a path is built 2+ + // times within the SAME Dockerfile stage (or anywhere in non- + // Dockerfile scripts, where stages don't apply). + const grouped = new Map() + for (const h of hits) { + const key = `${h.stage}::${h.pathStr}` + const list = grouped.get(key) ?? [] + list.push(h) + grouped.set(key, list) + } + for (const [, list] of grouped) { + if (list.length < 2) { + continue + } + for (const hit of list) { + findings.push({ + rule: 'G', + file: relPath, + line: hit.line, + snippet: hit.text, + message: `Hand-built multi-stage path constructed ${list.length} times in this file: ${hit.pathStr}`, + fix: 'Assign to a variable / ENV once near the top of the script / Dockerfile stage, with a comment naming the canonical paths.mts. Reference the variable everywhere downstream. References of a single construction are unlimited; reconstructing the same path is the violation.', + }) + } + } +} + +// ────────────────────────────────────────────────────────────────── +// Rule F: cross-file path repetition +// ────────────────────────────────────────────────────────────────── + +const checkRuleF = (): void => { + // A path is "constructed" each time we see a new path.join with a + // matching shape. Group findings of Rule A by their snippet shape; + // when the same shape appears in 2+ files, demote them to Rule F so + // the message is more accurate. 
+ const byShape = new Map() + for (const f of findings) { + if (f.rule !== 'A') { + continue + } + // Normalize: strip whitespace, identifiers, surrounding context; + // keep just the literal path-segment shape. + const literalsRe = /'[^']*'|"[^"]*"/g + const literals = (f.snippet.match(literalsRe) ?? []).join(',') + if (!literals) { + continue + } + const list = byShape.get(literals) ?? [] + list.push(f) + byShape.set(literals, list) + } + for (const [shape, list] of byShape) { + if (list.length < 2) { + continue + } + // Promote each Rule-A finding in this group to Rule F so the + // message tells the reader the issue is cross-file repetition, + // not just a single hand-build. + for (const f of list) { + f.rule = 'F' + f.message = `Same path shape constructed in ${list.length} places: ${shape.slice(0, 100)}` + f.fix = + 'Construct this path ONCE in a paths.mts (or build-infra helper) and import the computed value. References of the computed variable are unlimited; re-constructing the same shape twice is the violation.' + } + } +} + +// ────────────────────────────────────────────────────────────────── +// Main +// ────────────────────────────────────────────────────────────────── + +const main = (): number => { + // Scan code files (Rule A + B). + for (const rel of walk( + REPO_ROOT, + p => p.endsWith('.mts') || p.endsWith('.cts'), + )) { + if (isExempt(rel)) { + continue + } + scanCodeFile(rel) + } + // Scan workflows (Rule C + D). + const workflowDir = path.join(REPO_ROOT, '.github', 'workflows') + if (existsSync(workflowDir)) { + for (const rel of walk(workflowDir, p => p.endsWith('.yml'))) { + if (isExempt(rel)) { + continue + } + scanWorkflowFile(rel) + } + } + // Scan scripts/Makefiles/Dockerfiles (Rule G). 
+ for (const rel of walk(REPO_ROOT, p => { + const base = path.basename(p) + return ( + base === 'Makefile' || + base.endsWith('.mk') || + base.endsWith('.Dockerfile') || + base === 'Dockerfile' || + base.endsWith('.glibc') || + base.endsWith('.musl') || + (base.endsWith('.sh') && !p.includes('test/')) + ) + })) { + if (isExempt(rel)) { + continue + } + scanScriptFile(rel) + } + // Promote cross-file Rule-A repeats to Rule F. + checkRuleF() + + // Filter against allowlist. + const blocking = findings.filter(f => !isAllowlisted(f)) + + if (args.values.json) { + process.stdout.write( + JSON.stringify( + { findings: blocking, allowlisted: findings.length - blocking.length }, + null, + 2, + ) + '\n', + ) + return blocking.length === 0 ? 0 : 1 + } + + if (blocking.length === 0) { + if (!args.values.quiet) { + logger.success('Path-hygiene check passed (1 path, 1 reference)') + if (findings.length > 0) { + logger.substep(`${findings.length} finding(s) allowlisted`) + } + } + return 0 + } + + logger.error(`Path-hygiene check FAILED — ${blocking.length} finding(s)`) + logger.log('') + logger.log('Mantra: 1 path, 1 reference') + logger.log('') + for (const f of blocking) { + logger.log(` [${f.rule}] ${f.file}:${f.line}`) + logger.log(` ${f.snippet}`) + logger.log(` → ${f.message}`) + if (args.values['show-hashes']) { + logger.log(` snippet_hash: ${snippetHash(f.snippet)}`) + } + if (args.values.explain) { + logger.log(` Fix: ${f.fix}`) + } + logger.log('') + } + if (!args.values.explain) { + logger.log('Run with --explain to see fix suggestions per finding.') + logger.log( + 'Add intentional exceptions to .github/paths-allowlist.yml with a `reason` field.', + ) + logger.log( + 'Run with --show-hashes to print the snippet_hash for each finding (drift-resistant allowlisting).', + ) + } + return 1 +} + +try { + process.exitCode = main() +} catch (e) { + logger.error(`Path-hygiene gate crashed: ${e}`) + process.exitCode = 2 +} diff --git 
a/.claude/skills/path-guard/reference/claude-md-rule.md b/.claude/skills/path-guard/reference/claude-md-rule.md new file mode 100644 index 000000000..3e32b1bae --- /dev/null +++ b/.claude/skills/path-guard/reference/claude-md-rule.md @@ -0,0 +1,29 @@ + + +## 1 path, 1 reference + +**A path is *constructed* exactly once. Everywhere else *references* the constructed value.** + +Referencing a single computed path many times is fine — that's the whole point of computing it once. What's banned is *re-constructing* the same path in multiple places, because that's where drift is born. + +Three concrete shapes: + +1. **Within a package** — every script, test, and lib file that needs a build path imports it from the package's `scripts/paths.mts` (or `lib/paths.mts`). No `path.join('build', mode, ...)` outside that module. + +2. **Across packages** — when package B consumes package A's output, B imports A's `paths.mts` via the workspace `exports` field. Never `path.join(PKG, '..', '<sibling>', 'build', ...)`. The R28 yoga/ink bug — ink hand-building yoga's wasm path and missing the `wasm/` segment — is the canonical failure mode this rule prevents. + +3. **Workflows, Dockerfiles, shell scripts** — they can't `import` TS, so they construct the string once and reference it everywhere downstream. Workflows: a "Compute paths" step exposes `steps.paths.outputs.final_dir`; later steps read `${{ steps.paths.outputs.final_dir }}`. Dockerfiles/shell: assign once to a variable / `ENV`, reference by name thereafter. Each canonical construction carries a comment naming the source-of-truth `paths.mts`. **Re-building** the same path in a second step is the violation, not referring to the constructed value many times. + +Comments may describe path *structure* with placeholders ("`<mode>/<arch>`" or "`${BUILD_MODE}/${PLATFORM_ARCH}`") but should not encode a complete literal path string.
Code execution takes priority over docs: violations in `.mts`/`.cts`, Makefiles, Dockerfiles, workflow YAML, and shell scripts are blocking. README and doc-comment violations are advisory unless they contain a fully-qualified path with no parametric placeholders. + +### Three-level enforcement + +- **Hook** — `.claude/hooks/path-guard/` blocks `Edit`/`Write` calls that would introduce a violation in a `.mts`/`.cts` file. Refusal at edit time stops new duplication from landing. +- **Gate** — `scripts/check-paths.mts` runs in `pnpm check`. Fails the build on any violation that isn't allowlisted in `.github/paths-allowlist.yml`. +- **Skill** — `/path-guard` audits the repo and fixes findings; `/path-guard check` reports only; `/path-guard install` drops the gate + hook + rule into a fresh repo. + +The mantra is intentionally short so it sticks: **1 path, 1 reference**. When in doubt, find the canonical owner and import from it. diff --git a/.claude/skills/path-guard/reference/paths-allowlist.yml.tmpl b/.claude/skills/path-guard/reference/paths-allowlist.yml.tmpl new file mode 100644 index 000000000..e2746660c --- /dev/null +++ b/.claude/skills/path-guard/reference/paths-allowlist.yml.tmpl @@ -0,0 +1,28 @@ +# Path-hygiene gate allowlist. +# Mantra: 1 path, 1 reference. +# +# Each entry exempts a specific finding from `scripts/check-paths.mts`. +# Entries MUST carry a `reason` so the list stays audit-able and +# entries can be removed when the underlying code changes. +# +# Schema (all top-level keys optional except `reason`): +# +# - rule: Rule letter (A, B, C, D, F, G). Omit to match any rule. +# file: Substring match against the relative file path. +# pattern: Substring match against the offending snippet. +# line: Line number; matches if within ±2 of the finding. +# reason: Why this site is genuinely exempt. Required. +# +# Prefer narrow entries (rule + file + line + pattern) over blanket +# `file:` entries that exempt the whole file. 
Genuine exemptions are +# rare — most "false positives" should be reported as gate bugs. +# +# Example: +# +# - rule: A +# file: packages/foo/scripts/legacy-build.mts +# line: 42 +# pattern: "path.join(testDir, 'out', 'Final')" +# reason: | +# legacy-build.mts is scheduled for removal in v2.0; refactoring +# its path construction now would conflict with the rewrite. diff --git a/.claude/skills/programmatic-claude-lockdown/SKILL.md b/.claude/skills/programmatic-claude-lockdown/SKILL.md new file mode 100644 index 000000000..f2561013a --- /dev/null +++ b/.claude/skills/programmatic-claude-lockdown/SKILL.md @@ -0,0 +1,84 @@ +--- +name: programmatic-claude-lockdown +description: Reference for locking down programmatic Claude invocations (the `claude` CLI in workflows/scripts, the `@anthropic-ai/claude-agent-sdk` `query()` in code). Loads on demand when writing or reviewing any callsite that runs Claude programmatically. Source: https://code.claude.com/docs/en/agent-sdk/permissions. +user-invocable: false +allowed-tools: Read, Grep, Glob +--- + +# Programmatic Claude lockdown + +**Rule:** every programmatic Claude callsite sets four flags. Skip any one and a future edit silently widens the surface. + +## The four flags + +| Layer | SDK option | CLI flag | What it does | +|---|---|---|---| +| Definition | `tools` | `--tools` | Base set the model is told about. Tools not listed are invisible — no `tool_use` block possible. | +| Auto-approve | `allowedTools` | `--allowedTools` | Step 4. Listed tools run without invoking `canUseTool`. | +| Deny | `disallowedTools` | `--disallowedTools` | Step 2. Wins even against `bypassPermissions`. Defense-in-depth. | +| Mode | `permissionMode: 'dontAsk'` | `--permission-mode dontAsk` | Step 3. Unmatched tools denied without falling through to a missing `canUseTool`. | + +The official permission flow (1) hooks → (2) deny rules → (3) permission mode → (4) allow rules → (5) `canUseTool`. In `dontAsk` mode step 5 is skipped — denied. 
The doc states verbatim: *"`allowedTools` and `disallowedTools` ... control whether a tool call is approved, not whether the tool is available."* Availability is `tools`. + +## Recipe — read-only agent (audit, classify, summarize) + +```ts +import { query } from '@anthropic-ai/claude-agent-sdk' + +query({ + prompt: '...', + options: { + tools: ['Read', 'Grep', 'Glob'], + allowedTools: ['Read', 'Grep', 'Glob'], + disallowedTools: ['Agent', 'Bash', 'Edit', 'NotebookEdit', 'Task', 'WebFetch', 'WebSearch', 'Write'], + permissionMode: 'dontAsk', + }, +}) +``` + +CLI form for workflow YAML / shell scripts: + +```yaml +claude --print \ + --tools "Read" "Grep" "Glob" \ + --allowedTools "Read" "Grep" "Glob" \ + --disallowedTools "Agent" "Bash" "Edit" "NotebookEdit" "Task" "WebFetch" "WebSearch" "Write" \ + --permission-mode dontAsk \ + --model "$MODEL" \ + --max-turns 25 \ + "<prompt>" +``` + +## Recipe — agent that needs Bash (e.g. `/updating`: pnpm + git + jq) + +Narrow `Bash(...)` patterns surgically. Block dangerous Bash patterns explicitly. Fleet rules: no `npx`/`pnpm dlx`/`yarn dlx`; no `curl`/`wget` exfil; no destructive `rm -rf`; no `sudo`.
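Where the options object is assembled dynamically, a small startup guard can enforce the four-flag rule before anything reaches `query()`. A minimal sketch — the `assertLockdown` helper and its option shape are illustrative, not part of the SDK:

```typescript
// Hypothetical guard: refuse to run unless all four lockdown flags are
// wired. Turns a silently-widened tool surface into a hard startup error.
type LockdownOptions = {
  tools?: readonly string[]
  allowedTools?: readonly string[]
  disallowedTools?: readonly string[]
  permissionMode?: string
}

function assertLockdown(options: LockdownOptions): void {
  if (!options.tools?.length) {
    throw new Error('tools must be set — omitting it grants the full preset')
  }
  if (!options.allowedTools?.length) {
    throw new Error('allowedTools must be set')
  }
  if (!options.disallowedTools?.length) {
    throw new Error('disallowedTools must be set — deny rules are the defense-in-depth layer')
  }
  if (options.permissionMode !== 'dontAsk') {
    throw new Error("permissionMode must be 'dontAsk' in headless contexts")
  }
}
```

Calling it on the literal options object right before `query()` complements the source-text test guards described below: the test catches drift at build time, the guard catches it at runtime.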
Build the deny list as shell vars so the npx/dlx denials can carry the `# zizmor:` exemption marker (untagged, the pre-commit `scanNpxDlx` hook would flag those literal deny strings as prohibited-tool usage rather than recognize them as denials): + +```yaml +DISALLOW_BASE='Agent Task NotebookEdit WebFetch WebSearch Bash(curl:*) Bash(wget:*) Bash(rm -rf*) Bash(sudo:*)' +DISALLOW_PKG_EXEC='Bash(npx:*) Bash(pnpm dlx:*) Bash(yarn dlx:*)' # zizmor: documentation-prohibition +claude --print \ + --tools "Bash" "Read" "Write" "Edit" "Glob" "Grep" \ + --allowedTools "Bash(pnpm:*)" "Bash(git:*)" "Bash(jq:*)" "Read" "Write" "Edit" "Glob" "Grep" \ + --disallowedTools $DISALLOW_BASE $DISALLOW_PKG_EXEC \ + --permission-mode dontAsk \ + --model "$MODEL" --max-turns 25 \ + "<prompt>" +``` + +## Never + +- ❌ `permissionMode: 'default'` in headless contexts — falls through to a missing `canUseTool`. Behavior undefined. +- ❌ `permissionMode: 'bypassPermissions'` / `allowDangerouslySkipPermissions: true`. +- ❌ Omitting `tools` — SDK default is the full claude_code preset. +- ❌ `Agent` / `Task` permitted — sub-agents inherit modes and can escape per-subagent restrictions when the parent is `bypassPermissions`/`acceptEdits`/`auto`. + +## Reference implementation + +`socket-lib/tools/prim/src/disambiguate.mts` — canonical SDK-form callsite. The file header documents each flag against the eval-flow step it enforces. + +`socket-lib/tools/prim/test/disambiguate.test.mts` — source-text guards that fail the build if `BASE_TOOLS` widens, if `tools: BASE_TOOLS` is unwired, if `permissionMode` drifts from `'dontAsk'`, or if `bypassPermissions` / `allowDangerouslySkipPermissions: true` ever appears. Mirror this pattern in any new callsite. + +## Existing fleet callsites + +- `socket-registry/.github/workflows/weekly-update.yml` — two `claude --print` invocations (run `/updating` skill, fix test failures). Bash recipe above.
+- `socket-lib/tools/prim/src/disambiguate.mts` — read-only recipe above (`query()` SDK form). diff --git a/.claude/skills/promise-race-pitfall/SKILL.md b/.claude/skills/promise-race-pitfall/SKILL.md new file mode 100644 index 000000000..d38f3c2a8 --- /dev/null +++ b/.claude/skills/promise-race-pitfall/SKILL.md @@ -0,0 +1,57 @@ +--- +name: promise-race-pitfall +description: Reference for the `Promise.race` cross-iteration handler-leak bug. Loads on demand when writing or reviewing concurrency code that uses `Promise.race`, `Promise.any`, or hand-rolled concurrency limiters. +--- + +# Promise.race in loops — the handler-leak pitfall + +**Never re-race the same pool of promises across loop iterations.** Each call to `Promise.race([A, B, …])` attaches fresh `.then` handlers to every arm. A promise that survives N iterations accumulates N handler sets. See [nodejs/node#17469](https://github.com/nodejs/node/issues/17469) and [`@watchable/unpromise`](https://github.com/watchable/unpromise). + +## Patterns + +- **Safe** — both arms created per call: + + ```ts + const value = await Promise.race([ + fetchSomething(), + new Promise((_, r) => setTimeout(() => r(new Error('timeout')), 5000)), + ]) + ``` + +- **Leaky** — `pool` survives across iterations, accumulating handlers: + + ```ts + while (queue.length) { + const winner = await Promise.race(pool) // ← N handlers per arm by iteration N + pool = pool.filter(p => p !== winner) + } + ``` + + Same hazard for `Promise.any` and any long-lived arm such as an interrupt signal. + +## The fix + +Use a single-waiter "slot available" signal. Each task's `.then` resolves a one-shot `promiseWithResolvers` that the loop awaits, then replaces. No persistent pool, nothing to stack. 
+ +```ts +let signal = Promise.withResolvers() +let running = 0 +function startTask(task: Task) { + task.run().then(() => { + const prev = signal + signal = Promise.withResolvers() + prev.resolve(task) + }) +} +while (queue.length) { + // launch up to N tasks + while (running < N && queue.length) { + startTask(queue.shift()!) + running += 1 + } + const finished = await signal.promise + running -= 1 +} +``` + +The arm being awaited is *always fresh*; nothing accumulates handlers. + +## Quick check + +Before merging concurrency code, ask: *does any arm of a `Promise.race`/`Promise.any` outlive the call?* If yes, refactor to the single-waiter signal. diff --git a/.claude/skills/security-scan/SKILL.md b/.claude/skills/security-scan/SKILL.md index 7f2fd77e8..10a3ac3f2 100644 --- a/.claude/skills/security-scan/SKILL.md +++ b/.claude/skills/security-scan/SKILL.md @@ -2,6 +2,7 @@ name: security-scan description: Runs a multi-tool security scan — AgentShield for Claude config, zizmor for GitHub Actions, and optionally Socket CLI for dependency scanning. Produces an A-F graded security report. Use after modifying `.claude/` config, hooks, agents, or GitHub Actions workflows, and before releases. user-invocable: true +allowed-tools: Task, Read, Bash(pnpm exec agentshield:*), Bash(zizmor:*), Bash(command -v:*), Bash(find .cache/external-tools/zizmor:*) --- # Security Scan diff --git a/.git-hooks/_helpers.mts b/.git-hooks/_helpers.mts new file mode 100755 index 000000000..1181e37eb --- /dev/null +++ b/.git-hooks/_helpers.mts @@ -0,0 +1,460 @@ +// Shared helpers for git hooks — API-key allowlist + ANSI colors + +// content scanners. Imported by .git-hooks/{commit-msg,pre-commit, +// pre-push}.mts. No third-party deps; uses only Node built-ins. +// +// Requires Node 25+ for stable .mts type-stripping (no flag needed). +// Earlier Node versions either lacked --experimental-strip-types or +// shipped it under a flag, both unacceptable for hook ergonomics.
+ +import { spawnSync } from 'node:child_process' +import { existsSync, readFileSync, statSync } from 'node:fs' + +// Hard-fail if Node is below 25. This runs at module load — every +// hook invocation imports _helpers.mts before doing anything, so the +// version check is the first thing that happens. +const NODE_MIN_MAJOR = 25 +const nodeMajor = Number.parseInt( + process.versions.node.split('.')[0] || '0', + 10, +) +if (nodeMajor < NODE_MIN_MAJOR) { + process.stderr.write( + `\x1b[0;31m✗ Hook requires Node >= ${NODE_MIN_MAJOR}.0.0 (have v${process.versions.node})\x1b[0m\n`, + ) + process.stderr.write( + 'Install Node 25+ — these hooks rely on stable .mts type stripping.\n', + ) + process.exit(1) +} + +// ── Allowlist constants ──────────────────────────────────────────── +// These exempt known-safe matches from the API-key scanner. Each +// allowlist entry is a substring; if the matched line contains it, +// the line is dropped from the findings. + +// Real public API key shipped in socket-lib test fixtures. Safe to +// appear anywhere in the fleet. +export const ALLOWED_PUBLIC_KEY = + 'sktsec_t_--RAN5U4ivauy4w37-6aoKyYPDt5ZbaT5JBVMqiwKo_api' + +// Substring marker used in test fixtures (see +// socket-lib/test/unit/utils/fake-tokens.ts). Lines containing this +// are treated as test fixtures. +export const FAKE_TOKEN_MARKER = 'socket-test-fake-token' + +// Legacy lib-scoped marker — accepted during the rename from +// `socket-lib-test-fake-token` to `socket-test-fake-token`. Drop when +// lib's rename PR lands. +export const FAKE_TOKEN_LEGACY = 'socket-lib-test-fake-token' + +// Name of the env var used in shell examples; not a token value. 
+export const SOCKET_SECURITY_ENV = 'SOCKET_SECURITY_API_KEY=' + +// ── ANSI colors ──────────────────────────────────────────────────── + +export const RED = '\x1b[0;31m' +export const GREEN = '\x1b[0;32m' +export const YELLOW = '\x1b[1;33m' +export const NC = '\x1b[0m' + +// ── Output helpers ───────────────────────────────────────────────── + +export const out = (msg: string): void => { + process.stdout.write(msg + '\n') +} + +export const err = (msg: string): void => { + process.stderr.write(msg + '\n') +} + +export const red = (msg: string): string => `${RED}${msg}${NC}` +export const green = (msg: string): string => `${GREEN}${msg}${NC}` +export const yellow = (msg: string): string => `${YELLOW}${msg}${NC}` + +// ── API-key allowlist filter ─────────────────────────────────────── + +// Drops any line that matches an allowlist entry. +export const filterAllowedApiKeys = (lines: readonly string[]): string[] => { + return lines.filter( + line => + !line.includes(ALLOWED_PUBLIC_KEY) && + !line.includes(FAKE_TOKEN_MARKER) && + !line.includes(FAKE_TOKEN_LEGACY) && + !line.includes(SOCKET_SECURITY_ENV) && + !line.includes('.example'), + ) +} + +// ── Personal-path scanner ────────────────────────────────────────── + +// Real personal paths to flag: /Users/foo/, /home/foo/, C:\Users\foo\. +const PERSONAL_PATH_RE = + /(\/Users\/[^/\s]+\/|\/home\/[^/\s]+\/|C:\\Users\\[^\\]+\\)/ + +// Placeholders we ALLOW (documentation, not real leaks): any path +// component wrapped in <...> or starting with $VAR / ${VAR}. +const PERSONAL_PATH_PLACEHOLDER_RE = + /(\/Users\/<[^>]*>\/|\/home\/<[^>]*>\/|C:\\Users\\<[^>]*>\\|\/Users\/\$\{?[A-Z_]+\}?\/|\/home\/\$\{?[A-Z_]+\}?\/)/ + +// Per-line opt-out marker for our pre-commit / pre-push scanners. +// +// Canonical form: # socket-hook: allow +// Targeted form: # socket-hook: allow +// +// The targeted form names a specific rule (`personal-path`, `npx`, +// `aws-key`, etc.) 
and is recommended for reviewers; the bare `allow` +// form blanket-suppresses every scanner on that line. eslint-style +// precedent. +// +// Legacy `# zizmor: ...` markers are still recognized for one cycle so +// existing files don't have to be rewritten in the same change that +// renames the marker. +const SOCKET_HOOK_MARKER_RE = /#\s*socket-hook:\s*allow(?:\s+([\w-]+))?/ +const LEGACY_ZIZMOR_MARKER_RE = /#\s*zizmor:\s*[\w-]+/ + +function lineIsSuppressed(line: string, rule?: string): boolean { + if (LEGACY_ZIZMOR_MARKER_RE.test(line)) { + return true + } + const m = line.match(SOCKET_HOOK_MARKER_RE) + if (!m) { + return false + } + // No rule named on the marker → blanket allow. + if (!m[1]) { + return true + } + // Marker named a specific rule → only suppress that rule. + return rule === undefined || m[1] === rule +} + +// Heuristic context flags: lines that look like "this is a doc example" +// rather than a real call leaked into runtime code. +// - Comment lines (start with `*`, `//`, `#`). +// - Lines that contain a JSDoc tag like @example / @param / @returns +// (multi-line JSDoc bodies use leading ` * ` which we already match). +// - Lines whose entire interesting content sits inside a backtick span +// (markdown / template-literal example). +const COMMENT_LINE_RE = /^\s*(\*|\/\/|#)/ +const JSDOC_TAG_RE = /@(example|param|returns?|see|link)\b/ + +function isInsideBackticks(line: string, needleRe: RegExp): boolean { + // Find every backtick-delimited span on the line and test if the + // pattern only appears within those spans. Conservative: if any + // hit is *outside* a span, treat the line as runtime code. 
+ const spans: Array<[number, number]> = [] + for (let i = 0; i < line.length; i++) { + if (line[i] === '`') { + const end = line.indexOf('`', i + 1) + if (end < 0) { + break + } + spans.push([i, end]) + i = end + } + } + if (spans.length === 0) { + return false + } + let m: RegExpExecArray | null + const re = new RegExp(needleRe.source, needleRe.flags.replace('g', '') + 'g') + while ((m = re.exec(line)) !== null) { + const start = m.index + const end = start + m[0].length + const inside = spans.some(([s, e]) => start > s && end <= e) + if (!inside) { + return false + } + } + return true +} + +function looksLikeDocumentation( + line: string, + needleRe: RegExp, + rule?: string, +): boolean { + if (lineIsSuppressed(line, rule)) { + return true + } + if (COMMENT_LINE_RE.test(line)) { + return true + } + if (JSDOC_TAG_RE.test(line)) { + return true + } + if (isInsideBackticks(line, needleRe)) { + return true + } + return false +} + +export type LineHit = { + lineNumber: number + line: string + // Suggested rewrite when this flagged line is documentation-style and + // the scanner can offer a concrete fix. Undefined for runtime-code + // paths where the right answer depends on the surrounding code. + suggested?: string +} + +// Build a suggested rewrite for a documentation-style personal path. +// Replaces the matched real-path username segment with the canonical +// `<username>` placeholder form (matching the platform convention of +// the surrounding path). +function suggestPlaceholder(line: string): string { + return line + .replace(/\/Users\/[^/\s]+\//g, '/Users/<username>/') + .replace(/\/home\/[^/\s]+\//g, '/home/<username>/') + .replace(/C:\\Users\\[^\\]+\\/g, 'C:\\Users\\<username>\\') +} + +// Returns lines that contain a real personal path (excludes lines that +// are pure placeholders or look like documentation examples). Each hit +// carries a `suggested` rewrite when the scanner can offer one — the +// caller surfaces it to the user as the fix recipe.
+export const scanPersonalPaths = (text: string): LineHit[] => { + const hits: LineHit[] = [] + const lines = text.split('\n') + for (let i = 0; i < lines.length; i++) { + const line = lines[i]! + if (!PERSONAL_PATH_RE.test(line)) { + continue + } + if (PERSONAL_PATH_PLACEHOLDER_RE.test(line)) { + const stripped = line.replace( + new RegExp(PERSONAL_PATH_PLACEHOLDER_RE, 'g'), + '', + ) + if (!PERSONAL_PATH_RE.test(stripped)) { + continue + } + } + if (looksLikeDocumentation(line, PERSONAL_PATH_RE, 'personal-path')) { + continue + } + hits.push({ + lineNumber: i + 1, + line, + suggested: suggestPlaceholder(line), + }) + } + return hits +} + +// ── Secret scanners ──────────────────────────────────────────────── + +const SOCKET_API_KEY_RE = /sktsec_[a-zA-Z0-9_-]+/ +const AWS_KEY_RE = /(aws_access_key|aws_secret|\bAKIA[0-9A-Z]{16}\b)/i +const GITHUB_TOKEN_RE = /gh[ps]_[a-zA-Z0-9]{36}/ +const PRIVATE_KEY_RE = /-----BEGIN (RSA |EC |DSA )?PRIVATE KEY-----/ + +export const scanSocketApiKeys = (text: string): LineHit[] => { + const hits: LineHit[] = [] + const lines = text.split('\n') + for (let i = 0; i < lines.length; i++) { + const line = lines[i]! + if (SOCKET_API_KEY_RE.test(line)) { + hits.push({ lineNumber: i + 1, line }) + } + } + // Filter each hit directly so duplicate line text keeps its own + // line number (a find-by-text round-trip would assign the first + // occurrence's number to every duplicate). + return hits.filter(h => filterAllowedApiKeys([h.line]).length > 0) +} + +export const scanAwsKeys = (text: string): LineHit[] => { + const hits: LineHit[] = [] + const lines = text.split('\n') + for (let i = 0; i < lines.length; i++) { + const line = lines[i]! + if (AWS_KEY_RE.test(line)) { + hits.push({ lineNumber: i + 1, line }) + } + } + return hits +} + +export const scanGitHubTokens = (text: string): LineHit[] => { + const hits: LineHit[] = [] + const lines = text.split('\n') + for (let i = 0; i < lines.length; i++) { + const line = lines[i]!
+ if (GITHUB_TOKEN_RE.test(line)) { + hits.push({ lineNumber: i + 1, line }) + } + } + return hits +} + +export const scanPrivateKeys = (text: string): LineHit[] => { + const hits: LineHit[] = [] + const lines = text.split('\n') + for (let i = 0; i < lines.length; i++) { + const line = lines[i]! + if (PRIVATE_KEY_RE.test(line)) { + hits.push({ lineNumber: i + 1, line }) + } + } + return hits +} + +// ── npx/dlx scanner ──────────────────────────────────────────────── + +const NPX_DLX_RE = /\b(npx|pnpm dlx|yarn dlx)\b/ + +// Suggest the canonical replacement for a runtime npx/dlx call. +// Documentation contexts (comments, JSDoc) are exempt via +// looksLikeDocumentation(); we only ever land here for code lines, where +// the right swap is `pnpm exec` (since `pnpm` is the fleet's package +// manager) or `pnpm run` for script entries. +function suggestNpxReplacement(line: string): string { + return line + .replace(/\bpnpm dlx\b/g, 'pnpm exec') + .replace(/\byarn dlx\b/g, 'pnpm exec') + .replace(/\bnpx\b/g, 'pnpm exec') +} + +export const scanNpxDlx = (text: string): LineHit[] => { + const hits: LineHit[] = [] + const lines = text.split('\n') + for (let i = 0; i < lines.length; i++) { + const line = lines[i]! + if (!NPX_DLX_RE.test(line)) { + continue + } + if (looksLikeDocumentation(line, NPX_DLX_RE, 'npx')) { + continue + } + hits.push({ + lineNumber: i + 1, + line, + suggested: suggestNpxReplacement(line), + }) + } + return hits +} + +// ── Logger leak scanner ──────────────────────────────────────────── +// +// The fleet rule: source code uses `getDefaultLogger()` from +// `@socketsecurity/lib/logger`. Direct calls to `process.stderr.write`, +// `process.stdout.write`, `console.log`, `console.error`, `console.warn`, +// `console.info`, `console.debug` are blocked. Doc-context lines are +// exempt; lines carrying `# socket-hook: allow logger` are exempt too. 
+ +const LOGGER_LEAK_RE = + /\b(process\.std(?:err|out)\.write|console\.(?:log|error|warn|info|debug))\s*\(/ + +// Map each direct call to its lib-logger equivalent. process.stdout is +// closer to logger.info; process.stderr / console.error → logger.error; +// console.warn → logger.warn; console.info / console.log → logger.info; +// console.debug → logger.debug. +function suggestLoggerReplacement(line: string): string { + return line + .replace(/\bprocess\.stderr\.write\s*\(/g, 'logger.error(') + .replace(/\bprocess\.stdout\.write\s*\(/g, 'logger.info(') + .replace(/\bconsole\.error\s*\(/g, 'logger.error(') + .replace(/\bconsole\.warn\s*\(/g, 'logger.warn(') + .replace(/\bconsole\.info\s*\(/g, 'logger.info(') + .replace(/\bconsole\.debug\s*\(/g, 'logger.debug(') + .replace(/\bconsole\.log\s*\(/g, 'logger.info(') +} + +export const scanLoggerLeaks = (text: string): LineHit[] => { + const hits: LineHit[] = [] + const lines = text.split('\n') + for (let i = 0; i < lines.length; i++) { + const line = lines[i]! 
+ if (!LOGGER_LEAK_RE.test(line)) { + continue + } + if (looksLikeDocumentation(line, LOGGER_LEAK_RE, 'logger')) { + continue + } + hits.push({ + lineNumber: i + 1, + line, + suggested: suggestLoggerReplacement(line), + }) + } + return hits +} + +// ── AI attribution scanner ───────────────────────────────────────── + +const AI_ATTRIBUTION_RE = + /(Generated with.*(Claude|AI)|Co-Authored-By: Claude|Co-Authored-By: AI|🤖 Generated|AI generated|@anthropic\.com|Assistant:|Generated by Claude|Machine generated|Claude Code)/i + +export const containsAiAttribution = (text: string): boolean => + AI_ATTRIBUTION_RE.test(text) + +export const stripAiAttribution = ( + text: string, +): { cleaned: string; removed: number } => { + const lines = text.split('\n') + const kept: string[] = [] + let removed = 0 + for (const line of lines) { + if (AI_ATTRIBUTION_RE.test(line)) { + removed++ + } else { + kept.push(line) + } + } + return { cleaned: kept.join('\n'), removed } +} + +// ── File classification ──────────────────────────────────────────── + +// Files we never scan: hooks themselves, husky shims, test fixtures. +const SKIP_FILE_RE = + /\.(test|spec)\.(m?[jt]s|tsx?|cts|mts)$|\.example$|\/test\/|\/tests\/|fixtures\/|\.git-hooks\/|\.husky\/|node_modules\/|pnpm-lock\.yaml/ + +export const shouldSkipFile = (filePath: string): boolean => + SKIP_FILE_RE.test(filePath) + +// Returns file content as a string. For binaries, runs `strings` to +// extract printable byte sequences (catches paths embedded in WASM +// or other compiled artifacts). +export const readFileForScan = (filePath: string): string => { + if (!existsSync(filePath)) { + return '' + } + try { + if (statSync(filePath).isDirectory()) { + return '' + } + } catch { + return '' + } + // Detect binary via grep -I (matches text-only); if grep says + // binary, fall back to `strings`. + const grepResult = spawnSync('grep', ['-qI', '', filePath]) + if (grepResult.status === 0) { + // Text file. 
+ try { + return readFileSync(filePath, 'utf8') + } catch { + return '' + } + } + // Binary — extract strings. + const stringsResult = spawnSync('strings', [filePath], { + encoding: 'utf8', + }) + return stringsResult.stdout || '' +} + +// ── Git wrappers ─────────────────────────────────────────────────── + +export const git = (...args: string[]): string => { + const result = spawnSync('git', args, { encoding: 'utf8' }) + return result.stdout.trim() +} + +export const gitLines = (...args: string[]): string[] => { + const out = git(...args) + return out ? out.split('\n') : [] +} diff --git a/.git-hooks/_helpers.sh b/.git-hooks/_helpers.sh deleted file mode 100644 index 15e9a4083..000000000 --- a/.git-hooks/_helpers.sh +++ /dev/null @@ -1,43 +0,0 @@ -#!/bin/bash -# Shared helpers for git hooks. -# Sourced by .git-hooks/commit-msg, pre-commit, pre-push. -# -# Constants -# --------- -# ALLOWED_PUBLIC_KEY Real public API key shipped in socket-lib test -# fixtures. Safe to appear in commits anywhere. -# FAKE_TOKEN_MARKER Substring marker used in fleet test fixtures. -# FAKE_TOKEN_LEGACY Legacy lib-scoped marker — accepted during the -# rename from `socket-lib-test-fake-token` to -# `socket-test-fake-token`. Drop when socket-lib's -# fixture rename PR lands. -# SOCKET_SECURITY_ENV Env var name used in shell examples; not a token. -# -# Functions -# --------- -# filter_allowed_api_keys Reads stdin, drops allowlist matches (public -# key, fake-token markers, env var name, -# `.example` paths), prints the rest. 
-# -# Colors -# ------ -# RED, GREEN, YELLOW, NC - -# shellcheck disable=SC2034 # constants sourced by other hooks -ALLOWED_PUBLIC_KEY="sktsec_t_--RAN5U4ivauy4w37-6aoKyYPDt5ZbaT5JBVMqiwKo_api" -FAKE_TOKEN_MARKER="socket-test-fake-token" -FAKE_TOKEN_LEGACY="socket-lib-test-fake-token" -SOCKET_SECURITY_ENV="SOCKET_SECURITY_API_KEY=" - -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -NC='\033[0m' - -filter_allowed_api_keys() { - grep -v "$ALLOWED_PUBLIC_KEY" \ - | grep -v "$FAKE_TOKEN_MARKER" \ - | grep -v "$FAKE_TOKEN_LEGACY" \ - | grep -v "$SOCKET_SECURITY_ENV" \ - | grep -v '\.example' -} diff --git a/.git-hooks/commit-msg b/.git-hooks/commit-msg deleted file mode 100755 index 7acf4c56b..000000000 --- a/.git-hooks/commit-msg +++ /dev/null @@ -1,90 +0,0 @@ -#!/bin/bash -# Socket Security Commit-msg Hook -# Additional security layer - validates commit even if pre-commit was bypassed. - -set -e - -# shellcheck source=./_helpers.sh -. "$(dirname "$0")/_helpers.sh" - -ERRORS=0 - -# Get files in this commit (for security checks). -COMMITTED_FILES=$(git diff --cached --name-only --diff-filter=ACM 2>/dev/null || printf "\n") - -# Quick checks for critical issues in committed files. -if [ -n "$COMMITTED_FILES" ]; then - for file in $COMMITTED_FILES; do - if [ -f "$file" ]; then - # Check for Socket API keys (except allowed). - if grep -E 'sktsec_[a-zA-Z0-9_-]+' "$file" 2>/dev/null | filter_allowed_api_keys | grep -q .; then - printf "${RED}✗ SECURITY: Potential API key detected in commit!${NC}\n" - printf "File: %s\n" "$file" - ERRORS=$((ERRORS + 1)) - fi - - # Check for .env files. - if echo "$file" | grep -qE '^\.env(\.[^/]+)?$' && ! echo "$file" | grep -qE '^\.env\.(example|test)$'; then - printf "${RED}✗ SECURITY: .env file in commit!${NC}\n" - ERRORS=$((ERRORS + 1)) - fi - fi - done -fi - -# Block Linear issue references in the commit message. -# Linear tracking lives in Linear; keep commit history tool-agnostic. 
-# Team keys enumerated from the Socket workspace. PATCH listed before PAT so -# the engine matches the longer prefix first on strings like "PATCH-123". -COMMIT_MSG_FILE="$1" -LINEAR_TEAM_KEYS='ASK|AUTO|BOT|CE|CORE|DAT|DES|DEV|ENG|INFRA|LAB|MAR|MET|OPS|PAR|PATCH|PAT|PLAT|REA|SALES|SBOM|SEC|SMO|SUP|TES|TI|WEB' -if [ -f "$COMMIT_MSG_FILE" ]; then - LINEAR_HITS=$(grep -vE '^#' "$COMMIT_MSG_FILE" 2>/dev/null \ - | grep -oE "(^|[^A-Za-z0-9_])($LINEAR_TEAM_KEYS)-[0-9]+($|[^A-Za-z0-9_])|linear\.app/[A-Za-z0-9/_-]+" \ - | head -5 || true) - if [ -n "$LINEAR_HITS" ]; then - printf "${RED}✗ Commit message references Linear issue(s):${NC}\n" - printf '%s\n' "$LINEAR_HITS" | sed 's/^/ /' - printf "${RED}Linear tracking lives in Linear. Remove the reference from the commit message.${NC}\n" - ERRORS=$((ERRORS + 1)) - fi -fi - -# Auto-strip AI attribution from commit message. -if [ -f "$COMMIT_MSG_FILE" ]; then - # Create a temporary file to store the cleaned message. - TEMP_FILE=$(mktemp) || { - printf "${RED}✗ Failed to create temporary file${NC}\n" >&2 - exit 1 - } - # Ensure cleanup on exit - trap 'rm -f "$TEMP_FILE"' EXIT - REMOVED_LINES=0 - - # Read the commit message line by line and filter out AI attribution. - while IFS= read -r line || [ -n "$line" ]; do - # Check if this line contains AI attribution patterns. - if echo "$line" | grep -qiE "(Generated with|Co-Authored-By: Claude|Co-Authored-By: AI|🤖 Generated|AI generated|Claude Code|@anthropic|Assistant:|Generated by Claude|Machine generated)"; then - REMOVED_LINES=$((REMOVED_LINES + 1)) - else - # Line doesn't contain AI attribution, keep it. - printf '%s\n' "$line" >> "$TEMP_FILE" - fi - done < "$COMMIT_MSG_FILE" - - # Replace the original commit message with the cleaned version. - if [ $REMOVED_LINES -gt 0 ]; then - mv "$TEMP_FILE" "$COMMIT_MSG_FILE" - printf "${GREEN}✓ Auto-stripped${NC} $REMOVED_LINES AI attribution line(s) from commit message\n" - else - # No lines were removed, just clean up the temp file. 
- rm -f "$TEMP_FILE" - fi -fi - -if [ $ERRORS -gt 0 ]; then - printf "${RED}✗ Commit blocked by security validation${NC}\n" - exit 1 -fi - -exit 0 diff --git a/.git-hooks/commit-msg.mts b/.git-hooks/commit-msg.mts new file mode 100755 index 000000000..ae9186e19 --- /dev/null +++ b/.git-hooks/commit-msg.mts @@ -0,0 +1,90 @@ +#!/usr/bin/env node +// Socket Security Commit-msg Hook +// +// Two responsibilities: +// 1. Block commits that introduce API keys / .env files (security +// layer that runs even when pre-commit is bypassed via +// `--no-verify`). +// 2. Auto-strip AI attribution lines from the commit message before +// git records the commit. +// +// Wired via .husky/commit-msg, which invokes this with the path to the +// commit message file as argv[2] (after the script path itself). + +import { existsSync, readFileSync, writeFileSync } from 'node:fs' + +import { basename } from 'node:path' +import process from 'node:process' + +import { + err, + gitLines, + green, + out, + red, + readFileForScan, + scanSocketApiKeys, + shouldSkipFile, + stripAiAttribution, +} from './_helpers.mts' + +const main = (): number => { + let errors = 0 + const committedFiles = gitLines( + 'diff', + '--cached', + '--name-only', + '--diff-filter=ACM', + ) + + for (const file of committedFiles) { + if (!file || shouldSkipFile(file)) { + continue + } + const text = readFileForScan(file) + if (!text) { + continue + } + + // Socket API keys (allowlist-aware). + const apiHits = scanSocketApiKeys(text) + if (apiHits.length > 0) { + out(red('✗ SECURITY: Potential API key detected in commit!')) + out(`File: ${file}`) + errors++ + } + + // .env files at any depth — allow only .env.example, .env.test, + // .env.precommit (templates / tracked placeholders). 
+ const base = basename(file) + if ( + /^\.env(\.[^/]+)?$/.test(base) && + !/^\.env\.(example|test|precommit)$/.test(base) + ) { + out(red('✗ SECURITY: .env file in commit!')) + out(`File: ${file}`) + errors++ + } + } + + // Auto-strip AI attribution lines from the commit message. + const commitMsgFile = process.argv[2] + if (commitMsgFile && existsSync(commitMsgFile)) { + const original = readFileSync(commitMsgFile, 'utf8') + const { cleaned, removed } = stripAiAttribution(original) + if (removed > 0) { + writeFileSync(commitMsgFile, cleaned) + out( + `${green('✓ Auto-stripped')} ${removed} AI attribution line(s) from commit message`, + ) + } + } + + if (errors > 0) { + err(red('✗ Commit blocked by security validation')) + return 1 + } + return 0 +} + +process.exit(main()) diff --git a/.git-hooks/pre-commit.mts b/.git-hooks/pre-commit.mts new file mode 100755 index 000000000..d4b885a4a --- /dev/null +++ b/.git-hooks/pre-commit.mts @@ -0,0 +1,264 @@ +#!/usr/bin/env node +// Socket Security Pre-commit Hook +// +// Local-defense layer: scans staged files for sensitive content +// before git records the commit. Mandatory enforcement re-runs in +// pre-push for the final gate. +// +// Bypassable: --no-verify skips this hook entirely. Use sparingly +// (hotfixes, history operations, pre-build states). + +import { basename } from 'node:path' +import process from 'node:process' + +import { + err, + gitLines, + green, + out, + red, + readFileForScan, + scanAwsKeys, + scanGitHubTokens, + scanLoggerLeaks, + scanNpxDlx, + scanPersonalPaths, + scanPrivateKeys, + scanSocketApiKeys, + shouldSkipFile, + yellow, +} from './_helpers.mts' + +const main = (): number => { + out(green('Running Socket Security checks...')) + const stagedFiles = gitLines( + 'diff', + '--cached', + '--name-only', + '--diff-filter=ACM', + ) + if (stagedFiles.length === 0) { + out(green('✓ No files to check')) + return 0 + } + + let errors = 0 + + // .DS_Store files. 
+ out('Checking for .DS_Store files...') + const dsStores = stagedFiles.filter(f => f.includes('.DS_Store')) + if (dsStores.length > 0) { + out(red('✗ ERROR: .DS_Store file detected!')) + dsStores.forEach(f => out(f)) + errors++ + } + + // Log files (ignore test logs). + out('Checking for log files...') + const logs = stagedFiles.filter( + f => f.endsWith('.log') && !/test.*\.log$/.test(f), + ) + if (logs.length > 0) { + out(red('✗ ERROR: Log file detected!')) + logs.forEach(f => out(f)) + errors++ + } + + // .env files at any depth — allow only .env.example, .env.test, + // .env.precommit (templates / tracked placeholders). Match the + // commit-msg.mts behavior: a nested .env.local is just as much a + // leak as a root-level one. basename() catches both. + out('Checking for .env files...') + const envFiles = stagedFiles.filter(f => { + const base = basename(f) + return ( + /^\.env(\.[^/]+)?$/.test(base) && + !/^\.env\.(example|test|precommit)$/.test(base) + ) + }) + if (envFiles.length > 0) { + out(red('✗ ERROR: .env file detected!')) + envFiles.forEach(f => out(f)) + out( + 'These files should never be committed. Use .env.example for templates.', + ) + errors++ + } + + // Hardcoded personal paths. 
+  out('Checking for hardcoded personal paths...')
+  for (const file of stagedFiles) {
+    if (shouldSkipFile(file)) {
+      continue
+    }
+    const text = readFileForScan(file)
+    if (!text) {
+      continue
+    }
+    const hits = scanPersonalPaths(text)
+    if (hits.length > 0) {
+      out(red(`✗ ERROR: Hardcoded personal path found in: ${file}`))
+      for (const h of hits.slice(0, 3)) {
+        out(`${h.lineNumber}: ${h.line.trim()}`)
+        if (h.suggested && h.suggested !== h.line) {
+          out(`  fix: ${h.suggested.trim()}`)
+        }
+      }
+      out(
+        'Replace with `<username>` placeholders, an env var ' +
+          '(`$HOME`, `${USER}`), or — for documentation lines that need ' +
+          'the literal username form — append the marker ' +
+          '`# zizmor: documentation-placeholder`.',
+      )
+      errors++
+    }
+  }
+
+  // Socket API keys (warning, not blocking).
+  out('Checking for API keys...')
+  for (const file of stagedFiles) {
+    if (shouldSkipFile(file)) {
+      continue
+    }
+    const text = readFileForScan(file)
+    if (!text) {
+      continue
+    }
+    const hits = scanSocketApiKeys(text)
+    if (hits.length > 0) {
+      out(yellow(`⚠ WARNING: Potential API key found in: ${file}`))
+      hits.slice(0, 3).forEach(h => out(`${h.lineNumber}:${h.line.trim()}`))
+      out('If this is a real API key, DO NOT COMMIT IT.')
+    }
+  }
+
+  // Other secret patterns (AWS, GitHub, private keys).
+ out('Checking for potential secrets...') + for (const file of stagedFiles) { + if (shouldSkipFile(file)) { + continue + } + const text = readFileForScan(file) + if (!text) { + continue + } + + const aws = scanAwsKeys(text) + if (aws.length > 0) { + out(red(`✗ ERROR: Potential AWS credentials found in: ${file}`)) + aws.slice(0, 3).forEach(h => out(`${h.lineNumber}:${h.line.trim()}`)) + errors++ + } + + const gh = scanGitHubTokens(text) + if (gh.length > 0) { + out(red(`✗ ERROR: Potential GitHub token found in: ${file}`)) + gh.slice(0, 3).forEach(h => out(`${h.lineNumber}:${h.line.trim()}`)) + errors++ + } + + const pk = scanPrivateKeys(text) + if (pk.length > 0) { + out(red(`✗ ERROR: Private key found in: ${file}`)) + errors++ + } + } + + // npx/dlx usage. + out('Checking for npx/dlx usage...') + for (const file of stagedFiles) { + if ( + file.includes('node_modules/') || + file.endsWith('pnpm-lock.yaml') || + file.includes('.git-hooks/') || + // CHANGELOG entries discuss npx ecosystem *behavior* (cache + // semantics, naming conventions) as historical documentation — + // they're not commands. Skip the npx/dlx scan for changelogs. + file === 'CHANGELOG.md' || + file.endsWith('/CHANGELOG.md') + ) { + continue + } + const text = readFileForScan(file) + if (!text) { + continue + } + const hits = scanNpxDlx(text) + if (hits.length > 0) { + out(red(`✗ ERROR: npx/dlx usage found in: ${file}`)) + for (const h of hits.slice(0, 3)) { + out(`${h.lineNumber}: ${h.line.trim()}`) + if (h.suggested && h.suggested !== h.line) { + out(` fix: ${h.suggested.trim()}`) + } + } + out( + "Use 'pnpm exec ' or 'pnpm run