Create force-multiplier delivery skill #153
Plugin Validation Summary —
| Check | Result |
|---|---|
| plugin.json manifest (name, version, required fields, valid JSON) | ✅ name: bitwarden-delivery-tools, version: 2.1.0 (valid semver), author/homepage/repository/keywords well-formed |
| Version consistency | ✅ 2.1.0 agrees across plugin.json:3, .claude-plugin/marketplace.json entry, and CHANGELOG.md:8 |
| Marketplace description parity | ✅ marketplace description matches plugin.json |
| CHANGELOG (Keep a Changelog) | ✅ New [2.1.0] - 2026-07-01 entry under ### Added documents both the skill and the eval set |
| Directory / auto-discovery | ✅ Skill at skills/force-multiplier/SKILL.md; references/, examples/, evals/ present |
| Referenced-file integrity | ✅ Every path cited in SKILL.md exists (all 5 references + 3 examples) |
| File organization | ✅ Plugin README.md present and updated (README.md:41); no stray files (.DS_Store, logs, node_modules) |
| Hardcoded credentials | ✅ None |
Informational (not a defect): skills/force-multiplier/evals/evals.json line 35 (eval id 3) self-reports with-skill 6/9 vs baseline 7/9 on the "no-false-negatives / SELECT" expectation, flagged by the authors as a candidate for a stronger SELECT steer. Honest self-reporting, not a validation failure.
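The version-consistency row in the table above could be automated along these lines. This is a minimal sketch, not the repository's validation script: the marketplace layout (`{"plugins": [{"name", "version"}]}`), the function name, and the simplified MAJOR.MINOR.PATCH pattern (no pre-release or build metadata) are all assumptions made for illustration.

```python
import re

# Simplified semver: MAJOR.MINOR.PATCH only (assumption; no pre-release tags).
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")
# Keep a Changelog release headings look like "## [2.1.0] - 2026-07-01".
CHANGELOG_HEADING = re.compile(r"^## \[(\d+\.\d+\.\d+)\]", re.MULTILINE)


def check_version_consistency(plugin_manifest, marketplace, changelog_text):
    """Return a list of problems; an empty list means all three versions agree."""
    problems = []
    name = plugin_manifest.get("name")
    version = plugin_manifest.get("version", "")
    if not SEMVER.match(version):
        problems.append(f"plugin.json version {version!r} is not MAJOR.MINOR.PATCH")

    # Assumed marketplace shape: {"plugins": [{"name": ..., "version": ...}]}
    entry = next(
        (p for p in marketplace.get("plugins", []) if p.get("name") == name), None
    )
    if entry is None:
        problems.append(f"no marketplace entry for {name!r}")
    elif entry.get("version") != version:
        problems.append(
            f"marketplace has {entry.get('version')!r}, plugin.json has {version!r}"
        )

    # The first numbered heading is the newest release; "[Unreleased]" is skipped.
    match = CHANGELOG_HEADING.search(changelog_text)
    latest = match.group(1) if match else None
    if latest != version:
        problems.append(f"CHANGELOG latest is {latest!r}, plugin.json has {version!r}")
    return problems
```

Returning a list of problems rather than a boolean keeps the report readable when more than one file has drifted.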
2. Skill Review (skill-reviewer) — force-multiplier/SKILL.md
Status: PASS — 0 critical, 0 major, 2 minor (both optional).
- Frontmatter: All fields valid: `name` matches directory; `description`, `when_to_use`, `argument-hint`, `allowed-tools` all present and well-formed. `allowed-tools` correctly declares the `Skill(...)` teaming dependencies used in the body.
- Description quality: Third-person/infinitive form, concrete, ~180 chars. Strong, specific trigger phrasings live in `when_to_use` ("across all repos", "fleet-wide", "mass update", etc.).
- Content: 1,244 words (within the 1,000–3,000 target); consistently imperative/infinitive writing; clear logical flow.
- Progressive disclosure: Exemplary. Lean core; detailed mechanics pushed to `references/pipeline.md`, schema to `references/campaign-spec.md`, safety to `references/safety-and-self-checks.md`, deferred scope to `references/deferred.md`, discovery to `references/finding-targets.md`; worked instances in `examples/`. All referenced files exist and are non-trivial.
Minor / warnings (should-fix, non-blocking):
- `SKILL.md:11`: `examples/` is referenced only as a directory, never by individual filename. Naming a representative example (e.g. `npm-to-pnpm.md`) would sharpen model discovery. Suggestion, not a fix.
- `SKILL.md:4`: `when_to_use` is a deprecated field under current skill-authoring guidance (trigger phrasings are meant to fold into `description`); the field is still honored and the content is excellent. Low-priority stylistic call: consolidating would make `description` long.
Highlights: Strong safety framing (the "Check yourself, Claude" reality-check gate, non-negotiable safety defaults, treating target content as untrusted data for prompt-injection defense); reconciliation arithmetic that prevents silent target loss; clean teaming model resolving the interactive creating-pull-request dependency once at pilot.
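The PR does not spell out the reconciliation arithmetic mentioned in the highlights, so the following is only a sketch of the general idea: every target chosen at SELECT must end in exactly one terminal outcome, so nothing can drop out unnoticed. The status labels and the `reconcile` function are hypothetical, not taken from the skill.

```python
from collections import Counter

# Hypothetical status labels; the skill's real vocabulary is not shown in this PR.
TERMINAL_STATUSES = {"pr_opened", "skipped_already_done", "failed"}


def reconcile(selected, outcomes):
    """Compare the SELECT set against the per-target outcomes.

    selected: iterable of target names chosen at SELECT.
    outcomes: dict mapping target name -> terminal status string.
    """
    selected_set = set(selected)
    reported_set = set(outcomes)
    return {
        # Selected but never reported: the silent-loss case.
        "missing": sorted(selected_set - reported_set),
        # Reported but never selected: the campaign touched something it should not have.
        "unexpected": sorted(reported_set - selected_set),
        "unknown_status": sorted(
            t for t, s in outcomes.items() if s not in TERMINAL_STATUSES
        ),
        "counts": dict(Counter(outcomes.values())),
        "balanced": selected_set == reported_set
        and all(s in TERMINAL_STATUSES for s in outcomes.values()),
    }
```

When `balanced` is true, the per-status counts sum to the number of selected targets, which is the property a final report would need to assert.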
3. Security Validation (reviewing-claude-config)
Status: PASS — no security issues.
- Committed secrets / hardcoded credentials: ✅ None. A full-plugin sweep for GitHub/Slack/AWS tokens and PEM private keys returned no matches. Every `secret`/`token`/`credential` hit is documentation of the skill's own secrets-scanning safety controls (e.g. `safety-and-self-checks.md:63-71`, `SKILL.md:69-71`). The `AKIA[0-9A-Z]{16}` string at `safety-and-self-checks.md:65` is part of a recommended secrets-detection grep pattern, not a credential.
- Settings / permission scoping: ✅ No `settings.json` / `settings.local.json` / hooks files changed; nothing to scope.
- Dangerous auto-approvals / broad file access: ✅ None. `allowed-tools` (Bash, Read, Write, Edit, Glob, Grep, Agent, Skill(...)) is appropriate for a fan-out skill and scoped to the declared teaming skills.
- Security posture of the skill itself is a positive: treats all target-system content as untrusted data (CWE-1427 prompt-injection defense), enforces least privilege, per-target isolation, secrets-scan before every commit, draft-PRs-only, never force-push or auto-merge, and reuses sanctioned `gh` auth rather than injecting credentials.
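To illustrate why the `AKIA[0-9A-Z]{16}` string is a detection pattern rather than a credential, here is a small sketch of a line-based secrets scan. Only the AWS pattern comes from the review above; the PEM header pattern, the pattern names, and the `scan_for_secrets` function are assumptions added for the example, not the skill's actual scanner.

```python
import re

SECRET_PATTERNS = {
    # Pattern quoted in the review (safety-and-self-checks.md:65).
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    # Assumed pattern for PEM private key headers (not taken from the skill).
    "pem_private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
}


def scan_for_secrets(text):
    """Return (line_number, pattern_name) pairs for every suspicious line."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits
```

A line that merely documents the pattern, such as `grep -E "AKIA[0-9A-Z]{16}"`, does not match, because the character after `AKIA` is `[` rather than an uppercase letter or digit.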
Verdict
No must-fix errors. Two optional stylistic suggestions on SKILL.md (name a representative example; when_to_use deprecation). The force-multiplier addition meets all structural, versioning, changelog, referenced-file, skill-quality, and security requirements. Approved.
Note: scripts/validate-plugin-structure.sh and scripts/validate-marketplace.sh could not be executed in this sandbox (require approval); their checks were performed manually via file inspection and confirmed passing.
`@@ -0,0 +1,46 @@`
`# Example — Migrate the fleet from npm to pnpm`
❓ Did you use https://bitwarden.atlassian.net/browse/PM-35701?focusedCommentId=157547 at all? That was my attempt at helping show a prompt.
No, I did not. I pointed Claude at the https://pnpm.io/ website and let it go. I had Claude put a note here in the example. Perhaps in a follow-up commit I'll have it go through the comment e2e and see what comes of it.
# Conflicts: # README.md
…creator + evals looping.
🎟️ Tracking
N/A
📔 Objective
Adds `force-multiplier` to `bitwarden-delivery-tools`: a skill for applying one intent across many targets as N consistent, idempotent, SSH-signed draft PRs across the Bitwarden GitHub enterprise.

The central design principle: the checks are the product. Making the edit is trivial. Making 74 edits provably correct, consistent, and reversible (and being able to prove what was done) is the actual hard part. The skill is built around that proof.

What ships:

- `SKILL.md`: a lean 71-line playbook built on a fixed pipeline, `SELECT → CHECK YOURSELF → PILOT → FAN-OUT → REPORT → REMEDIATE`. The skill never freestyles across the fleet; it compiles any user intent into a structured campaign spec and executes it deterministically.
- `references/campaign-spec.md`: generic campaign schema (intent + target selector + recipe + validation + PR spec + safety policy)
- `references/finding-targets.md`: discovery library keyed by signal type (file-existence, content match, manifest inspection, topic/language filter), grounded in `gh ... --owner bitwarden`
- `references/pipeline.md`: per-stage mechanics (shallow-clone isolation, validation gates, idempotency rules, rate-limit handling, dry-run semantics)
- `references/safety-and-self-checks.md`: the full checks-and-balances model, covering three reality-check brackets (before / on one / after), the mandatory pilot gate, the per-target second pass, and the reference-check pre-step for destructive recipes
- `examples/workflow-edit.md`, `examples/npm-to-pnpm.md`, `examples/settings-json-patch.md`: fully-worked campaigns; named-change specifics live here, never in the skill body

The skill is deliberately generic: a compiled engine, not a catalogue. It composes with the existing delivery skills rather than reinventing them: `perform-preflight → committing-changes → labeling-changes → creating-pull-request`, per target, in that order.

Plugin README updated to include `force-multiplier` in the Mechanics skill table.

🧪 Testing
Executed end-to-end against a live AppSec remediation: VULN-649, `[VULN-649] ci: Remove deprecated scan workflow.`

- `.github/workflows/scan.yml` via per-repo contents API (not flaky code search)
- `bitwarden/admin`: branch cut, `git rm`, full diff read line by line, PR spec locked before any fan-out
- `git rm`, second-pass diff-shape check (exactly one deleted file; anything more stops that target), secrets-scan, commit, draft PR
- `Mick Letofsky <mletofsky@bitwarden.com>` via Secure Enclave, no Claude attribution

Total human input after confirming the pilot diff: one thumb on Touch ID per commit.
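The second-pass diff-shape check described above (exactly one deleted file, anything more stops the target) might be sketched as follows. This is an illustration under assumptions, not the skill's implementation: it supposes the staged change is summarised with `git diff --cached --name-status`, which prints one tab-separated `STATUS<TAB>path` line per changed file, and the function name is hypothetical.

```python
def diff_shape_ok(name_status_output, expected_path):
    """Return True only if the staged diff is exactly one deletion of expected_path.

    name_status_output: text captured from `git diff --cached --name-status`.
    Any extra entry, a modification instead of a deletion, or an empty diff fails.
    """
    entries = [
        line.split("\t")
        for line in name_status_output.splitlines()
        if line.strip()
    ]
    return entries == [["D", expected_path]]
```

Comparing against the full expected list, instead of searching for the deletion among the entries, is what makes an unexpected extra change fail the check.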