Test engineer plugin by nthompson-bitwarden · Pull Request #150 · bitwarden/ai-plugins

nthompson-bitwarden · 2026-06-22T21:25:10Z

🎟️ Tracking

https://bitwarden.atlassian.net/browse/QA-1983

📔 Objective

This branch introduces a new bitwarden-test-engineer plugin to the marketplace -a test-engineering toolkit whose first role is the test-strategist agent.

Given a change (feature, bugfix, refactor, or migration), the agent recommends what to test, at which layer, and why, shaped to each repo's actual test practice. It does test planning only - it does not author, run, or maintain tests.

Registration - new entry in .claude-plugin/marketplace.json, plugin.json, CHANGELOG.md, README, and cspell additions.
One agent (test-strategist) - classifies inputs (Jira ticket via Atlassian MCP, GitHub PR via gh, test-case CSV, or plain description), fans out subagents to gather evidence, then runs two skills in sequence.
Two skills:
- assessing-test-coverage - backward-looking inventory of what's already tested, buckets tests by layer, cites each as a GitHub permalink, flags gaps, emits an HTML coverage report.
- analyzing-test-stack - maps each behavior to the cheapest sufficient test layer (unit/integration/E2E) per repo's shape, names concrete tooling, surfaces shape-wrong tests (ice-cream-cone, over-testing), emits an HTML report.
Supporting references/scripts - report templates, CSS/style tokens, monorepo-layout and test-layer guidance, severity-risk model, and a build-report.sh script.

Key design principle: each behavior is tested at the cheapest layer that buys the needed confidence, with layer weighting decided per repo (unit-heavy pyramid for server/clients, integration/snapshot trophy for ios, all-E2E for the dedicated private test repo). Atlassian integration is optional with graceful degradation.

…rcing trophy

github-actions · 2026-06-22T21:25:33Z

Plugin Validation Summary — `bitwarden-test-engineer` (PR #150)

Overall result: ✅ PASS. All three validation passes succeeded with zero critical and zero major issues. Only minor, advisory warnings remain — none blocking.

Validation	Tool	Result
Plugin structure & manifest	plugin-validator (plugin-dev)	✅ Pass
Skill quality	skill-reviewer (plugin-dev)	✅ Pass
Security	reviewing-claude-config (claude-config-validator)	✅ Pass

Components validated: 1 agent (test-strategist), 2 skills (analyzing-test-stack, assessing-test-coverage), shared references/ + scripts/, manifest, README, and CHANGELOG. Version 1.0.0 is consistent across plugin.json, marketplace.json, and agents/test-strategist.md.

Note: the repo's validate-plugin-structure.sh / validate-marketplace.sh scripts were not run in this environment (require command approval). Their checks — manifest fields, frontmatter, version/name consistency, changelog format — were instead covered by the plugin-validator agent below.

1. Plugin Structure (plugin-validator) — ✅ PASS

Manifest (.claude-plugin/plugin.json): valid JSON; name kebab-case; version valid semver 1.0.0; description, author, homepage, repository, keywords well-formed. No invalid/unknown fields.
Agent (agents/test-strategist.md): name test-strategist (lowercase/hyphenated, within 3–50); description carries 4 well-formed <example> blocks with Context/user/assistant/<commentary>; model: inherit valid; color: green valid; substantial system prompt; version: 1.0.0 present and consistent.
Skills: both SKILL.md files have valid frontmatter, names matching their directories, and scoped allowed-tools. All references/ files resolve.
Hooks / MCP: none defined (correct). The plugin consumes bitwarden-atlassian-tools' MCP server via mcp__plugin_bitwarden-atlassian-tools_* tool references, documented as an optional dependency with graceful degradation.
Cross-references: all support files resolve (input-sources.md, report-style-tokens.md, report-style.css, report-template-common.md, all skill references/, and scripts/build-report.sh — executable 0755).
CHANGELOG.md: Keep a Changelog format, references SemVer, single ## [1.0.0] - 2026-06-15 entry under ### Added, version matches manifest.
File organization: README present and comprehensive; no stray files (.DS_Store, node_modules, .env, etc.).

Minor / advisory (no action required):

agents/test-strategist.md — grants Write, Task, and scoped Bash(...). Appropriate for an orchestrator that fans out subagents and runs build-report.sh; every Bash entry is tightly scoped (e.g. Bash(gh pr view:*), Bash(${CLAUDE_PLUGIN_ROOT}/scripts/build-report.sh:*)), no blanket Bash. Flagged only so a reviewer confirms Write is intended (it is).
agents/test-strategist.md filename — the file is test-strategist.md, not AGENT.md. Auto-discovery accepts any *.md under agents/, so this is valid; however bump-plugin-version.sh targets AGENT.md (uppercase), so manually confirm the agent version field on the next version bump.

2. Skill Review (skill-reviewer) — ✅ PASS (both skills)

Both skills pass the core quality bar. Frontmatter valid; descriptions are third-person/imperative with specific trigger phrases and exemplary mutual disambiguation ("does NOT … for that, use the other skill") preventing the forward/backward-looking skills from cross-triggering. Progressive disclosure is textbook — lean cores, detail offloaded to references/, shared plugin-level references/ reused across both skills. All referenced files and named section anchors resolve. allowed-tools tightly scoped (read-only git/gh + the single build script).

Minor / warnings (should-fix / optional):

skills/assessing-test-coverage/references/finding-coverage.md:20,40 — bare relative reference to references/monorepo-layout.md, which actually lives in the sibling analyzing-test-stack skill. A reader following that path from assessing-test-coverage/references/ would not find it. Recommend qualifying it the way assessing-test-coverage/SKILL.md:27 does ("the analyzing-test-stack skill's references/monorepo-layout.md"). This is the only finding with real reader-confusion risk.
analyzing-test-stack/SKILL.md:3 (~660 chars) and assessing-test-coverage/SKILL.md:3 (~580 chars) — descriptions exceed the ~500-char guideline. Length is earned by disambiguation; trim or accept as a deliberate trade-off.
Body word counts (802 / 712) are below the 1,000-word soft target — acceptable given strong progressive disclosure.
Neither skill has an examples/ directory; a sample generated HTML report would aid the model (optional enhancement).

3. Security Review (reviewing-claude-config) — ✅ PASS

Secrets: no API keys, tokens, passwords, private keys, or hardcoded credentials in any changed file. (All "token" hits across the plugin refer to CSS style tokens and LLM token cost/pool — benign.)
No settings.local.json committed; no settings files in the changeset.
scripts/build-report.sh: hardened — set -euo pipefail; all inputs validated (--kind allowlist, --slug regex, --date regex, file-existence checks); CSS path resolved via BASH_SOURCE (portable, no hardcoded absolute paths); CSS spliced via awk line-by-line so no shell escaping touches content. No curl/wget/eval/nc//dev/tcp/base64 -d/rm -rf or other network or destructive operations.
Permission scoping: agent tools are least-privilege — all Bash(...) entries are command-prefix-scoped, the build script is referenced via ${CLAUDE_PLUGIN_ROOT} (portable), and MCP access is limited to the specific Atlassian read tools. No dangerous auto-approvals, no overly broad file access.
URLs: all HTTPS; no insecure http:// or WS endpoints.

Recommended actions (none blocking)

Should-fix: qualify the monorepo-layout.md references in skills/assessing-test-coverage/references/finding-coverage.md:20,40 to name the sibling skill.
Optional: trim the two skill descriptions toward ~500 chars; add examples/ with a sample report.
Reminder: on the next version bump, manually verify the version field in agents/test-strategist.md (the bump script targets AGENT.md, not this filename).

Verdict: APPROVE. No errors (must-fix) found; remaining items are warnings/optional.

github-actions · 2026-06-23T12:45:59Z

🤖 Bitwarden Claude Code Review

Overall Assessment: APPROVE

Reviewed the new bitwarden-test-engineer plugin: the test-strategist agent, two skills (assessing-test-coverage, analyzing-test-stack), shared reference docs, the report-style.css visual system, and the build-report.sh assembly script. Traced the build script's argument validation, CSS-splice, and combined-mode region/id-namespacing logic — the only executable code in the PR — and found it sound. Marketplace, plugin.json, README, and CHANGELOG are consistent at 1.0.0, and CI plugin validation reported PASS.

Code Review Details

No blocking findings.

Notes considered and dismissed:

The agent declares tools: as a YAML list with scoped Bash(...) patterns (e.g. Bash(gh pr view:*)). This matches the existing bitwarden-code-reviewer agent, which uses the same scoped-Bash form, so it is a supported, established pattern — not a defect.
build-report.sh validates --slug and --date against strict regexes before using them in paths, splices CSS via awk (no shell escaping of CSS), and gates combined-mode extraction on <body> to avoid matching element names inside CSS comments. No injection or correctness issue found.

withinfocus · 2026-06-23T14:24:33Z

Is there viability to factoring these in as skills for the existing software engineer? We want to unify test development here.

nthompson-bitwarden · 2026-06-23T15:48:46Z

@withinfocus

Is there viability to factoring these in as skills for the existing software engineer? We want to unify test development here.

I think that's possible but seems to go against the persona-per-plugin pattern. The two skills included don't necessarily need to be a part of this persona, but I think it fits pretty well. Note that these two skills are explicitly not targeting test development, but planning/strategy.

Software-engineer's description states, "collaborates with QA on testing questions" - so I could see it being a consumer or peer of testing, rather than it's owner. I don't think the reports these skills generate should be limited to software-engineers, either. A product-analyst could potentially make use of them, or any role that doesn't have or need to have the software-engineer plugin installed.

For test development specifically - I would think those Skill definitions would exist alongside each repo.

# Conflicts: # .cspell.json

nthompson-bitwarden · 2026-06-27T21:18:30Z

closing in favor of smaller iteration #152

nthompson-bitwarden added 11 commits June 16, 2026 10:39

add test-engineer-plugin with test stack analyst skills

25eea80

Merge branch 'main' into test-engineer-plugin

bebf863

split stack analysis from coverage report

471fd43

html report changes, defer to current test stack shape rather than fo…

be2db8d

…rcing trophy

clean up consistency issues

736b8a5

deduplication pass

0ca9c42

refine scopes, multiple debloat passes

f605de3

Merge branch 'main' into test-engineer-plugin

29ef658

update atlassian tool mcp refs

94d5698

make examples more generic

3e4a51a

update README.md

1fbffb8

resolve should-fix review comments

6336832

nthompson-bitwarden marked this pull request as ready for review June 23, 2026 12:45

nthompson-bitwarden requested a review from a team as a code owner June 23, 2026 12:45

Merge branch 'main' into test-engineer-plugin

4c946b8

# Conflicts: # .cspell.json

nthompson-bitwarden mentioned this pull request Jun 25, 2026

Test engineer plugin current coverage #152

Draft

nthompson-bitwarden closed this Jun 27, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Test engineer plugin#150

Test engineer plugin#150
nthompson-bitwarden wants to merge 13 commits into
mainfrom
test-engineer-plugin

nthompson-bitwarden commented Jun 22, 2026

Uh oh!

github-actions Bot commented Jun 22, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 23, 2026 •

edited

Loading

Uh oh!

withinfocus commented Jun 23, 2026

Uh oh!

nthompson-bitwarden commented Jun 23, 2026

Uh oh!

nthompson-bitwarden commented Jun 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

nthompson-bitwarden commented Jun 22, 2026

🎟️ Tracking

📔 Objective

Uh oh!

github-actions Bot commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Plugin Validation Summary — bitwarden-test-engineer (PR #150)

1. Plugin Structure (plugin-validator) — ✅ PASS

2. Skill Review (skill-reviewer) — ✅ PASS (both skills)

3. Security Review (reviewing-claude-config) — ✅ PASS

Recommended actions (none blocking)

Uh oh!

github-actions Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🤖 Bitwarden Claude Code Review

Uh oh!

withinfocus commented Jun 23, 2026

Uh oh!

nthompson-bitwarden commented Jun 23, 2026

Uh oh!

nthompson-bitwarden commented Jun 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions Bot commented Jun 22, 2026 •

edited

Loading

Plugin Validation Summary — `bitwarden-test-engineer` (PR #150)

github-actions Bot commented Jun 23, 2026 •

edited

Loading