OCPEDGE-2727: Add eval harness configs for cluster-diagnostic and threat-model skills by dhensel-rh · Pull Request #189 · openshift-eng/edge-tooling

dhensel-rh · 2026-06-11T18:41:27Z

Summary

Add eval configs for cluster-diagnostic and threat-model:tnf skills
Includes scenarios for validate, recovery-guide, and game modes (cluster-diagnostic) and PR analysis (threat-model-tnf)
Judges: severity classification, warning classification, game mode scoring, forbidden recommendations, procedure completeness, knowledge base accuracy, budget check
Sets REPORT_DIR in threat-model-tnf eval so reports write to the eval workspace
README frames evals as quality scoring, not testing

Depends on #188.
Replaces #178 (could not reopen after force-push).

Test plan

/eval-run --model claude-opus-4-6 --config evals/cluster-diagnostic.yaml
/eval-run --model claude-opus-4-6 --config evals/threat-model-tnf.yaml
Verify all file-based judges pass (reports written to workspace reports/ dir)
Run npx markdownlint-cli2 '**/*.md' to confirm no lint violations

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…templates Rename skill directories from *-threat-model to shorter names (e.g., tnf-threat-model → tnf) and extract duplicated report templates into a shared references/report-templates.md, reducing SKILL.md sizes by ~756 lines. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Allow the report output directory to be set externally via $REPORT_DIR, skipping workspace discovery for report path. Enables eval harnesses and CI to control where reports are written. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…eat-model skills

Add evaluation configs, test cases, and README for two skills:

- cluster-diagnostic: 5 cases covering validate and recovery-guide modes
- threat-model-tnf: 5 cases covering PR security analysis

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The eval harness resolves dataset.path from the repo root, not relative to the config file. Both configs were using short relative paths that broke when running from different working directories. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add case-006-game-quiz with quiz mode test case and answers
- Add warning_classification judge for expected WARNING findings
- Add game_mode_scoring judge for rating/score validation
- Fix forbidden_recommendations to check 'shutdown -h' (not 'shutdown -h 1')
- Update severity_classification description for clarity
- Drop models.skill default (let CLI --model flag control it)
- Simplify schema note to only exclude diagnose mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add execution.env.REPORT_DIR to threat-model-tnf.yaml so reports are written to the eval workspace instead of external paths
- Reframe README: scoring not testing, scenarios not test cases
- Update cluster-diagnostic case count to 6 (game mode added)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
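The `$REPORT_DIR` override described in the commits above amounts to a simple precedence rule: an externally supplied directory wins, and workspace discovery runs only when it is absent. A minimal Python sketch of that rule follows. The skill itself is defined in `SKILL.md`, not in Python, so `resolve_report_dir` and `discover_workspace_reports` are hypothetical names, and the `.claude` marker and `reports` subdirectory used by the fallback are assumptions made for illustration.

```python
import os
from pathlib import Path


def discover_workspace_reports(start: Path) -> Path:
    """Hypothetical stand-in for workspace discovery: walk up to a .claude dir."""
    for candidate in (start, *start.parents):
        if (candidate / ".claude").is_dir():
            return candidate / "reports"
    return start / "reports"


def resolve_report_dir(env=None, start=None) -> Path:
    """Prefer $REPORT_DIR; fall back to discovery when it is unset or empty."""
    env = os.environ if env is None else env
    start = Path.cwd() if start is None else start
    override = env.get("REPORT_DIR", "").strip()
    if override:
        # An explicit override skips discovery entirely.
        return Path(override)
    return discover_workspace_reports(start)
```

An eval harness or CI job would then only need to export `REPORT_DIR` before invoking the skill to keep reports inside its own workspace.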

openshift-ci · 2026-06-11T18:41:37Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dhensel-rh


Needs approval from an approver in each of these files:

~~OWNERS~~ [dhensel-rh]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-06-11T18:41:39Z

Walkthrough

This PR introduces a comprehensive threat modeling plugin for Claude Code that enables STRIDE/DFD threat analysis across TNF, TNA, SNO, and LVMS OpenShift edge topologies. It includes topology-specific skills with ShellCheck integration, MITRE ATT&CK and OWASP mapping, evaluation harnesses, and test cases for both cluster-diagnostic and threat-model-tnf skills.

Changes

Threat Model Plugin Infrastructure & Shared References

| Layer / File(s) | Summary |
|-----------------|---------|
| Plugin registration and documentation `.claude/skills/threat-model-*`, `plugins/threat-model/.claude-plugin/plugin.json`, `plugins/threat-model/README.md` | Four symlinks in `.claude/skills/` point to topology-specific skill implementations. Plugin manifest defines metadata (name, version, description, author, keywords). README documents plugin capabilities, invocation syntax, workspace requirements, dependencies (ShellCheck, gh CLI), and a reference table of bundled artifacts. |
| Shared threat analysis reference materials `plugins/threat-model/references/*` | MITRE ATT&CK quick reference (tactic tables, platform-specific technique sections for TNF/TNA/SNO/LVMS, DFD element mappings). OWASP Top 10:2025 reference with pattern-to-category mappings and component-specific mappings per topology. Standardized threat model and vulnerability report templates. MITRE findings tracker template with append-only protocol for per-skill findings files. |

Threat Model Skills for Each Topology

| Layer / File(s) | Summary |
|-----------------|---------|
| TNF threat model skill `plugins/threat-model/skills/tnf/SKILL.md`, `plugins/threat-model/skills/tnf/dfd-elements-tnf.md` | Complete Two-Node Fencing threat analysis skill. Documents workspace discovery, PR input parsing (PR number/URL/explicit repo+PR), ShellCheck integration, security pattern detection, TNF DFD element mapping (8 processes, 5 data stores, 12 data flows, 3 external entities, 6 trust boundaries), STRIDE evaluation per element, cross-referencing against formal threat model PE-* IDs, MITRE/OWASP mapping, and findings tracker updates. Includes credential flow path diagram and high-risk element listings. |
| TNA threat model skill `plugins/threat-model/skills/tna/SKILL.md`, `plugins/threat-model/skills/tna/dfd-elements-tna.md` | Complete Two-Node Arbiter threat analysis skill. Workspace discovery, input parsing, ShellCheck scanning, DFD element mapping (TNA-specific namespace with mapping tables), STRIDE per-element guidance, cross-referencing against formal `TNA-THREAT-MODEL.md` when present, MITRE/OWASP mapping, and tracker updates. Documents high-risk elements (MCO, etcd, worker kubelet) and explicitly lists TNF components absent from TNA. |
| SNO threat model skill `plugins/threat-model/skills/sno/SKILL.md`, `plugins/threat-model/skills/sno/dfd-elements-sno.md` | Complete Single Node OpenShift threat analysis skill. Workspace discovery, input parsing, ShellCheck integration, DFD element mapping, STRIDE evaluation, developer tooling risk analysis (disabled TLS, plaintext token caching, environment variable exposure, root containers, unpinned images), cross-referencing against formal threat model, MITRE/OWASP mapping, and findings tracker updates. |
| LVMS threat model skill `plugins/threat-model/skills/lvms/SKILL.md`, `plugins/threat-model/skills/lvms/dfd-elements-lvms.md` | Complete LVM Storage threat analysis skill with full procedural documentation. Workspace discovery, PR input parsing, ShellCheck, security pattern detection, MITRE/OWASP mapping, and findings tracking. DFD element catalog marked as "not yet defined" with placeholder structure (processes, data stores, data flows, trust boundaries, external entities) for future modeling work. |

Evaluation Harnesses & Test Cases

| Layer / File(s) | Summary |
|-----------------|---------|
| cluster-diagnostic evaluation `plugins/two-node/evals/README.md`, `plugins/two-node/evals/cluster-diagnostic.md`, `plugins/two-node/evals/cluster-diagnostic.yaml`, `plugins/two-node/evals/cluster-diagnostic/cases/*` | Evaluation harness for two-node cluster-diagnostic skill across validate/recovery-guide/game modes. Config defines judges for budget, severity/blocker classification, warning classification, procedure completeness (bash markers, verification language, parameter placeholders), forbidden recommendations detection, game mode scoring (Novice/Operator/Expert/TNF Master ratings), and knowledge-base accuracy (LLM-evaluated). Six test cases: sequential node shutdown (validate+reject), safe Redfish shutdown (validate), full shutdown recovery, standby recovery, pcs-based standby validation (validate+reject), and TNF knowledge quiz (game mode). |
| threat-model-tnf evaluation `plugins/two-node/evals/threat-model-tnf.md`, `plugins/two-node/evals/threat-model-tnf.yaml`, `plugins/two-node/evals/threat-model-tnf/cases/*` | Evaluation harness for threat-model-tnf skill. Config defines judges for budget, report existence, required markdown sections, DFD element reference presence (via regex), STRIDE matrix population (X/~/−/N/A markers), MITRE technique ID presence (T####), threat-analysis quality (LLM rubric with 1–5 scoring), and findings-tracker update verification. Five test cases: shell-script kubeconfig access to podman-etcd (shell scripts, trust boundary crossing, P7/DS5/DF11, T1552/T1078/T1005), credential rotation script (Critical/High/Medium, P5/DS2/DS3/DF4/DF7, T1552/T1059/T1529), MAC-address-based fencing lookup (High/Medium, P5/DS2/DF4, T1552/T1078), trivial nfsserver indentation fix (no findings expected), and TNF retry bugfix for dual-replica gating (Medium/Low, P4/P2, T1499). |

Sequence Diagram

```mermaid
sequenceDiagram
  participant User
  participant ThreatModelSkill
  participant ShellCheck
  participant PRAnalysis
  participant DFDMapper
  participant MITREMapper
  participant ReportGen
  User->>ThreatModelSkill: /threat-model:tnf PR_URL
  ThreatModelSkill->>ThreatModelSkill: discover workspace, resolve org/repo/PR
  ThreatModelSkill->>ThreatModelSkill: fetch PR diff via gh CLI
  ThreatModelSkill->>ShellCheck: scan shell scripts in diff
  ShellCheck-->>ThreatModelSkill: issues with MITRE mappings
  ThreatModelSkill->>PRAnalysis: detect security patterns (injection, creds, etc)
  PRAnalysis-->>ThreatModelSkill: pattern categories + severities
  ThreatModelSkill->>DFDMapper: map changes to DFD elements (P1-P8, DS1-DS5, etc)
  DFDMapper-->>ThreatModelSkill: element mappings with STRIDE per-element
  ThreatModelSkill->>MITREMapper: cross-reference elements to MITRE techniques
  MITREMapper-->>ThreatModelSkill: technique IDs + OWASP categories
  ThreatModelSkill->>ReportGen: generate threat-model markdown report
  ReportGen-->>ThreatModelSkill: report with sections, tables, findings
  ThreatModelSkill->>ThreatModelSkill: append to MITRE findings tracker
  ThreatModelSkill-->>User: threat-model report + updated tracker
```

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes


openshift-ci · 2026-06-11T18:41:45Z

@dhensel-rh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|-----------|--------|---------|----------|---------------|
| ci/prow/images | `d5874c4` | link | true | `/test images` |
| ci/prow/markdownlint | `d5874c4` | link | true | `/test markdownlint` |
| ci/prow/skills-lint | `d5874c4` | link | true | `/test skills-lint` |


coderabbitai

Actionable comments posted: 11

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/threat-model/README.md`:
- Around line 57-78: Update the workspace layout example in README.md to reflect
the real .claude/skills structure: replace the suggested
.claude/skills/threat-model/ subdirectory with four top-level entries (symlinks)
directly under .claude/skills/ (including "threat-model" as a top-level symlink
and the three mitre-findings-*.md files), and adjust the tree diagram and
filenames (e.g., "threat-model", "mitre-findings-tnf.md",
"mitre-findings-tna.md", "mitre-findings-sno.md", "mitre-findings-lvms.md") so
users are pointed to the correct path.

In `@plugins/threat-model/references/owasp-reference.md`:
- Around line 24-96: The ATT&CK technique IDs in the OWASP crosswalk are
incorrect and must be corrected: review each mapping row (e.g., the entries
labeled "Path traversal", "SSRF", "Insecure Deserialization", "Logging Sensitive
Data", "Master schedulable" and any other mismatched rows in the "Pattern to
OWASP Mapping" and the TNF/TNA/SNO tables) against MITRE ATT&CK and replace the
wrong T-codes with the proper techniques; ensure the MITRE column for the rows
referenced in the diff (those exact labels) uses the canonical attack IDs from
attack.mitre.org and update any dependent PE-* or DFD notes if they rely on the
old technique semantics so downstream TNF/TNA/SNO consumers get consistent
mappings.

In `@plugins/threat-model/references/report-templates.md`:
- Around line 8-12: Filename patterns PR<number>-THREAT-MODEL-<repo>.md and
VULN-PR<number>-<short-desc>.md are not unique across topologies/skills; update
the naming templates to include topology and skill identifiers so reports can't
overwrite each other. Change the templates to something like
PR<number>-THREAT-MODEL-<repo>-<topology>-<skill>.md and
VULN-PR<number>-<short-desc>-<topology>-<skill>.md (or equivalent tokens used by
your generator), and update any code that emits these filenames to populate the
<topology> and <skill> tokens from the current run context (e.g., the same
variables used to tag runs).

In `@plugins/threat-model/skills/tna/SKILL.md`:
- Line 208: The SC2164 entry in SKILL.md contains a duplicated word "without" in
its description; edit the SC2164 line (the table cell containing "cd without
without error-exit guard - path traversal risk") to remove the extra "without"
so it reads "cd without error-exit guard - path traversal risk" (update the
SC2164 description text accordingly).

In `@plugins/threat-model/skills/tnf/SKILL.md`:
- Around line 175-177: The workflow text in SKILL.md promises OWASP Top 10
mapping but only lists a MITRE ATT&CK mapping step; either add an OWASP mapping
step or remove the claim. Fix by updating the SKILL.md steps to include a new
step (e.g., "Map findings to OWASP Top 10 (see `owasp-reference.md`)") and a
corresponding report/output line (e.g., "Generate OWASP mapping report at
`$REPORT_DIR/owasp-report.md`" and ensure the Append Protocol step mentions
including OWASP mapping in the findings block written to `$FINDINGS_FILE`), or
alternatively remove references to `owasp-reference.md` in the metadata so the
skill no longer advertises OWASP mapping.

In `@plugins/two-node/evals/cluster-diagnostic.md`:
- Around line 39-42: The "Eval scope" paragraph is stale: it states only
`validate` and `recovery-guide` are testable while the project config includes
`game` (and diagnostic) cases; update the text under the "Eval scope" heading to
accurately reflect current behavior by noting that `game` mode and `diagnose`
are present in the config but require special setup (e.g., live SSH for
`diagnose` and tool-interception or interactive AskUserQuestion handling for
`game`) and may be only partially or conditionally testable rather than entirely
out of scope; ensure you mention the mode names (`validate`, `recovery-guide`,
`diagnose`, `game`) so readers can map the wording to the config.

In `@plugins/two-node/evals/cluster-diagnostic.yaml`:
- Around line 168-201: The forbidden_recommendations check currently only looks
for "pcs node standby" and "shutdown -h" but must also detect variants
referenced in the policy; update the logic that builds recommend_sections (and
the subsequent checks over sec_lower) to also flag "pcs standby" and any
"standby" variants (e.g., "pcs standby", "standby node") and to detect
sequential-shutdown phrasing such as "sequential shutdown",
"sequential-shutdown", "shutdown nodes in sequence", or "one-by-one shutdown"
(and similar "in sequence"/"one at a time" patterns); also broaden the shutdown
match to catch forms like "shutdown -h 1", "shutdown -h now", and numbered
arguments. Keep using the same variables (conversation, ann, recommend_sections,
sec_lower, forbidden) and append appropriate forbidden messages like "pcs
standby recommended" and "sequential shutdown recommended" when these patterns
are found.
- Around line 99-115: The BLOCKER detection is too brittle: replace the
substring checks around conv_upper/has_blocker with proper word-boundary and
negation-aware matching (e.g., use a case-insensitive regex like \bBLOCKER\b and
also detect phrases like "no blocker" to avoid false positives) and change the
expected_blockers logic so all expected blockers must be present (not just any
one) — iterate expected_blockers and confirm each is found (case-insensitive,
word-boundary aware) in conversation, building a missing list and returning
failure if any expected blocker is missing; update references to conv_upper,
has_blocker, should_reject, expected_blockers, found_blockers and conversation
accordingly.
- Around line 146-167: The recovery-guide completeness check ignores the
schema's annotations.expected_scenario, so add a scenario validation: read
expected = annotations.get("expected_scenario") and actual =
outputs.get("scenario") (or derive a scenario tag from the conversation if
outputs lacks it), then add a new check like "scenario_matches": (not expected)
or (expected == actual) to the checks dict used in the procedure_completeness
logic; include that key in passed/total computation and report it in the failed
list so recovery-guide runs fail when the expected_scenario does not match the
actual scenario.

In `@plugins/two-node/evals/README.md`:
- Around line 39-75: The README.md has fenced code blocks without language
identifiers (the directory tree block and the command examples like
/eval-analyze, /eval-dataset, /eval-run, /eval-review, /eval-optimize), causing
markdownlint MD040 failures; fix by adding appropriate language tags: mark the
tree block as ```text and each command block as ```bash in
plugins/two-node/evals/README.md so the blocks become ```text for the evals/
tree and ```bash for the /eval-analyze, /eval-dataset, /eval-run, /eval-review,
and /eval-optimize command snippets.

In `@plugins/two-node/evals/threat-model-tnf.yaml`:
- Around line 149-152: The regex r'\b[XxNn/Aa~-]\b' incorrectly splits
multi-char markers like "N/A" and miscaptures standalone "~" and "-"—replace the
pattern used in the re.findall call to match whole markers and use
case-insensitive matching: change the pattern to r'\b(?:X|N/A|~|-)\b' and call
re.findall with flags=re.IGNORECASE (i.e., re.findall(r'\b(?:X|N/A|~|-)\b',
stride_section[1][:2000], flags=re.IGNORECASE)) so markers, including "N/A" and
"~"/"-", are captured as single tokens (referencing the markers variable, the
re.findall call, and stride_section).
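One caveat on the suggested STRIDE pattern: `\b` only matches between a word character and a non-word character, so a standalone `~` or `-` surrounded by spaces or pipes never satisfies `\b...\b`, and `r'\b(?:X|N/A|~|-)\b'` would still miss those markers. A minimal Python sketch of an alternative is to split table rows into cells and compare each cell against a marker set. The function name `stride_markers` and the marker set are assumptions for illustration; the actual judge code is not shown in this thread.

```python
import re

# Marker set assumed from the config summary (X/~/-/N/A); U+2212 is a typographic minus.
STRIDE_MARKERS = {"x", "n/a", "~", "-", "\u2212"}
# Markdown header separator rows such as |---|:---:|---|.
SEPARATOR_ROW = re.compile(r"^\s*\|?\s*:?-{3,}:?\s*(?:\|\s*:?-{3,}:?\s*)*\|?\s*$")


def stride_markers(section: str) -> list[str]:
    """Collect marker cells from markdown table rows, skipping separator rows."""
    found = []
    for line in section.splitlines():
        if "|" not in line or SEPARATOR_ROW.match(line):
            continue
        for cell in line.strip().strip("|").split("|"):
            token = cell.strip().strip("*`")
            if token.lower() in STRIDE_MARKERS:
                found.append(token)
    return found
```

Passing the same slice the judge already uses, for example `stride_markers(stride_section[1][:2000])`, would keep the existing 2000-character window while counting `N/A`, `~`, and `-` as single tokens.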


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 970f557a-09b9-4426-a24a-dc4bbc370f8f

📥 Commits

Reviewing files that changed from the base of the PR and between 63cb478 and d5874c4.

📒 Files selected for processing (46)

.claude/skills/threat-model-lvms
.claude/skills/threat-model-sno
.claude/skills/threat-model-tna
.claude/skills/threat-model-tnf
plugins/threat-model/.claude-plugin/plugin.json
plugins/threat-model/README.md
plugins/threat-model/references/mitre-findings-template.md
plugins/threat-model/references/mitre-reference.md
plugins/threat-model/references/owasp-reference.md
plugins/threat-model/references/report-templates.md
plugins/threat-model/skills/lvms/SKILL.md
plugins/threat-model/skills/lvms/dfd-elements-lvms.md
plugins/threat-model/skills/sno/SKILL.md
plugins/threat-model/skills/sno/dfd-elements-sno.md
plugins/threat-model/skills/tna/SKILL.md
plugins/threat-model/skills/tna/dfd-elements-tna.md
plugins/threat-model/skills/tnf/SKILL.md
plugins/threat-model/skills/tnf/dfd-elements-tnf.md
plugins/two-node/evals/README.md
plugins/two-node/evals/cluster-diagnostic.md
plugins/two-node/evals/cluster-diagnostic.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-001-validate-sequential-shutdown/annotations.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-001-validate-sequential-shutdown/input.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-002-validate-safe-redfish/annotations.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-002-validate-safe-redfish/input.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-003-recovery-full-shutdown/annotations.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-003-recovery-full-shutdown/input.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-004-recovery-standby/annotations.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-004-recovery-standby/input.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-005-validate-pcs-standby/annotations.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-005-validate-pcs-standby/input.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-006-game-quiz/annotations.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-006-game-quiz/answers.yaml
plugins/two-node/evals/cluster-diagnostic/cases/case-006-game-quiz/input.yaml
plugins/two-node/evals/threat-model-tnf.md
plugins/two-node/evals/threat-model-tnf.yaml
plugins/two-node/evals/threat-model-tnf/cases/case-001-shell-script-k8s-api/annotations.yaml
plugins/two-node/evals/threat-model-tnf/cases/case-001-shell-script-k8s-api/input.yaml
plugins/two-node/evals/threat-model-tnf/cases/case-002-credential-rotation-script/annotations.yaml
plugins/two-node/evals/threat-model-tnf/cases/case-002-credential-rotation-script/input.yaml
plugins/two-node/evals/threat-model-tnf/cases/case-003-mac-fencing-lookup/annotations.yaml
plugins/two-node/evals/threat-model-tnf/cases/case-003-mac-fencing-lookup/input.yaml
plugins/two-node/evals/threat-model-tnf/cases/case-004-trivial-indentation-fix/annotations.yaml
plugins/two-node/evals/threat-model-tnf/cases/case-004-trivial-indentation-fix/input.yaml
plugins/two-node/evals/threat-model-tnf/cases/case-005-tnf-retry-bugfix/annotations.yaml
plugins/two-node/evals/threat-model-tnf/cases/case-005-tnf-retry-bugfix/input.yaml

coderabbitai · 2026-06-11T18:54:25Z

+### Recommended workspace layout
+
+```text
+your-workspace/
+├── repos/
+│   ├── cluster-etcd-operator/
+│   ├── installer/
+│   ├── machine-config-operator/
+│   ├── resource-agents/
+│   ├── two-node-toolbox/
+│   │   └── docs/
+│   │       ├── TNF-THREAT-MODEL.md
+│   │       └── TNA-THREAT-MODEL.md
+│   └── ...
+└── .claude/
+    └── skills/
+        ├── threat-model/
+        ├── mitre-findings-tnf.md  # Created automatically on first use
+        ├── mitre-findings-tna.md
+        ├── mitre-findings-sno.md
+        └── mitre-findings-lvms.md
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Show the actual .claude/skills layout here.

The diagram currently suggests a .claude/skills/threat-model/ subdirectory, but the plugin is installed as four top-level symlinks directly under .claude/skills/. As written, this example points users at the wrong path.

🔧 Suggested fix

```diff
 └── .claude/
     └── skills/
-        ├── threat-model/
+        ├── threat-model-tnf
+        ├── threat-model-tna
+        ├── threat-model-sno
+        ├── threat-model-lvms
         ├── mitre-findings-tnf.md  # Created automatically on first use
         ├── mitre-findings-tna.md
         ├── mitre-findings-sno.md
         └── mitre-findings-lvms.md
```


coderabbitai · 2026-06-11T18:54:25Z

+## Pattern to OWASP Mapping
+
+| Security Pattern | OWASP | MITRE | CWE |
+|-----------------|-------|-------|-----|
+| **Command Injection** | A05 | T1059 | CWE-78 |
+| Shell exec with unsanitized input | A05 | T1059 | CWE-78 |
+| fmt.Sprintf() building shell commands | A05 | T1059 | CWE-78 |
+| **Hardcoded Credentials** | A07 | T1552 | CWE-798 |
+| Passwords in source code | A07 | T1552 | CWE-798 |
+| API keys in config files | A07 | T1552 | CWE-798 |
+| **Broken Access Control** | A01 | T1078 | CWE-284 |
+| Missing authorization checks | A01 | T1078 | CWE-285 |
+| Path traversal | A01 | T1083 | CWE-22 |
+| SSRF | A01 | T1046 | CWE-918 |
+| **Cryptographic Failures** | A04 | T1573 | CWE-327 |
+| Weak algorithms (MD5, SHA1) | A04 | T1573 | CWE-328 |
+| Disabled TLS verification | A04 | T1557 | CWE-295 |
+| InsecureSkipVerify = true | A04 | T1557 | CWE-295 |
+| **Security Misconfiguration** | A02 | T1562 | CWE-16 |
+| Debug mode in production | A02 | T1562 | CWE-489 |
+| Privileged containers | A02 | T1611 | CWE-250 |
+| **Insecure Deserialization** | A08 | T1059 | CWE-502 |
+| pickle.loads(), yaml.load() | A08 | T1059 | CWE-502 |
+| **Logging Sensitive Data** | A09 | T1005 | CWE-532 |
+| Credentials in logs | A09 | T1005 | CWE-532 |
+| **Missing Error Handling** | A10 | - | CWE-754 |
+| Unchecked error returns | A10 | - | CWE-252 |
+| Fail-open logic | A10 | T1562 | CWE-636 |
+
+---
+
+## TNF-Specific OWASP Mappings
+
+| TNF Component | Risk | OWASP | MITRE | CWE | DFD Elements | PE-* IDs |
+|---------------|------|-------|-------|-----|--------------|----------|
+| BMC credentials in install-config | Hardcoded secrets | A07 | T1552 | CWE-798 | P1, DS1, DF1, DF2 | PE-P1-I-1, PE-DS1-I-1 |
+| BMC password in shell command | Command injection | A05 | T1059 | CWE-78 | P5, DF9 | PE-P5-T-1, PE-P5-I-1 |
+| Credentials in CIB XML | Plaintext storage | A04 | T1552 | CWE-312 | DS3, DF7 | PE-DS3-I-1, PE-DF7-I-1 |
+| InsecureSkipVerify on BMC | Crypto failure | A04 | T1557 | CWE-295 | P8, DF10 | PE-P8-S-1, PE-DF10-T-1 |
+| Privileged TNF setup pods | Misconfiguration | A02 | T1611 | CWE-250 | P3, P4, P5 | PE-P4-E-1, PE-P5-E-1 |
+| fencing-credentials Secret | Access control | A01 | T1552 | CWE-284 | DS2, DF4 | PE-DS2-I-1, PE-DS2-T-1 |
+| Corosync unencrypted | Crypto failure | A04 | T1557 | CWE-319 | EE3, DF12 | PE-EE3-S-1 |
+| PCS token generation | Auth weakness | A07 | T1078 | CWE-330 | P3, DS4, DF5 | PE-P3-S-1, PE-DS4-I-1 |
+| Credentials in CLI args | Info exposure | A07 | T1552 | CWE-214 | P6, P8, DF9 | PE-DF9-I-1, PE-P8-I-1 |
+| No fencing audit trail | Logging failure | A09 | - | CWE-778 | P5, P6 | PE-P5-R-1, PE-P1-R-1 |
+
+---
+
+## TNA-Specific OWASP Mappings
+
+| TNA Component | Risk | OWASP | MITRE | CWE | DFD Elements | PE-* IDs |
+|---------------|------|-------|-------|-----|--------------|----------|
+| Arbiter taint as sole scheduling protection | Misconfiguration | A02 | T1562 | CWE-250 | TNA-P3 | PE-TNA-P3-T-1 |
+| Worker ignition token | Credential exposure | A07 | T1552 | CWE-798 | TNA-DS6 | PE-TNA-DS6-I-1 |
+| Worker lateral movement to control plane | Access control | A01 | T1021 | CWE-284 | TNA-P5, TNA-DS5 | PE-TNA-P5-E-1 |
+| etcd data on compromised node | Crypto failure | A04 | T1552 | CWE-312 | TNA-DS5 | PE-TNA-DS5-I-1 |
+| Rogue worker CSR approval | Auth failure | A07 | T1078 | CWE-287 | TNA-P5, TNA-DS6 | PE-TNA-P5-S-1 |
+| No arbiter taint drift alert | Logging failure | A09 | - | CWE-778 | TNA-P3 | PE-TNA-P3-T-1 |
+
+---
+
+## SNO-Specific OWASP Mappings
+
+| SNO Component | Risk | OWASP | MITRE | CWE | DFD Elements | PE-* IDs |
+|---------------|------|-------|-------|-----|--------------|----------|
+| install-config with pull secret + offline token | Credential exposure | A07 | T1552 | CWE-798 | SNO-DS1 | PE-SNO-DS1-I-1 |
+| Single-member etcd (no quorum) | Data loss / total compromise | A06 | T1485 | CWE-312 | SNO-DS3 | PE-SNO-DS3-I-1, PE-SNO-DS3-D-1 |
+| UnsafeScalingStrategy bypasses quorum checks | Insecure design | A06 | T1562 | CWE-636 | SNO-P4 | PE-SNO-P4-D-1 |
+| Bootstrap-in-place runs privileged on bare metal | Misconfiguration | A02 | T1611 | CWE-250 | SNO-P5 | PE-SNO-P5-E-1 |
+| Master schedulable (workloads on control plane) | Access control | A01 | T1610 | CWE-284 | SNO-P6 | PE-SNO-P6-E-1 |
+| Kubeconfig + kubeadmin-password on admin workstation | Credential exposure | A07 | T1552 | CWE-522 | SNO-DS4 | PE-SNO-DS4-I-1 |
+| Discovery ISO integrity | Supply chain | A03 | T1195 | CWE-494 | SNO-DS2 | PE-SNO-DS2-T-1 |
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Correct the ATT&CK crosswalk before using this as the shared reference.

Several rows pair the stated risk with unrelated ATT&CK techniques, e.g. Path traversal → T1083, SSRF → T1046, Insecure deserialization → T1059, Logging sensitive data → T1005, and Master schedulable → T1610. MITRE defines those IDs as File and Directory Discovery, Network Service Discovery, Command and Scripting Interpreter, Data from Local System, and Deploy Container, respectively. (attack.mitre.org)

Because the TNF/TNA/SNO skills consume this file as their shared OWASP mapping, these mistakes will flow straight into reports.
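To make a review of the crosswalk less error-prone, the rows can be printed next to the official technique names. The Python sketch below is only an aid for manual review: it uses the five ID-to-name pairs quoted in this comment, it does not choose replacement techniques, and `annotate_crosswalk` is a hypothetical helper name.

```python
import re

# Technique names exactly as quoted in the review comment above.
TECHNIQUE_NAMES = {
    "T1083": "File and Directory Discovery",
    "T1046": "Network Service Discovery",
    "T1059": "Command and Scripting Interpreter",
    "T1005": "Data from Local System",
    "T1610": "Deploy Container",
}
TECHNIQUE_ID = re.compile(r"^T\d{4}(?:\.\d{3})?$")


def annotate_crosswalk(markdown: str) -> list[tuple[str, str, str]]:
    """Return (first cell, technique ID, technique name) for rows with a known ID."""
    rows = []
    for line in markdown.splitlines():
        # Tolerate a leading '+' from diff hunks and bold markers around cells.
        cells = [c.strip().strip("*") for c in line.lstrip("+").strip().strip("|").split("|")]
        ids = [c for c in cells if TECHNIQUE_ID.match(c)]
        if ids and ids[0] in TECHNIQUE_NAMES:
            rows.append((cells[0], ids[0], TECHNIQUE_NAMES[ids[0]]))
    return rows
```

Reading the output row by row, a pairing such as "SSRF" with "Network Service Discovery" stands out immediately, which is the kind of mismatch this comment describes.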


coderabbitai · 2026-06-11T18:54:25Z

+## Report Naming Convention
+
+- **Full threat model**: `PR<number>-THREAT-MODEL-<repo>.md`
+- **Individual vuln**: `VULN-PR<number>-<short-desc>.md`
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include the topology/skill in the report filenames.

PR<number>-THREAT-MODEL-<repo>.md and VULN-PR<number>-<short-desc>.md are not unique across TNF/TNA/SNO/LVMS. In the shared eval workspace, two runs for the same PR/repo can overwrite each other’s reports.

Proposed fix

```diff
-- **Full threat model**: `PR<number>-THREAT-MODEL-<repo>.md`
-- **Individual vuln**: `VULN-PR<number>-<short-desc>.md`
+- **Full threat model**: `PR<number>-THREAT-MODEL-<topology>-<repo>.md`
+- **Individual vuln**: `VULN-PR<number>-<topology>-<short-desc>.md`
```
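As a rough illustration of the proposed naming scheme, the Python sketch below builds both filenames with the topology token included. The helper names and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) are assumptions; the templates only specify the token order, and the skills themselves generate names from `SKILL.md` instructions rather than from code like this.

```python
import re


def slug(value: str) -> str:
    """Lowercase the value and collapse anything outside [a-z0-9] into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def threat_model_report_name(pr_number: int, topology: str, repo: str) -> str:
    """PR<number>-THREAT-MODEL-<topology>-<repo>.md"""
    return f"PR{pr_number}-THREAT-MODEL-{slug(topology)}-{slug(repo)}.md"


def vuln_report_name(pr_number: int, topology: str, short_desc: str) -> str:
    """VULN-PR<number>-<topology>-<short-desc>.md"""
    return f"VULN-PR{pr_number}-{slug(topology)}-{slug(short_desc)}.md"
```

With the topology in the name, a TNF run and a TNA run against the same PR and repository produce distinct files, so neither overwrites the other in a shared reports directory.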

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

## Report Naming Convention

- **Full threat model**: `PR<number>-THREAT-MODEL-<repo>.md`

- **Individual vuln**: `VULN-PR<number>-<short-desc>.md`

## Report Naming Convention

- **Full threat model**: `PR<number>-THREAT-MODEL-<topology>-<repo>.md`

- **Individual vuln**: `VULN-PR<number>-<topology>-<short-desc>.md`

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/threat-model/references/report-templates.md` around lines 8 - 12, Filename patterns PR<number>-THREAT-MODEL-<repo>.md and VULN-PR<number>-<short-desc>.md are not unique across topologies/skills; update the naming templates to include topology and skill identifiers so reports can't overwrite each other. Change the templates to something like PR<number>-THREAT-MODEL-<repo>-<topology>-<skill>.md and VULN-PR<number>-<short-desc>-<topology>-<skill>.md (or equivalent tokens used by your generator), and update any code that emits these filenames to populate the <topology> and <skill> tokens from the current run context (e.g., the same variables used to tag runs).
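
A minimal sketch of how topology-qualified report names could be generated, following the `<topology>-<repo>` ordering from the proposed fix. The helper names (`threat_model_filename`, `vuln_filename`, `_slug`) are hypothetical and not part of the plugin; how the real generator obtains the topology is an assumption.

```python
import re


def _slug(text: str) -> str:
    """Lowercase the text and collapse runs of other characters into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def threat_model_filename(pr: int, topology: str, repo: str) -> str:
    """Build a full threat-model report name that is unique per topology."""
    return f"PR{pr}-THREAT-MODEL-{_slug(topology).upper()}-{repo}.md"


def vuln_filename(pr: int, topology: str, short_desc: str) -> str:
    """Build an individual vulnerability report name that is unique per topology."""
    return f"VULN-PR{pr}-{_slug(topology).upper()}-{_slug(short_desc)}.md"


if __name__ == "__main__":
    # Two topologies analysing the same PR and repo no longer collide.
    print(threat_model_filename(189, "tnf", "edge-tooling"))
    print(threat_model_filename(189, "sno", "edge-tooling"))
    print(vuln_filename(189, "tna", "Unquoted heredoc"))
```

With names built this way, two runs for the same PR and repository overwrite each other only when they also share a topology.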

coderabbitai · 2026-06-11T18:54:26Z

+| SC2029 | Warning | ssh command with unescaped variables | T1059 |
+| SC2087 | Warning | Unquoted heredoc - variable expansion | T1059 |
+| SC2155 | Warning | Declare/assign separately to avoid masking errors | - |
+| SC2164 | Warning | cd without without error-exit guard - path traversal risk | T1083 |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the duplicated "without" in the ShellCheck note.

This is a small but visible docs typo in the SC2164 description.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/threat-model/skills/tna/SKILL.md` at line 208, The SC2164 entry in SKILL.md contains a duplicated word "without" in its description; edit the SC2164 line (the table cell containing "cd without without error-exit guard - path traversal risk") to remove the extra "without" so it reads "cd without error-exit guard - path traversal risk" (update the SC2164 description text accordingly).

Source: Linters/SAST tools

coderabbitai · 2026-06-11T18:54:26Z

+10. **Map findings to MITRE ATT&CK** techniques (see `$PLUGIN_DIR/references/mitre-reference.md`)
+11. **Generate report** at `$REPORT_DIR/`
+12. **Append findings to tracker** — follow the Append Protocol to write a findings block to `$FINDINGS_FILE`


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add the promised OWASP mapping step.

The workflow advertises OWASP Top 10 mapping and lists owasp-reference.md, but the instructions only ever map findings to MITRE ATT&CK. As written, the OWASP part of the report will never be produced. Add an OWASP-specific step/report section or remove the OWASP claim from the skill metadata.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/threat-model/skills/tnf/SKILL.md` around lines 175 - 177, The workflow text in SKILL.md promises OWASP Top 10 mapping but only lists a MITRE ATT&CK mapping step; either add an OWASP mapping step or remove the claim. Fix by updating the SKILL.md steps to include a new step (e.g., "Map findings to OWASP Top 10 (see `owasp-reference.md`)") and a corresponding report/output line (e.g., "Generate OWASP mapping report at `$REPORT_DIR/owasp-report.md`" and ensure the Append Protocol step mentions including OWASP mapping in the findings block written to `$FINDINGS_FILE`), or alternatively remove references to `owasp-reference.md` in the metadata so the skill no longer advertises OWASP mapping.

coderabbitai · 2026-06-11T18:54:26Z

+      conv_upper = conversation.upper()
+      has_blocker = "BLOCKER" in conv_upper
+
+      if should_reject and not has_blocker:
+          return (False, "Procedure should have been rejected with BLOCKER but no BLOCKER found")
+
+      if not should_reject and has_blocker:
+          return (False, "Procedure should NOT have BLOCKER findings but BLOCKER was found")
+
+      found_blockers = []
+      for b in expected_blockers:
+          if b.lower() in conversation.lower():
+              found_blockers.append(b)
+
+      if expected_blockers and not found_blockers:
+          return (False, f"Expected blockers {expected_blockers} not found in output")
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

BLOCKER detection logic can mis-score valid/invalid outputs.

Using "BLOCKER" in conversation causes false positives (e.g., “No BLOCKER findings”), and expected blockers pass even if only one of multiple expected blockers is mentioned.

💡 Suggested patch

-      conv_upper = conversation.upper()
-      has_blocker = "BLOCKER" in conv_upper
+      conv_lower = conversation.lower()
+      blocker_lines = [
+          ln for ln in conv_lower.splitlines()
+          if "blocker" in ln and "no blocker" not in ln
+      ]
+      has_blocker = len(blocker_lines) > 0
@@
-      found_blockers = []
-      for b in expected_blockers:
-          if b.lower() in conversation.lower():
-              found_blockers.append(b)
-
-      if expected_blockers and not found_blockers:
-          return (False, f"Expected blockers {expected_blockers} not found in output")
+      found_blockers = [b for b in expected_blockers if b.lower() in conv_lower]
+      missing_blockers = [b for b in expected_blockers if b.lower() not in conv_lower]
+      if missing_blockers:
+          return (False, f"Expected blockers missing: {missing_blockers}. Found: {found_blockers}")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/two-node/evals/cluster-diagnostic.yaml` around lines 99 - 115, The BLOCKER detection is too brittle: replace the substring checks around conv_upper/has_blocker with proper word-boundary and negation-aware matching (e.g., use a case-insensitive regex like \bBLOCKER\b and also detect phrases like "no blocker" to avoid false positives) and change the expected_blockers logic so all expected blockers must be present (not just any one) — iterate expected_blockers and confirm each is found (case-insensitive, word-boundary aware) in conversation, building a missing list and returning failure if any expected blocker is missing; update references to conv_upper, has_blocker, should_reject, expected_blockers, found_blockers and conversation accordingly.
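
The word-boundary and negation-aware matching described above could look roughly like the following sketch. It is a standalone illustration, not the harness's judge: the function names and the list of negation words are assumptions, and a line that both negates one blocker and reports another would still be treated as negated.

```python
import re

# Whole-word BLOCKER / BLOCKERS, case-insensitive.
_BLOCKER = re.compile(r"\bblockers?\b", re.IGNORECASE)
# Phrases that negate a blocker on the same line (assumed list, extend as needed).
_NEGATED = re.compile(r"\b(?:no|zero|0|without)\s+blockers?\b", re.IGNORECASE)


def has_blocker_finding(conversation: str) -> bool:
    """True if any line mentions BLOCKER without negating it."""
    for line in conversation.splitlines():
        if _BLOCKER.search(line) and not _NEGATED.search(line):
            return True
    return False


def missing_blockers(conversation: str, expected_blockers: list) -> list:
    """Return every expected blocker that the conversation never mentions."""
    text = conversation.lower()
    return [b for b in expected_blockers if b.lower() not in text]


def judge(conversation: str, should_reject: bool, expected_blockers: list):
    """Combine both checks; all expected blockers must be present."""
    found = has_blocker_finding(conversation)
    if should_reject and not found:
        return (False, "Procedure should have been rejected with BLOCKER but no BLOCKER found")
    if not should_reject and found:
        return (False, "Procedure should NOT have BLOCKER findings but BLOCKER was found")
    missing = missing_blockers(conversation, expected_blockers)
    if missing:
        return (False, f"Expected blockers missing: {missing}")
    return (True, "BLOCKER classification matches expectations")
```

Under this sketch, a response that says only "No BLOCKER findings" is scored as having no blocker, and a response that names one of two expected blockers fails instead of passing.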

coderabbitai · 2026-06-11T18:54:26Z

+    if: "annotations.get('mode') == 'recovery-guide'"
+    check: |
+      conversation = outputs.get("conversation", "")
+
+      if not conversation:
+          return (False, "No conversation output found")
+
+      checks = {
+          "bash_commands": any(marker in conversation for marker in ["```bash", "```sh", "curl ", "pcs ", "oc "]),
+          "has_verification": any(w in conversation.lower() for w in ["verify", "confirm", "check", "poll", "wait"]),
+          "has_parameters": any(p in conversation for p in ["$BMC", "$NODE", "BMC_USER", "BMC_PASS", "BMC_HOST"]),
+      }
+
+      passed = sum(checks.values())
+      total = len(checks)
+      failed = [k for k, v in checks.items() if not v]
+
+      if passed == total:
+          return (True, f"All {total} completeness checks passed")
+      else:
+          return (False, f"{passed}/{total} checks passed. Missing: {failed}")
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

expected_scenario is defined in schema but never enforced in recovery checks.

Recovery-guide cases can pass completeness while returning the wrong scenario flow, because procedure_completeness doesn’t validate annotations.expected_scenario.

💡 Suggested patch

     check: |
       conversation = outputs.get("conversation", "")
+      ann = outputs.get("annotations", {})
+      expected_scenario = ann.get("expected_scenario")
@@
       if not conversation:
           return (False, "No conversation output found")
+
+      if expected_scenario and expected_scenario.lower() not in conversation.lower():
+          return (False, f"Expected scenario '{expected_scenario}' not referenced in response")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/two-node/evals/cluster-diagnostic.yaml` around lines 146 - 167, The recovery-guide completeness check ignores the schema's annotations.expected_scenario, so add a scenario validation: read expected = annotations.get("expected_scenario") and actual = outputs.get("scenario") (or derive a scenario tag from the conversation if outputs lacks it), then add a new check like "scenario_matches": (not expected) or (expected == actual) to the checks dict used in the procedure_completeness logic; include that key in passed/total computation and report it in the failed list so recovery-guide runs fail when the expected_scenario does not match the actual scenario.
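
One way the scenario check could be folded into the existing checks dict is sketched below. It assumes the scenario can be recognised by its name appearing in the response; the scenario name in the usage example and the helper names are hypothetical, and whether the harness exposes a separate `scenario` output is not confirmed by the diff.

```python
import re
from typing import Dict, Optional, Tuple


def _normalize(text: str) -> str:
    """Lowercase and treat hyphens, underscores, and whitespace runs as one space."""
    return re.sub(r"[\s_-]+", " ", text.lower()).strip()


def scenario_matches(conversation: str, expected_scenario: Optional[str]) -> bool:
    """True when no scenario is expected or its name appears in the response."""
    if not expected_scenario:
        return True
    return _normalize(expected_scenario) in _normalize(conversation)


def score(checks: Dict[str, bool]) -> Tuple[bool, str]:
    """Same pass/fail reporting shape as the procedure_completeness judge."""
    passed = sum(checks.values())
    total = len(checks)
    failed = [name for name, ok in checks.items() if not ok]
    if passed == total:
        return (True, f"All {total} completeness checks passed")
    return (False, f"{passed}/{total} checks passed. Missing: {failed}")


if __name__ == "__main__":
    conversation = "Restart etcd, then verify quorum with oc get nodes."
    checks = {
        "has_verification": "verify" in conversation.lower(),
        "scenario_matches": scenario_matches(conversation, "node-replacement"),
    }
    print(score(checks))
```

Because `scenario_matches` is just another key in the dict, it is counted in `passed`/`total` and shows up in the `Missing` list when the wrong flow is returned.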

coderabbitai · 2026-06-11T18:54:26Z

+  - name: forbidden_recommendations
+    description: |
+      Checks that the skill never recommends procedures that violate known
+      failure modes: pcs node standby, sequential shutdown, shutdown -h.
+    check: |
+      conversation = outputs.get("conversation", "")
+      ann = outputs.get("annotations", {})
+      mode = ann.get("mode", "")
+
+      if not conversation:
+          return (False, "No conversation output found")
+
+      # Only check recommendations sections, not quoted failure descriptions
+      forbidden = []
+
+      # Check if skill RECOMMENDS (not just mentions) dangerous procedures
+      conv_lower = conversation.lower()
+
+      recommend_sections = []
+      for marker in ["recommend", "suggested", "recovery", "procedure", "steps to"]:
+          idx = conv_lower.find(marker)
+          if idx >= 0:
+              recommend_sections.append(conversation[idx:idx+500])
+
+      for section in recommend_sections:
+          sec_lower = section.lower()
+          if "pcs node standby" in sec_lower and "never" not in sec_lower and "do not" not in sec_lower:
+              forbidden.append("pcs node standby recommended")
+          if "shutdown -h" in sec_lower and "never" not in sec_lower and "do not" not in sec_lower:
+              forbidden.append("shutdown -h 1 recommended")
+
+      if forbidden:
+          return (False, f"Forbidden recommendations found: {forbidden}")
+      return (True, "No forbidden procedures recommended")


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Forbidden-procedure guard is incomplete against stated policy.

The description forbids sequential shutdown and standby recommendations, but the implementation only checks pcs node standby and shutdown -h; it misses sequential-shutdown phrasing and standby variants like pcs standby.

💡 Suggested patch

       for section in recommend_sections:
           sec_lower = section.lower()
-          if "pcs node standby" in sec_lower and "never" not in sec_lower and "do not" not in sec_lower:
+          if ("pcs node standby" in sec_lower or "pcs standby" in sec_lower) and "never" not in sec_lower and "do not" not in sec_lower:
               forbidden.append("pcs node standby recommended")
+          if "sequential shutdown" in sec_lower and "never" not in sec_lower and "do not" not in sec_lower:
+              forbidden.append("sequential shutdown recommended")
           if "shutdown -h" in sec_lower and "never" not in sec_lower and "do not" not in sec_lower:
               forbidden.append("shutdown -h 1 recommended")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/two-node/evals/cluster-diagnostic.yaml` around lines 168 - 201, The forbidden_recommendations check currently only looks for "pcs node standby" and "shutdown -h" but must also detect variants referenced in the policy; update the logic that builds recommend_sections (and the subsequent checks over sec_lower) to also flag "pcs standby" and any "standby" variants (e.g., "pcs standby", "standby node") and to detect sequential-shutdown phrasing such as "sequential shutdown", "sequential-shutdown", "shutdown nodes in sequence", or "one-by-one shutdown" (and similar "in sequence"/"one at a time" patterns); also broaden the shutdown match to catch forms like "shutdown -h 1", "shutdown -h now", and numbered arguments. Keep using the same variables (conversation, ann, recommend_sections, sec_lower, forbidden) and append appropriate forbidden messages like "pcs standby recommended" and "sequential shutdown recommended" when these patterns are found.
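
The broader pattern matching requested above might be sketched with a small table of regular expressions, as below. The phrase list and negation words are assumptions chosen for illustration, and negation is still applied to the whole section (as in the original judge), so a section that forbids one procedure while recommending another would be missed.

```python
import re

# Label -> pattern for each forbidden procedure (patterns run on lowercased text).
FORBIDDEN = {
    "pcs standby recommended": re.compile(r"\bpcs\s+(?:node\s+)?standby\b"),
    "sequential shutdown recommended": re.compile(
        r"\bsequential[\s-]+shutdown\b"
        r"|\bshut(?:ting)?\s*down\b.*\b(?:in sequence|one at a time|one[\s-]by[\s-]one)\b"
    ),
    # Matches "shutdown -h", "shutdown -h 1", "shutdown -h now", ...
    "shutdown -h recommended": re.compile(r"\bshutdown\s+-h\b"),
}

# Words that turn a mention into a warning rather than a recommendation.
NEGATION = re.compile(r"\b(?:never|do not|don't|avoid)\b")


def forbidden_in(section: str) -> list:
    """Return labels of forbidden procedures that a section appears to recommend."""
    sec = section.lower()
    if NEGATION.search(sec):
        return []
    return [label for label, pattern in FORBIDDEN.items() if pattern.search(sec)]


if __name__ == "__main__":
    print(forbidden_in("Recommended: run pcs standby node1"))
    print(forbidden_in("Never run pcs node standby on a two-node cluster"))
    print(forbidden_in("Procedure: shut down the nodes one at a time"))
```

Keeping the patterns in one table means a new forbidden variant is a one-line addition rather than another chained `if`.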

coderabbitai · 2026-06-11T18:54:26Z

+```
+evals/
+├── <skill-name>.yaml           # Eval config (judges, thresholds, schema)
+├── <skill-name>.md             # Cached skill analysis
+└── <skill-name>/
+    └── cases/
+        └── case-NNN-<slug>/
+            ├── input.yaml      # Scenario input
+            └── annotations.yaml # Expected outcomes
+```
+
+## Adding a New Eval
+
+1. **Analyze the skill** — reads SKILL.md, designs judges, writes the eval config
+   ```
+   /eval-analyze --skill <name> --config evals/<name>.yaml
+   ```
+
+2. **Generate scenarios** — creates `input.yaml` + `annotations.yaml` per case
+   ```
+   /eval-dataset --config evals/<name>.yaml
+   ```
+
+3. **Run the eval** — executes the skill against each case, scores with judges, generates HTML report
+   ```
+   /eval-run --model claude-opus-4-6 --config evals/<name>.yaml
+   ```
+
+4. **Review results** — walk through cases, collect human feedback
+   ```
+   /eval-review --run-id <run-id> --config evals/<name>.yaml
+   ```
+
+5. **(Optional) Optimize** — auto-fix SKILL.md based on judge failures, re-run to verify
+   ```
+   /eval-optimize --config evals/<name>.yaml
+   ```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language identifiers to fenced blocks to satisfy markdownlint consistently.

Several fenced blocks omit a language tag (directory tree + command snippets), which triggers MD040 and can break lint-gated CI.

💡 Suggested patch

-```
+```text
 evals/
 ├── <skill-name>.yaml           # Eval config (judges, thresholds, schema)
 ├── <skill-name>.md             # Cached skill analysis
 └── <skill-name>/
     └── cases/
         └── case-NNN-<slug>/
             ├── input.yaml      # Scenario input
             └── annotations.yaml # Expected outcomes

@@

/eval-analyze --skill <name> --config evals/<name>.yaml

@@

/eval-dataset --config evals/<name>.yaml

@@

/eval-run --model claude-opus-4-6 --config evals/<name>.yaml

@@

/eval-review --run-id <run-id> --config evals/<name>.yaml

@@

/eval-optimize --config evals/<name>.yaml

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 39-39: Fenced code blocks should have a language specified (MD040, fenced-code-language)
[warning] 53-53: Fenced code blocks should have a language specified (MD040, fenced-code-language)
[warning] 58-58: Fenced code blocks should have a language specified (MD040, fenced-code-language)
[warning] 63-63: Fenced code blocks should have a language specified (MD040, fenced-code-language)
[warning] 68-68: Fenced code blocks should have a language specified (MD040, fenced-code-language)
[warning] 73-73: Fenced code blocks should have a language specified (MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/two-node/evals/README.md` around lines 39 - 75, The README.md has fenced code blocks without language identifiers (the directory tree block and the command examples /eval-analyze, /eval-dataset, /eval-run, /eval-review, and /eval-optimize), causing markdownlint MD040 failures; fix by adding language tags: mark the evals/ tree block as `text` and each command block as `bash`.

Source: Linters/SAST tools
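
For a quick local check before running the linter, the MD040 condition can be approximated with a short script. This is a rough sketch and not markdownlint's implementation: it tracks opening and closing fences line by line, ignores other CommonMark subtleties, and the function name is hypothetical.

```python
import re

# An opening or closing fence: optional indent, 3+ backticks or tildes, optional info string.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})(.*)$")


def unlabeled_fences(markdown: str) -> list:
    """Return 1-based line numbers of opening fences that have no language tag."""
    missing = []
    open_marker = None  # fence marker of the block currently open, if any
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, info = match.group(1), match.group(2).strip()
        if open_marker is None:
            # Opening fence: record it if the info string is empty.
            open_marker = marker
            if not info:
                missing.append(number)
        elif marker[0] == open_marker[0] and len(marker) >= len(open_marker) and not info:
            # Closing fence for the open block.
            open_marker = None
    return missing


if __name__ == "__main__":
    with open("README.md", encoding="utf-8") as handle:
        print(unlabeled_fences(handle.read()))
```

The line numbers it returns should correspond to the opening-fence lines that markdownlint-cli2 reports, though the linter remains the source of truth.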

coderabbitai · 2026-06-11T18:54:26Z

+      markers = re.findall(r'\b[XxNn/Aa~-]\b', stride_section[1][:2000])
+      if len(markers) < 3:
+          return (False, f"STRIDE matrix appears empty or minimal ({len(markers)} markers)")
+      return (True, f"STRIDE matrix populated ({len(markers)} cell markers found)")


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import re
sample = "X ~ - N/A x n/a"
bad = re.findall(r'\b[XxNn/Aa~-]\b', sample)
good = re.findall(r'(?i)\b(?:x|~|-|n/?a)\b', sample)
print("bad:", bad)
print("good:", good)
PY

Repository: openshift-eng/edge-tooling

Length of output: 147

Fix STRIDE marker regex to correctly capture N/A and standalone ~/- cells

r'\b[XxNn/Aa~-]\b' splits N/A and misses ~ and - (e.g., "X ~ - N/A x n/a" yields ['X', 'N', '/', 'A', 'x', 'n', '/', 'a'] instead of treating N/A as one marker).

Suggested fix

- markers = re.findall(r'\b[XxNn/Aa~-]\b', stride_section[1][:2000]) + markers = re.findall(r'(?i)(?<![A-Za-z0-9])(?:x|~|-|n/?a)(?![A-Za-z0-9])', stride_section[1][:2000])

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-      markers = re.findall(r'\b[XxNn/Aa~-]\b', stride_section[1][:2000])
-      if len(markers) < 3:
-          return (False, f"STRIDE matrix appears empty or minimal ({len(markers)} markers)")
-      return (True, f"STRIDE matrix populated ({len(markers)} cell markers found)")
+      markers = re.findall(r'(?i)(?<![A-Za-z0-9])(?:x|~|-|n/?a)(?![A-Za-z0-9])', stride_section[1][:2000])
+      if len(markers) < 3:
+          return (False, f"STRIDE matrix appears empty or minimal ({len(markers)} markers)")
+      return (True, f"STRIDE matrix populated ({len(markers)} cell markers found)")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/two-node/evals/threat-model-tnf.yaml` around lines 149 - 152, The regex r'\b[XxNn/Aa~-]\b' incorrectly splits multi-char markers like "N/A" and misses standalone "~" and "-"; replace the pattern used in the re.findall call so that it matches whole markers case-insensitively. Because "~" and "-" are not word characters, \b cannot anchor them, so use alphanumeric lookarounds instead of word boundaries: re.findall(r'(?i)(?<![A-Za-z0-9])(?:x|~|-|n/?a)(?![A-Za-z0-9])', stride_section[1][:2000]), so that markers, including "N/A", "~", and "-", are captured as single tokens (referencing the markers variable, the re.findall call, and stride_section).
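
The difference between the two patterns can be reproduced with a short script on the same sample string used in the analysis above. This is only a sketch for comparison; `count_markers` is a hypothetical helper, not part of the eval config.

```python
import re

SAMPLE = "X ~ - N/A x n/a"

# Original pattern: a single character from the class between word boundaries.
ORIGINAL = re.compile(r"\b[XxNn/Aa~-]\b")

# Lookaround-based pattern from the suggested fix: whole markers, case-insensitive,
# not adjacent to letters or digits.
FIXED = re.compile(r"(?i)(?<![A-Za-z0-9])(?:x|~|-|n/?a)(?![A-Za-z0-9])")


def count_markers(text: str, pattern: "re.Pattern" = FIXED) -> int:
    """Count STRIDE cell markers found in the text with the given pattern."""
    return len(pattern.findall(text))


if __name__ == "__main__":
    print("original:", ORIGINAL.findall(SAMPLE))
    # original: ['X', 'N', '/', 'A', 'x', 'n', '/', 'a']
    print("fixed:   ", FIXED.findall(SAMPLE))
    # fixed:    ['X', '~', '-', 'N/A', 'x', 'n/a']
```

One caveat with the lookaround pattern: every dash in a Markdown table separator row such as `|---|---|` also matches, so a judge that relies on the count may want to drop separator rows from the STRIDE section first.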

dhensel-rh · 2026-06-11T18:59:58Z

Closing — eval configs are tracked in #178 (OCPEDGE-2727).

dhensel-rh and others added 9 commits June 11, 2026 14:28

Add threat-model skill plugin with TNF, TNA, SNO, and LVMS support

25b476c

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: add missing blank lines to pass markdownlint (MD031/MD032)

bbfce72

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update evals README with detailed pipeline steps

c63b86f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 11, 2026

dhensel-rh mentioned this pull request Jun 11, 2026

OCPEDGE-2727: Add agent-eval-harness integration config for cluster-diagnostic skill #178

Draft

coderabbitai Bot reviewed Jun 11, 2026

View reviewed changes

dhensel-rh closed this Jun 11, 2026
