V2.alpha-2 Feat/new mechanisms by Thibaut-Fatus · Pull Request #6 · korabench/benchmark

Thibaut-Fatus · 2026-04-17T15:16:17Z

No description provided.

Adds narrow filters to `generate-seeds` and `expand-scenarios` so partial samples (e.g. a single new risk) can be run without regenerating the full taxonomy. `--risk-ids` applies to both commands; `--motivations` applies to `generate-seeds`. Unknown values fail fast with a clear error.
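The fail-fast behaviour described above could look roughly like the sketch below. The helper names (`parseIdFilter`, `filterByRisk`), the comma-separated flag format, and the `riskId` field are all assumptions made for illustration; none of them are taken from the repository.

```typescript
// Hypothetical helper: validates a comma-separated CLI filter against known ids.
// The comma-separated format is an assumption, not confirmed by the PR.
function parseIdFilter(
  flagName: string,
  rawValue: string | undefined,
  knownIds: readonly string[],
): string[] | undefined {
  // No flag given: no filtering is applied.
  if (rawValue === undefined) return undefined;

  const requested = rawValue
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  // Fail fast on anything that is not in the known set.
  const known = new Set(knownIds);
  const unknown = requested.filter((id) => !known.has(id));
  if (unknown.length > 0) {
    throw new Error(
      `Unknown ${flagName} value(s): ${unknown.join(", ")}. ` +
        `Known values: ${knownIds.join(", ")}`,
    );
  }
  return requested;
}

// Hypothetical filter step: keeps only items whose riskId was requested.
function filterByRisk<T extends { riskId: string }>(
  items: readonly T[],
  riskIds: string[] | undefined,
): T[] {
  if (riskIds === undefined) return [...items];
  const wanted = new Set(riskIds);
  return items.filter((item) => wanted.has(item.riskId));
}
```

Validating before filtering means a typo in a risk id surfaces as an error instead of silently producing an empty sample.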

- New `Mechanism` model + `packages/benchmark/data/mechanisms.ts` with the 7 conversation mechanisms from Kora Taxonomy V2.
- Rename BehaviorAssessment → MechanismAssessment; schema built dynamically from Mechanism.listAll() so adding a mechanism to the data file extends the judge schema, aggregation, and run sums.
- Keep existing rubric text for M2/M6/M7 (active); attach Excel V2 rubric as JS comments so the switch is a one-line edit later.
- Add M1 Sycophancy, M3 Manipulative Engagement, M4 Non-Manipulative Framing, M5 Fictional Framing & Roleplay Bypass. M5 is scaffolded as a single-framing flag; the comparative multi-framing pipeline is deferred.
- RunSums replaces fixed an/eh/hr keys with a `mechanisms` record keyed by mechanism id.
- `run` CLI gains `--risk-ids` (filter by risk) and `--limit` (cap test tasks) for smoke tests; empty result set now errors loudly.
- README updated (Mechanisms section, new sums shape, run options).
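To illustrate how sums keyed by mechanism id can follow the data file automatically, here is a minimal sketch. Only `Mechanism.listAll()`, `MechanismAssessment`, `RunSums`, and the `mechanisms` record come from the description above; the field names (`flagged`, `total`), the two-entry mechanism list, and the `emptySums`/`aggregate` helpers are assumptions for illustration and may not match the actual code.

```typescript
// Assumed minimal shape of a mechanism entry.
interface Mechanism {
  id: string;
  name: string;
}

// Illustrative subset only; the PR describes 7 mechanisms in the data file.
const MECHANISMS: readonly Mechanism[] = [
  { id: "M1", name: "Sycophancy" },
  { id: "M3", name: "Manipulative Engagement" },
];

// Companion object exposing listAll(), mirroring the call named in the PR.
const Mechanism = {
  listAll: (): readonly Mechanism[] => MECHANISMS,
};

// Hypothetical assessment shape: one entry per mechanism id.
type MechanismAssessment = Record<string, { flagged: boolean }>;

// Hypothetical sums shape: `mechanisms` is keyed by mechanism id.
interface RunSums {
  total: number;
  mechanisms: Record<string, number>;
}

// Builds zeroed sums from the mechanism list, so a new mechanism in the
// data file gets a counter without any change here.
function emptySums(): RunSums {
  const mechanisms: Record<string, number> = {};
  for (const m of Mechanism.listAll()) {
    mechanisms[m.id] = 0;
  }
  return { total: 0, mechanisms };
}

// Counts, per mechanism, how many assessments flagged it.
function aggregate(assessments: readonly MechanismAssessment[]): RunSums {
  const sums = emptySums();
  for (const assessment of assessments) {
    sums.total += 1;
    for (const m of Mechanism.listAll()) {
      if (assessment[m.id]?.flagged) {
        sums.mechanisms[m.id] = (sums.mechanisms[m.id] ?? 0) + 1;
      }
    }
  }
  return sums;
}
```

Because both the zeroed record and the aggregation loop iterate over `Mechanism.listAll()`, neither needs a fixed set of keys.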

Thibaut-Fatus added 4 commits April 16, 2026 15:43

[feat] keep node LTS

6d9f4b5

[feat] add/rm risks following existing format

2e24f8f

Thibaut-Fatus wants to merge 4 commits into v2 from feat/new-mechanisms
