test(scenarios): reproducible test scenarios harness #574

Merged
NagyVikt merged 1 commit into main from worktree-agent-a09400946c5e0fce3 on May 15, 2026

Conversation

@NagyVikt
Collaborator

Summary

Adds tests/scenarios/ — a reproducible, in-process scenarios harness for canonical multi-agent situations, with deterministic clocks and diffable plaintext expected-state files.

  • 5 canonical scenarios as seed.sql + inputs.jsonl + expected.json
    1. 01-claim-before-edit — Codex pre_tool_use auto-claims target
    2. 02-cross-runtime-handoff — Codex relays out, Claude accepts, claim flips
    3. 03-stale-claim-sweep — TTL expiry triggers release_expired_quota
    4. 04-plan-claim-adoption — Queen sub-task adopted by Codex
    5. 05-path-mismatch-reclaim — Pre_tool_use re-claims after mismatch
  • Harness uses vi.useFakeTimers() + vi.setSystemTime(BASE_TS + at_ms) + path normalization
  • 4 scripts: pnpm scenarios, scenarios:filter, scenarios:explain, scenarios:record
  • 2 harness self-tests (fails-closed on missing expected, clear diff on mismatch)
  • Separate scenarios CI job on Node 20
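
The deterministic-clock and path-normalization bullets above can be illustrated with a minimal sketch. It is not the harness code from this PR: only `at_ms`, `BASE_TS`, and the four envelope kind names appear in the description, while the `payload` field, the concrete `BASE_TS` value, the `<ROOT>` placeholder, and the function names `parseInputs`, `fireTime`, and `normalizePath` are assumptions made for illustration. In a Vitest-based harness, the `Date` returned by `fireTime` is the kind of value one would pass to `vi.setSystemTime` after calling `vi.useFakeTimers()`.

```typescript
// Hypothetical envelope shape; only `at_ms` and the kind names come from the PR text.
type EnvelopeKind = "lifecycle" | "mcp" | "tick" | "task";

interface Envelope {
  at_ms: number; // offset in milliseconds from the scenario's base timestamp
  kind: EnvelopeKind;
  payload: Record<string, unknown>; // assumed field, not named in the PR
}

// Assumed fixed base instant; the real BASE_TS value is not given in the PR.
const BASE_TS = Date.UTC(2026, 0, 1, 0, 0, 0);

const KINDS: readonly string[] = ["lifecycle", "mcp", "tick", "task"];

// Parse the text of an inputs.jsonl file: one JSON object per non-empty line.
function parseInputs(jsonl: string): Envelope[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line, i) => {
      const raw = JSON.parse(line) as Partial<Envelope>;
      if (typeof raw.at_ms !== "number" || KINDS.indexOf(String(raw.kind)) === -1) {
        throw new Error(`inputs.jsonl entry ${i + 1}: invalid envelope`);
      }
      return { at_ms: raw.at_ms, kind: raw.kind as EnvelopeKind, payload: raw.payload ?? {} };
    });
}

// Absolute instant at which an envelope fires: BASE_TS + at_ms.
function fireTime(env: Envelope): Date {
  return new Date(BASE_TS + env.at_ms);
}

// Replace a machine-specific root with a stable placeholder and use forward
// slashes, so recorded state does not depend on where the repo is checked out.
function normalizePath(p: string, root: string): string {
  const unix = p.replace(/\\/g, "/");
  const unixRoot = root.replace(/\\/g, "/").replace(/\/+$/, "");
  return unix === unixRoot || unix.startsWith(unixRoot + "/")
    ? "<ROOT>" + unix.slice(unixRoot.length)
    : unix;
}
```

Because every timestamp is derived from a fixed base plus a per-envelope offset, and every path is rewritten relative to a placeholder root, two runs on different machines should produce byte-identical state, which is what keeps a plaintext `expected.json` diffable.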

Closes the ⏳ "Reproducible test fixture set under tests/scenarios/" item in README §v0.x "Multi-runtime confidence".

OpenSpec

openspec/changes/scenarios-harness-2026-05-16/CHANGE.md

Test plan

  • pnpm scenarios — 7/7 pass (5 scenarios + 2 self-tests, 1.95s)
  • pnpm scenarios:filter 03-stale-claim-sweep — 1 pass, 6 skip
  • pnpm scenarios:explain 02-cross-runtime-handoff — prints timeline
  • pnpm scenarios:record <slug> — confirmed regenerates expected.json
  • pnpm build — clean
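
The two harness self-tests counted in the 7/7 figure cover failing closed on a missing expected file and producing a clear diff on mismatch. A minimal sketch of that behaviour follows. The names `diffState` and `checkScenario`, the `$.key` path notation, and the error wording are all hypothetical; the PR does not show the real comparison code.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isObject(v: Json | undefined): v is { [key: string]: Json } {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Collect one human-readable line per differing leaf, keyed by its path.
function diffState(expected: Json | undefined, actual: Json | undefined, path: string = "$"): string[] {
  if (isObject(expected) && isObject(actual)) {
    const keys = Object.keys(expected)
      .concat(Object.keys(actual).filter((k) => !(k in expected)))
      .sort();
    const out: string[] = [];
    for (const key of keys) {
      out.push(...diffState(expected[key], actual[key], `${path}.${key}`));
    }
    return out;
  }
  if (Array.isArray(expected) && Array.isArray(actual)) {
    const out: string[] = [];
    const len = Math.max(expected.length, actual.length);
    for (let i = 0; i < len; i++) {
      out.push(...diffState(expected[i], actual[i], `${path}[${i}]`));
    }
    return out;
  }
  if (JSON.stringify(expected) === JSON.stringify(actual)) return [];
  return [`${path}: expected ${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`];
}

// Fail closed: a missing expected.json is an error, never a silent pass.
function checkScenario(slug: string, expectedText: string | undefined, actual: Json): void {
  if (expectedText === undefined) {
    throw new Error(`${slug}: expected.json is missing; run pnpm scenarios:record ${slug} to create it`);
  }
  const diffs = diffState(JSON.parse(expectedText) as Json, actual);
  if (diffs.length > 0) {
    throw new Error(`${slug}: state mismatch\n${diffs.join("\n")}`);
  }
}
```

Treating an absent expected file as a failure means a newly added scenario cannot pass vacuously. It has to be recorded explicitly, and the recorded file is then reviewed like any other diff.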

Notes

  • Uses .mts for harness internals (tsx CJS-fallback workaround for @colony/compress ESM-only exports)
  • Adds task envelope kind beyond the design's lifecycle|mcp|tick triplet to drive TaskThread directly
  • Pre-flips the v0.x README line ⏳ → ✅ — revert if you prefer to land that separately

Merge order

Independent of #573 (bridge replay). Safe to merge second.

🤖 Generated with Claude Code

- 5 canonical multi-agent scenarios as seed.sql + inputs.jsonl + expected.json
- In-process harness with vi.useFakeTimers + path normalization
- pnpm scenarios / scenarios:filter / scenarios:explain / scenarios:record
- 2 harness self-tests (fails-closed on missing expected, clear diff on mismatch)
- Separate scenarios CI job on Node 20

NagyVikt merged commit f241ce8 into main on May 15, 2026
2 of 4 checks passed