From 4c9e8dc2db32a9c1aeb1c2f86c26a99a1a027f2f Mon Sep 17 00:00:00 2001 From: Sven Date: Mon, 2 Mar 2026 05:50:50 +0100 Subject: [PATCH] feat(skills): add Batch Execution, SDD, and Writing Plans skills Co-Authored-By: Claude Opus 4.6 --- templates/skills/batch-execution/SKILL.md | 121 ++++++++++++ .../subagent-driven-development/SKILL.md | 126 ++++++++++++ templates/skills/writing-plans/SKILL.md | 181 ++++++++++++++++++ 3 files changed, 428 insertions(+) create mode 100644 templates/skills/batch-execution/SKILL.md create mode 100644 templates/skills/subagent-driven-development/SKILL.md create mode 100644 templates/skills/writing-plans/SKILL.md diff --git a/templates/skills/batch-execution/SKILL.md b/templates/skills/batch-execution/SKILL.md new file mode 100644 index 0000000..79206e7 --- /dev/null +++ b/templates/skills/batch-execution/SKILL.md @@ -0,0 +1,121 @@ +--- +name: batch-execution +description: Use when a task can be decomposed into 5-30 independent units — spawns parallel agents in isolated git worktrees, each producing its own PR +context: fork +--- + +# Batch Execution + +Parallel work multiplies throughput. Sequential execution of independent tasks wastes time. + +**If the tasks are independent, they should run in parallel.** + +## The Iron Law + +<HARD-GATE> +INDEPENDENT TASKS MUST RUN IN PARALLEL. +If you have identified 5+ independent work units, you CANNOT execute them sequentially. +"One at a time is simpler" is waste, not caution. +Violating this rule is a violation — not a preference. +</HARD-GATE> + +## The Gate Function + +### 1. DECOMPOSE — Break Work Into Independent Units + +- Analyze the task for independent, self-contained units +- Each unit must: modify different files, be mergeable alone, not depend on sibling units +- Target 5-30 units depending on scope (few files → 5, many files → 30) +- Units should be roughly uniform in size + +### 2.
VERIFY INDEPENDENCE — Check for Conflicts + +- No two units modify the same file (or the same section of a shared file) +- No unit depends on another unit's output to start +- Each unit can be tested in isolation +- Merge order does not matter + +### 3. SPAWN — Create Isolated Workers + +- Each worker gets its own git worktree (an isolated working tree that shares the same repository) +- Each worker receives: the overall goal, its specific unit task, codebase conventions, verification recipe +- All workers launch simultaneously in a single message +- Workers run in background — do not block on individual completion + +### 4. MONITOR — Track Progress + +Maintain a status table: + +| # | Unit | Status | PR | +|---|------|--------|----| +| 1 | | running / done / failed | <url> | + +- Update as workers report completion +- Track failures separately with brief error notes + +### 5. VERIFY — Check Each Worker's Output + +- Each worker must: run tests, verify build, commit, push, create PR +- Failed workers are retried once with the error context +- If a worker fails twice, mark as failed and note the reason + +### 6. AGGREGATE — Merge Results + +- All PRs should pass CI independently +- Merge in any order (independence guarantee) +- Produce a final summary: X/Y units completed as PRs + +## Common Rationalizations — REJECT THESE + +| Excuse | Why It Violates the Rule | +|--------|--------------------------| +| "Sequential is simpler" | Simpler for you, slower for the user. Parallel is the job. | +| "What if they conflict?" | Verify independence first (step 2). If they conflict, they are not independent — re-decompose. | +| "Too many agents" | The threshold is 5 units. Below 5, sequential is acceptable. At 5 or more, parallelize. | +| "I'll batch the small ones" | Small units are the easiest to parallelize. Do not combine them. | +| "Worktrees are complex" | Worktree creation is automated. Your job is decomposition and monitoring.
| + +## Red Flags — STOP If You Catch Yourself: + +- Running independent tasks one at a time +- Creating units that modify the same file +- Launching workers without independence verification +- Not monitoring worker progress +- Merging without checking each PR passes CI + +**If any red flag triggers: STOP. Re-decompose or verify independence before proceeding.** + +## Verification Checklist + +Before claiming batch execution is complete: + +- [ ] All work units were independent (no shared file modifications) +- [ ] All workers ran in isolated git worktrees +- [ ] All workers were launched simultaneously (not sequentially) +- [ ] Progress was tracked via status table +- [ ] Each completed worker produced a PR +- [ ] Failed workers were retried once or documented +- [ ] Final summary shows completion rate + +## Integration with MAXSIM + +### Context Loading + +```bash +node ~/.claude/maxsim/bin/maxsim-tools.cjs skill-context batch-execution +``` + +### In Plan Execution + +Batch execution applies when a plan has multiple independent tasks in the same wave: +- The orchestrator identifies independent tasks within a wave +- Each task is assigned to a worker in an isolated worktree +- Workers follow the full task protocol (implement → simplify → verify → commit) +- The orchestrator aggregates results and updates the phase status + +### STATE.md Hooks + +- Record batch execution start with unit count +- Track completion rate as workers finish +- Record final aggregate result (X/Y succeeded) +- Failed units become blockers for follow-up diff --git a/templates/skills/subagent-driven-development/SKILL.md b/templates/skills/subagent-driven-development/SKILL.md new file mode 100644 index 0000000..e961e76 --- /dev/null +++ b/templates/skills/subagent-driven-development/SKILL.md @@ -0,0 +1,126 @@ +--- +name: subagent-driven-development +description: Use when executing multi-task plans — spawns a fresh subagent per task with 2-stage review between tasks to prevent context rot 
+context: fork +--- + +# Subagent-Driven Development (SDD) + +Context rots. Fresh agents make fewer mistakes than tired ones. + +**If your context is deep, your next task deserves a fresh agent.** + +## The Iron Law + +<HARD-GATE> +ONE TASK PER SUBAGENT. +Each task in a plan gets a fresh subagent with clean context. +"I'll just keep going" produces context-rotted code. +Violating this rule is a violation — not efficiency. +</HARD-GATE> + +## The Gate Function + +### 1. PREPARE — Assemble Task Context + +For each task in the plan: +- Extract ONLY the files and sections relevant to this specific task +- Include: task description, verify block, done criteria, relevant code files +- Exclude: other tasks' context, completed task details, unrelated code +- Context should be minimal and focused — less is more + +### 2. SPAWN — Fresh Agent Per Task + +- Create a new subagent with ONLY the task-specific context +- The subagent receives: task spec, relevant files, codebase conventions, verification recipe +- The subagent does NOT receive: other tasks, full plan, accumulated session context +- Each subagent starts with maximum available context window + +### 3. EXECUTE — Task Implementation + +The subagent follows the standard task protocol: +1. Read and understand the task requirements +2. Implement using TDD (if applicable) +3. Run verification commands +4. Produce evidence block +5. Commit with task-specific message + +### 4. REVIEW — 2-Stage Review Between Tasks + +After each task completes, before starting the next: + +**Stage 1 (Spec Review):** Does the implementation match the task's `<done>` criteria exactly? +**Stage 2 (Code Review):** Does the code meet quality standards? + +If either stage fails: the task is not complete. Fix issues before proceeding. + +### 5. 
HANDOFF — Transfer Context to Next Task + +- Record what was done (files changed, decisions made) +- DO NOT carry forward the full implementation context +- The next subagent starts fresh — it reads the committed code, not the session history +- Checkpoint the progress in STATE.md + +### 6. REPEAT — Next task, fresh agent + +## Common Rationalizations — REJECT THESE + +| Excuse | Why It Violates the Rule | +|--------|--------------------------| +| "I have context from the last task" | That context is rotting. Fresh agent, fresh perspective. | +| "Spawning agents is overhead" | Context rot costs more than spawn time. The math is clear. | +| "I can do multiple tasks efficiently" | Efficiency without accuracy is waste. One task, one agent. | +| "The tasks are related" | Related tasks still get separate agents. Share via committed code, not session state. | +| "Review between tasks slows things down" | Review catches what rot misses. The slowdown is an investment. | + +## Red Flags — STOP If You Catch Yourself: + +- Executing multiple tasks in the same agent context +- Carrying forward accumulated context to the next task +- Skipping the 2-stage review between tasks +- Loading the entire plan into a single agent +- Not checkpointing progress between tasks + +**If any red flag triggers: STOP. 
Checkpoint, spawn fresh agent, continue.** + +## Verification Checklist + +Before claiming SDD execution is complete: + +- [ ] Each task was executed by a separate, fresh subagent +- [ ] Each subagent received only task-relevant context +- [ ] 2-stage review ran between every pair of tasks +- [ ] All review issues were resolved before proceeding +- [ ] Progress was checkpointed in STATE.md between tasks +- [ ] No accumulated context was carried forward + +## Integration with MAXSIM + +### Context Loading + +```bash +node ~/.claude/maxsim/bin/maxsim-tools.cjs skill-context subagent-driven-development +``` + +### Task-Based Context Loading (EXEC-03) + +The key innovation of SDD is task-based context loading: +- Each subagent receives only the files/sections relevant to its assignment +- The orchestrator determines relevant files from the task's file list in the plan +- Additional context is loaded on-demand if the subagent needs it +- This prevents context bloat and maximizes each agent's effective context window + +### STATE.md Hooks + +- Record task start/complete with subagent assignment +- Checkpoint after each task for resume capability +- Track inter-task review results +- Record any deviations from the plan + +### In Plan Execution + +SDD is the default execution model for MAXSIM plans: +- The orchestrator reads the plan's task list +- For each task (or wave of parallel tasks), fresh subagents are spawned +- Between tasks/waves, 2-stage review runs +- The orchestrator never executes tasks itself — it only coordinates diff --git a/templates/skills/writing-plans/SKILL.md b/templates/skills/writing-plans/SKILL.md new file mode 100644 index 0000000..18b934e --- /dev/null +++ b/templates/skills/writing-plans/SKILL.md @@ -0,0 +1,181 @@ +--- +name: writing-plans +description: Use when creating PLAN.md files for MAXSIM phases — standardizes plan format with TDD-style task definitions, dependency detection, and wave grouping +context: fork +--- + +# Writing Plans + +A 
plan is a contract between the planner and the executor. Vague plans produce vague code. + +**If the executor cannot execute your plan without asking questions, the plan is incomplete.** + +## The Iron Law + +<HARD-GATE> +EVERY TASK MUST HAVE VERIFY AND DONE BLOCKS. +If a task does not specify how to verify it and when it is done, it is not a task — it is a wish. +"The executor will figure it out" is abdication, not delegation. +Violating this rule is a violation — not flexibility. +</HARD-GATE> + +## The Gate Function + +### 1. DECOMPOSE — Break Phase Goal Into Tasks + +- Each task is a single, atomic unit of work +- Tasks should take 15-60 minutes for a focused agent +- Tasks too large should be split; tasks too small should be merged +- Every task must produce a committable result + +### 2. SPECIFY — Define Each Task Completely + +Each task MUST include: + +```markdown +### Task N: [Descriptive Title] + +**Files:** [list of files to create/modify] + +**Description:** [What to implement, with enough detail that another agent can do it without asking questions] + +<verify> +[Exact commands to run to verify the task is complete] +npm run build +npm test +[specific test command if applicable] +</verify> + +<done> +- [ ] [Specific, testable criterion 1] +- [ ] [Specific, testable criterion 2] +- [ ] [Specific, testable criterion 3] +</done> +``` + +### 3. DEPEND — Detect Dependencies Between Tasks + +- If Task B reads a file that Task A creates → B depends on A +- If Task B calls a function that Task A implements → B depends on A +- If Task B modifies the same file as Task A → they CANNOT be parallel +- Dependencies must be explicit — implicit ordering is a bug + +### 4. 
WAVE — Group Independent Tasks + +Tasks are grouped into waves for execution: +- **Wave 1:** Tasks with no dependencies (can all run in parallel) +- **Wave 2:** Tasks that depend only on Wave 1 tasks +- **Wave N:** Tasks whose dependencies all sit in earlier waves, at least one of them in Wave N-1 + +```markdown +## Execution Order + +**Wave 1** (parallel): +- Task 1: [title] +- Task 2: [title] + +**Wave 2** (parallel, after Wave 1): +- Task 3: [title] — depends on Task 1 +- Task 4: [title] — depends on Task 2 + +**Wave 3** (sequential): +- Task 5: [title] — depends on Task 3, Task 4 +``` + +### 5. VALIDATE — Check Plan Completeness + +Before submitting the plan: +- Does every task have `<verify>` and `<done>` blocks? +- Does the task set cover the phase's success criteria completely? +- Are dependencies correct and complete? +- Are waves correctly ordered? +- Could an executor run this plan without asking questions? + +## Plan Structure Template + +```markdown +# Phase [N] Plan [M]: [Phase Name] + +## Overview +[1-2 sentences: what this plan achieves and why] + +## Tasks + +### Task 1: [Title] +... + +### Task 2: [Title] +... + +## Execution Order + +**Wave 1** (parallel): Tasks 1, 2 +**Wave 2** (parallel): Tasks 3, 4 +**Wave 3** (sequential): Task 5 + +## Verification + +After all tasks complete: +- [ ] [Phase-level success criterion 1] +- [ ] [Phase-level success criterion 2] +``` + +## Common Rationalizations — REJECT THESE + +| Excuse | Why It Violates the Rule | +|--------|--------------------------| +| "The executor is smart enough to figure it out" | Smart executors with vague plans produce inconsistent results. Be explicit. | +| "Verify blocks are obvious" | If they are obvious, writing them takes 10 seconds. Do it. | +| "Done criteria are implicit in the description" | Implicit criteria cannot be checked. Make them explicit checkboxes. | +| "Dependencies are clear from context" | Context dies between agents. Write dependencies explicitly.
| +| "Wave grouping is premature optimization" | Wave grouping enables parallel execution. It is not optimization — it is correctness. | +| "The plan is too detailed" | Plans cannot be too detailed. They can only be too vague. | + +## Red Flags — STOP If You Catch Yourself: + +- Writing a task without a `<verify>` block +- Writing a `<done>` criterion that cannot be tested by running a command +- Assuming the executor knows which files to modify without listing them +- Creating a plan with no wave grouping (implies everything is sequential) +- Submitting a plan you could not execute yourself without asking questions + +**If any red flag triggers: STOP. Add the missing specificity.** + +## Verification Checklist + +Before submitting a plan: + +- [ ] Every task has a `<verify>` block with runnable commands +- [ ] Every task has a `<done>` block with testable criteria +- [ ] Every task lists the files it will create or modify +- [ ] Dependencies between tasks are explicitly stated +- [ ] Tasks are grouped into waves with correct ordering +- [ ] The task set covers all phase success criteria +- [ ] An executor could run this plan without asking questions + +## Integration with MAXSIM + +### Context Loading + +```bash +node ~/.claude/maxsim/bin/maxsim-tools.cjs skill-context writing-plans +``` + +### Plan Naming Convention + +Plans are numbered per phase: `{phase_number}-{plan_number}-PLAN.md` +- Example: `04-01-PLAN.md` (Phase 4, Plan 1) +- Multiple plans per phase are allowed (for large phases) + +### Artifact References + +- Reference `.planning/ROADMAP.md` for phase success criteria +- Reference `.planning/phases/{current}/RESEARCH.md` for implementation findings +- Reference `.planning/phases/{current}/CONTEXT.md` for user decisions +- Reference `.planning/codebase/STRUCTURE.md` for file organization conventions + +### STATE.md Hooks + +- Record plan creation as a milestone in STATE.md +- Update current position to reflect the new plan +- Plans feed into MAXSIM's 
plan-check verification before execution begins
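
The wave grouping described in step 4 of the writing-plans skill amounts to layering a dependency graph: a task joins the first wave in which all of its dependencies are already placed. The following is a minimal Python sketch of that idea, not part of MAXSIM or this patch. `group_into_waves` and the `plan` dictionary are hypothetical names chosen for illustration, and the task names mirror the Execution Order example in the skill.

```python
from typing import Dict, List, Set


def group_into_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Layer tasks into waves; hypothetical helper, not a MAXSIM API."""
    # Reject references to tasks that the plan never defines.
    known = set(dependencies)
    for task, deps in dependencies.items():
        unknown = deps - known
        if unknown:
            raise ValueError(f"{task} depends on undefined tasks: {sorted(unknown)}")

    remaining = {task: set(deps) for task, deps in dependencies.items()}
    placed: Set[str] = set()
    waves: List[List[str]] = []
    while remaining:
        # A task is ready once all of its dependencies sit in earlier waves.
        ready = sorted(t for t, deps in remaining.items() if deps <= placed)
        if not ready:
            # Nothing can start, so the leftover tasks depend on each other.
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        placed.update(ready)
        for task in ready:
            del remaining[task]
    return waves


# Assumed example plan, mirroring the Execution Order sample in the skill.
plan = {
    "Task 1": set(),
    "Task 2": set(),
    "Task 3": {"Task 1"},
    "Task 4": {"Task 2"},
    "Task 5": {"Task 3", "Task 4"},
}

for number, tasks in enumerate(group_into_waves(plan), start=1):
    print(f"Wave {number}: {', '.join(tasks)}")
```

For the sample plan this yields three waves: Tasks 1 and 2, then Tasks 3 and 4, then Task 5. A cycle or a dependency on an undefined task raises `ValueError`, which matches the skill's position that implicit or broken ordering is a bug rather than something the executor should resolve. The sketch does not check the separate rule that two tasks touching the same file cannot share a wave.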