4 changes: 4 additions & 0 deletions packages/cli/src/core/init.ts
@@ -84,6 +84,7 @@ export interface ExecutePhaseContext {
state_path: string;
roadmap_path: string;
config_path: string;
skill_paths: string;
Copilot AI Mar 2, 2026

ExecutePhaseContext is exported from the published CLI package; adding a new required field (skill_paths) is a breaking TypeScript change for any external consumers that construct this type. Consider making it optional (or adding a new optional field) to preserve compatibility, especially since it’s not yet used elsewhere in the codebase.

Suggested change
- skill_paths: string;
+ skill_paths?: string;

Copilot uses AI. Check for mistakes.
}
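
To illustrate the compatibility concern raised in the comment above, here is a minimal sketch. It assumes a trimmed-down `ExecutePhaseContext` containing only the path fields visible in this hunk; `skillPathsOrDefault` and the `legacy` object are hypothetical names used for illustration and are not part of the package.

```typescript
// Trimmed-down stand-in for the exported interface (assumption: only the
// fields visible in this hunk are reproduced here).
interface ExecutePhaseContext {
  state_path: string;
  roadmap_path: string;
  config_path: string;
  // Optional, so object literals written before this field existed still type-check.
  skill_paths?: string;
}

// Hypothetical helper: read the field through an explicit fallback.
function skillPathsOrDefault(ctx: ExecutePhaseContext, fallback: string): string {
  return ctx.skill_paths ?? fallback;
}

// An object built by an external consumer before the field was added.
const legacy: ExecutePhaseContext = {
  state_path: ".planning/STATE.md",
  roadmap_path: ".planning/ROADMAP.md",
  config_path: ".planning/config.json",
};
```

With a required field, the `legacy` literal would fail to compile; with the optional field, readers of the type have to handle `undefined`, which is the trade-off the suggestion accepts.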

export interface PlanPhaseContext {
@@ -393,6 +394,8 @@ export function cmdInitExecutePhase(cwd: string, phase: string | undefined, raw:
const milestone = getMilestoneInfo(cwd);
const phase_req_ids = extractReqIds(cwd, phase!);

const skillPaths = path.join(os.homedir(), '.claude', 'skills');

const result: ExecutePhaseContext = {
executor_model: resolveModelInternal(cwd, 'maxsim-executor'),
verifier_model: resolveModelInternal(cwd, 'maxsim-verifier'),
@@ -431,6 +434,7 @@ export function cmdInitExecutePhase(cwd: string, phase: string | undefined, raw:
state_path: '.planning/STATE.md',
roadmap_path: '.planning/ROADMAP.md',
config_path: '.planning/config.json',
skill_paths: skillPaths,
};
Comment on lines 397 to 438
Copilot AI Mar 2, 2026

skill_paths is set to a single global directory ($HOME/.claude/skills). The templates also mention a project-local .claude/skills fallback/override and the field name is plural, suggesting multiple lookup locations. Consider outputting an ordered list of skill search paths (project-local first, then home) or renaming the field to skill_path if only one location is intended.

output(result, raw);
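
One way to read the "ordered list of skill search paths" idea from the comment above is sketched below. This is an assumption about the intended behavior, not code from the PR: `resolveSkillSearchPaths` is a hypothetical helper, and the home directory is passed in as a parameter (rather than calling `os.homedir()` as the diff does) to keep the sketch deterministic.

```typescript
import * as path from "node:path";

// Hypothetical helper: returns skill lookup locations in priority order.
// Assumption: project-local skills should be consulted before global ones.
function resolveSkillSearchPaths(cwd: string, homeDir: string): string[] {
  return [
    path.join(cwd, ".claude", "skills"), // project-local, checked first
    path.join(homeDir, ".claude", "skills"), // global fallback
  ];
}
```

If the field stays a single string instead, renaming it to `skill_path` would match what the value actually holds.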
61 changes: 48 additions & 13 deletions templates/agents/maxsim-executor.md
@@ -23,7 +23,7 @@ Before executing, discover project context:

**Self-improvement lessons:** Read `.planning/LESSONS.md` if it exists — accumulated lessons from past executions on this codebase. Apply them proactively to avoid known mistakes before they become deviations.

**Project skills:** Check `.agents/skills/` directory if it exists:
**Project skills:** Check `~/.claude/skills/` directory if it exists (also check `.claude/skills/` in the project root as a fallback):
1. List available skills (subdirectories)
2. Read `SKILL.md` for each skill (lightweight index ~130 lines)
3. Load specific `rules/*.md` files as needed during implementation
@@ -80,22 +80,57 @@ grep -n "type=\"checkpoint" [plan-path]
**Pattern C: Continuation** — Check `<completed_tasks>` in prompt, verify commits exist, resume from specified task.
</step>

<step name="task_context_loading">
## Task-Based Context Loading (EXEC-03)

For each task, load ONLY the files listed in the task's `Files:` field — not the entire codebase.

1. Call `skill-context` or read the plan to get the task's file list
2. Use the `Read` tool to load only those specific files
3. If the task has no `Files:` field, load files referenced in the task description
Comment on lines +88 to +90
Copilot AI Mar 2, 2026

task_context_loading instructs “Call skill-context” to get a task’s file list, but there’s no other reference in the repo to a skill-context command/tool. If this is meant to be a MAXSIM tool, it should be named consistently with other documented commands (e.g., maxsim-tools.cjs ...) or replaced with explicit instructions to parse the Files: field from the plan.

Suggested change
- 1. Call `skill-context` or read the plan to get the task's file list
- 2. Use the `Read` tool to load only those specific files
- 3. If the task has no `Files:` field, load files referenced in the task description
+ 1. Read the plan (e.g., `PLAN.md`) and parse the current task's `Files:` field to get the task's file list
+ 2. Use the `Read` tool to load only those specific files
+ 3. If the task has no `Files:` field, load files explicitly referenced in the task description

4. Do NOT speculatively read the entire `src/` directory or similar broad paths

This keeps executor context lean and focused per task.
</step>
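
The step above relies on extracting a task's `Files:` field from the plan. The diff does not show the plan format, so the following is only a sketch under an assumed format: a single line such as `Files: src/a.ts, src/b.ts`, comma-separated, with optional backticks around each path. `parseTaskFiles` is a hypothetical name.

```typescript
// Assumed plan format (not confirmed by the PR): one line per task of the form
//   Files: path/one.ts, `path/two.ts`
function parseTaskFiles(taskText: string): string[] {
  // Match the first "Files:" line; [ \t]* keeps the match on a single line.
  const match = taskText.match(/^[ \t]*Files:[ \t]*(.+)$/m);
  if (!match) return []; // no field: caller falls back to files named in the description
  return match[1]
    .split(",")
    .map((entry) => entry.trim().replace(/^`|`$/g, "")) // drop surrounding backticks
    .filter((entry) => entry.length > 0);
}
```

Returning an empty list for a missing field keeps the fallback decision (item 3 in the step) with the caller rather than hiding it inside the parser.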

<step name="execute_tasks">
For each task:
For each task, follow the Execute → Simplify → Verify → Commit cycle:

1. **If `type="auto"`:**
- Check for `tdd="true"` → follow TDD execution flow
- Execute task, apply deviation rules as needed
- Handle auth errors as authentication gates
- Run verification, confirm done criteria
- Commit (see task_commit_protocol)
- Track completion + commit hash for Summary
- **Execute:** Check for `tdd="true"` → follow TDD execution flow. Otherwise implement task, apply deviation rules as needed. Handle auth errors as authentication gates.
- **Simplify:** Run a simplification pass on files modified by this task: check for duplication, dead code, and unnecessary complexity. Make only behavior-preserving changes. Skip if the task is config/docs only or changes fewer than 10 lines.
- **Verify:** Run verification, confirm done criteria. If simplification broke something, revert simplification and re-verify.
- **Commit:** Commit (see task_commit_protocol). Track completion + commit hash for Summary.
- **Update progress table** (see progress_tracking).

2. **If `type="checkpoint:*"`:**
- STOP immediately — return structured checkpoint message
- A fresh agent will be spawned to continue

3. After all tasks: run overall verification, confirm success criteria, document deviations
3. After all tasks in a wave: run **wave code review** (see wave_review_protocol).
4. After all waves: run overall verification, confirm success criteria, document deviations.
</step>

<step name="progress_tracking">
## Orchestrator Status Tracking (EXEC-02)

Maintain a progress table throughout execution. Update after each task state change:

```markdown
| Wave | Task | Status | Stage |
|------|------|--------|-------|
| 1 | Task 1 | Complete | Committed |
| 1 | Task 2 | In Progress | Simplifying |
| 2 | Task 3 | Blocked | Waiting for Wave 1 |
```

**Stages:** Executing → Simplifying → Verifying → Committed → Reviewed

**Rules:**
- Update the table in your working state after each task stage transition
- Include the table in checkpoint returns so continuation agents have full state
- Include the final table in the SUMMARY.md under `## Execution Progress`
- If a task is blocked or failed, record the reason in the Status column
</step>
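
As a rough illustration of the progress table and stage order described in this step, the sketch below renders rows in the same Markdown layout and advances a task through the listed stages. The names `TaskProgress`, `renderProgressTable`, and `nextStage` are hypothetical; the template itself only asks the agent to maintain the table, not to run code.

```typescript
// Stage order as listed in the step.
const STAGES = ["Executing", "Simplifying", "Verifying", "Committed", "Reviewed"] as const;
type Stage = (typeof STAGES)[number];

interface TaskProgress {
  wave: number;
  task: string;
  status: string; // e.g. "Complete", "In Progress", or "Blocked" plus a reason
  stage: string; // a Stage, or free text such as "Waiting for Wave 1"
}

// Render rows using the same columns as the template's example table.
function renderProgressTable(rows: TaskProgress[]): string {
  const header = ["| Wave | Task | Status | Stage |", "|------|------|--------|-------|"];
  const body = rows.map((r) => `| ${r.wave} | ${r.task} | ${r.status} | ${r.stage} |`);
  return [...header, ...body].join("\n");
}

// Advance to the following stage; unknown or final stages are returned unchanged.
function nextStage(current: string): string {
  const i = STAGES.indexOf(current as Stage);
  return i >= 0 && i < STAGES.length - 1 ? STAGES[i + 1] : current;
}
```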

</execution_flow>
@@ -612,11 +647,11 @@ Do not rely on memory of the skill content — always read the file fresh.

| Skill | Read | Trigger |
|-------|------|---------|
| TDD Enforcement | `.agents/skills/tdd/SKILL.md` | Before writing implementation code for a new feature, bug fix, or when plan type is `tdd` |
| Systematic Debugging | `.agents/skills/systematic-debugging/SKILL.md` | When encountering any bug, test failure, or unexpected behavior during execution |
| Verification Before Completion | `.agents/skills/verification-before-completion/SKILL.md` | Before claiming any task is done, fixed, or passing |
| TDD Enforcement | `~/.claude/skills/tdd/SKILL.md` | Before writing implementation code for a new feature, bug fix, or when plan type is `tdd` |
| Systematic Debugging | `~/.claude/skills/systematic-debugging/SKILL.md` | When encountering any bug, test failure, or unexpected behavior during execution |
| Verification Before Completion | `~/.claude/skills/verification-before-completion/SKILL.md` | Before claiming any task is done, fixed, or passing |

**Project skills override built-in skills.** If a skill with the same name exists in `.agents/skills/` in the project, load that one instead.
**Project skills override built-in skills.** If a skill with the same name exists in `~/.claude/skills/` or `.claude/skills/` in the project, load that one instead.

</available_skills>
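
The override rule above does not say which location wins when a skill exists in both places. The sketch below assumes the first directory in the list takes precedence (for example, the project-local directory ahead of the home directory). `resolveSkillFile` is a hypothetical helper, and the `exists` callback stands in for a real file-system check so the example stays self-contained.

```typescript
// Hypothetical lookup: return the first SKILL.md found for a skill name.
// Assumption: `searchDirs` is already ordered by priority, highest first.
function resolveSkillFile(
  name: string,
  searchDirs: string[],
  exists: (filePath: string) => boolean,
): string | undefined {
  for (const dir of searchDirs) {
    const candidate = `${dir}/${name}/SKILL.md`;
    if (exists(candidate)) return candidate; // earlier directories override later ones
  }
  return undefined; // skill not installed anywhere
}
```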

52 changes: 44 additions & 8 deletions templates/workflows/execute-phase.md
@@ -152,7 +152,7 @@ Execute each wave in sequence. Within a wave: parallel if `PARALLELIZATION=true`
- .planning/STATE.md (State)
- .planning/config.json (Config, if exists)
- ./CLAUDE.md (Project instructions, if exists — follow project-specific guidelines and coding conventions)
- .agents/skills/ (Project skills, if exists — list skills, read SKILL.md for each, follow relevant rules during implementation)
- ~/.claude/skills/ (Skills, if exists — list skills, read SKILL.md for each, follow relevant rules during implementation)
Copilot AI Mar 2, 2026

<files_to_read> now lists only ~/.claude/skills/, but the executor template also mentions a project-local .claude/skills/ fallback/override. If project-local skills are supported, the orchestrator’s startup context list should include them as well so executors discover the correct skill set consistently.

Suggested change
- - ~/.claude/skills/ (Skills, if exists — list skills, read SKILL.md for each, follow relevant rules during implementation)
+ - ./.claude/skills/ (Project-local skills, if exists — override or extend global skills; list skills and read SKILL.md for each)
+ - ~/.claude/skills/ (Global skills, if exists — list skills, read SKILL.md for each, follow relevant rules during implementation)

</files_to_read>
Comment on lines 152 to 156
Copilot AI Mar 2, 2026

PR description says skill path references were updated “across executor and orchestrator”, but there are still multiple templates referencing .agents/skills/ (e.g., templates/workflows/plan-phase.md, templates/workflows/quick.md, and several agent templates like maxsim-planner.md). If the skills directory move is intended repo-wide, those remaining references will keep pointing at the old location and create inconsistent behavior/documentation.

<success_criteria>
@@ -177,7 +177,21 @@ Execute each wave in sequence. Within a wave: parallel if `PARALLELIZATION=true`

If ANY spot-check fails: report which plan failed, route to failure handler — ask "Retry plan?" or "Continue with remaining waves?"

If pass — **emit plan-complete lifecycle event** (if `DASHBOARD_ACTIVE`):
If pass — **verify wave results with code review:**

Review the wave's combined changes for spec compliance and code quality:
```bash
# Get all files changed in this wave
WAVE_FIRST_COMMIT=$(git log --oneline --all --grep="{phase}-{first_plan_in_wave}" --reverse | head -1 | cut -d' ' -f1)
Comment on lines +184 to +185
Copilot AI Mar 2, 2026

The suggested command uses --grep="{phase}-{first_plan_in_wave}", but first_plan_in_wave isn’t defined anywhere else in this workflow template (unlike {phase}-{plan} above). As written, this is likely to be copy-pasted and fail or produce an empty commit hash. Consider computing the first plan ID for the wave explicitly (or using the collected plan IDs/commit hashes from the wave) and reference that variable here.

Suggested change
- # Get all files changed in this wave
- WAVE_FIRST_COMMIT=$(git log --oneline --all --grep="{phase}-{first_plan_in_wave}" --reverse | head -1 | cut -d' ' -f1)
+ # Get all files changed in this wave.
+ # Assumes $WAVE_FIRST_COMMIT was recorded when starting the wave (the first commit for this wave).

git diff ${WAVE_FIRST_COMMIT}^..HEAD --name-only
```

- **Spec compliance:** Cross-check each plan's `<done>` criteria against actual implementation
- **Code quality:** Scan for inconsistent patterns, missing error handling, hardcoded values
- If blocking issues found: fix before proceeding to next wave
- Record review verdict: `Wave {N} Review: PASS` or `Wave {N} Review: PASS after fixes ({fix_count} fixes)`
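
The wave-scoped diff in this step depends on knowing the first commit that belongs to the wave. A minimal sketch of per-wave bookkeeping is shown below; it assumes the orchestrator records each task commit together with its wave number. `WaveCommitLog` is a hypothetical name and nothing here is part of the PR.

```typescript
// Hypothetical bookkeeping: commits are grouped by wave so the review diff
// starts at the wave's own first commit rather than the first commit of the run.
class WaveCommitLog {
  private readonly commitsByWave = new Map<number, string[]>();

  // Record a commit hash under the wave that produced it.
  record(wave: number, commitHash: string): void {
    const existing = this.commitsByWave.get(wave) ?? [];
    existing.push(commitHash);
    this.commitsByWave.set(wave, existing);
  }

  // Build the revision range for `git diff <range> --name-only`,
  // or undefined when the wave has no recorded commits.
  diffRange(wave: number): string | undefined {
    const first = this.commitsByWave.get(wave)?.[0];
    return first === undefined ? undefined : `${first}^..HEAD`;
  }
}
```

Note that the `^` suffix refers to the parent of the first commit, so this range would not resolve if that commit were the repository's root commit.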

**Emit plan-complete lifecycle event** (if `DASHBOARD_ACTIVE`):
```
mcp__maxsim-dashboard__submit_lifecycle_event(
event_type: "plan-complete",
@@ -186,6 +200,15 @@ Execute each wave in sequence. Within a wave: parallel if `PARALLELIZATION=true`
)
```

**Update progress table** (maintain throughout execution):
```markdown
| Wave | Plan | Status | Review |
|------|------|--------|--------|
| 1 | 01-01 | Complete | Passed |
| 1 | 01-02 | Complete | Passed |
| 2 | 01-03 | In Progress | Pending |
```

Then report:
```
---
@@ -194,13 +217,14 @@ Execute each wave in sequence. Within a wave: parallel if `PARALLELIZATION=true`
**{Plan ID}: {Plan Name}**
{What was built — from SUMMARY.md}
{Notable deviations, if any}
{Wave review verdict}

{If more waves: what this enables for next wave}
---
```

- Bad: "Wave 2 complete. Proceeding to Wave 3."
- Good: "Terrain system complete — 3 biome types, height-based texturing, physics collision meshes. Vehicle physics (Wave 3) can now reference ground surfaces."
- Good: "Terrain system complete — 3 biome types, height-based texturing, physics collision meshes. Wave review: PASS. Vehicle physics (Wave 3) can now reference ground surfaces."

5. **Handle failures:**

@@ -265,19 +289,31 @@ After all waves:

**Waves:** {N} | **Plans:** {M}/{total} complete

| Wave | Plans | Status |
|------|-------|--------|
| 1 | plan-01, plan-02 | Complete |
| CP | plan-03 | Verified |
| 2 | plan-04 | Complete |
| Wave | Plans | Status | Review |
|------|-------|--------|--------|
| 1 | plan-01, plan-02 | Complete | Passed |
| CP | plan-03 | Verified | Passed |
| 2 | plan-04 | Complete | Passed after 1 fix |

### Plan Details
1. **03-01**: [one-liner from SUMMARY.md]
2. **03-02**: [one-liner from SUMMARY.md]

### Wave Reviews
| Wave | Spec Review | Code Review | Fixes Applied |
|------|------------|-------------|---------------|
| 1 | Pass | Pass | 0 |
| 2 | Pass | Pass after fix | 1 |

### Issues Encountered
[Aggregate from SUMMARYs, or "None"]
```

Aggregate task results from all executor agents. For each plan's SUMMARY.md, extract:
- One-liner description
- Deviation count and categories
- Wave review verdicts
- Any deferred issues
</step>

<step name="close_parent_artifacts">
66 changes: 65 additions & 1 deletion templates/workflows/execute-plan.md
@@ -144,13 +144,32 @@ Deviations are normal — handle via rules below.

1. Read @context files from prompt
2. Per task:
- `type="auto"`: if `tdd="true"` → TDD execution. Implement with deviation rules + auth gates. Verify done criteria. Commit (see task_commit). Track hash for Summary.
- `type="auto"`: if `tdd="true"` → TDD execution. Implement with deviation rules + auth gates. Verify done criteria. **Simplify** (see simplify_pass). Re-verify. Commit (see task_commit). Track hash for Summary.
- `type="checkpoint:*"`: STOP → checkpoint_protocol → wait for user → continue only after confirmation.
3. Run `<verification>` checks
4. Confirm `<success_criteria>` met
5. Document deviations in Summary
</step>

<step name="simplify_pass">
## Post-Task Simplification

After each task's implementation passes tests but BEFORE committing, run a simplification pass on the files modified by that task:

1. **Duplication check:** Scan modified files for copy-pasted blocks, near-identical functions, repeated patterns. Extract shared helpers where 3+ lines repeat.
2. **Dead code removal:** Remove unused imports, unreachable branches, commented-out code, unused variables/functions introduced by this task.
3. **Complexity reduction:** Simplify nested conditionals (early returns), flatten callback chains, replace verbose patterns with idiomatic equivalents.

**Rules:**
- Only simplify files touched by the current task — do NOT refactor unrelated code
- Changes must be behavior-preserving (no new features, no bug fixes)
- If no simplification opportunities found, skip — do not force changes
- After applying simplifications, re-run the task's verification to confirm nothing broke
- Track simplifications as part of the task (not as separate deviations)

**Skip if:** The task only modifies config files or documentation, or has fewer than 10 lines of code changes.
</step>
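
The skip rule in this step can be made concrete with a small predicate. This is a sketch only: the template does not define what counts as a config or documentation file, so the extension list below is an assumption, and `TaskChange` and `shouldSkipSimplify` are hypothetical names.

```typescript
interface TaskChange {
  files: string[]; // paths modified by the task
  codeLinesChanged: number; // lines of code changed, however the caller counts them
}

// Assumption: these extensions stand in for "config files or documentation".
const NON_CODE_PATTERNS: RegExp[] = [/\.md$/i, /\.json$/i, /\.ya?ml$/i, /\.toml$/i];

// True when the simplification pass should be skipped for this task.
function shouldSkipSimplify(change: TaskChange): boolean {
  const onlyConfigOrDocs =
    change.files.length > 0 &&
    change.files.every((file) => NON_CODE_PATTERNS.some((pattern) => pattern.test(file)));
  return onlyConfigOrDocs || change.codeLinesChanged < 10;
}
```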

<authentication_gates>

## Authentication Gates
@@ -270,6 +289,51 @@ TASK_COMMITS+=("Task ${TASK_NUM}: ${TASK_COMMIT}")

</task_commit>

<wave_code_review>
## Post-Wave Code Review Gate

After ALL tasks in a wave complete (all committed), run a 2-stage code review on the wave's changes before proceeding to the next wave.

**1. Identify wave changes:**
```bash
# Get the diff for all commits in this wave
WAVE_FIRST_COMMIT=$(echo "${TASK_COMMITS[0]}" | awk '{print $NF}')
git diff ${WAVE_FIRST_COMMIT}^..HEAD --name-only
Comment on lines +300 to +301
Copilot AI Mar 2, 2026

<wave_code_review> says it reviews only the current wave, but WAVE_FIRST_COMMIT is derived from TASK_COMMITS[0]. Since TASK_COMMITS is only ever appended to in this template and never reset per wave, this diff range will start from the very first task of the entire run, not the wave. Consider resetting the commit list at wave start or tracking a per-wave first commit hash explicitly so the review diff is scoped correctly.

Suggested change
- WAVE_FIRST_COMMIT=$(echo "${TASK_COMMITS[0]}" | awk '{print $NF}')
- git diff ${WAVE_FIRST_COMMIT}^..HEAD --name-only
+ # WAVE_FIRST_COMMIT must be set to the first commit hash of this wave at wave start, e.g.:
+ # WAVE_FIRST_COMMIT=$(git rev-parse --short HEAD)
+ git diff "${WAVE_FIRST_COMMIT}^"..HEAD --name-only

```

**2. Stage 1 — Spec Compliance:**
Review each task's implementation against its `<done>` criteria from the plan:
- Are all done criteria actually met (not just claimed)?
- Do implementations match the task specifications?
- Are there gaps between what was specified and what was built?

**On PASS:** Proceed to Stage 2.
**On FAIL:** Fix blocking issues inline, re-run affected task verification, re-commit fixes.

**3. Stage 2 — Code Quality:**
Review the wave's changed files for:
- Consistent naming conventions and code style
- Proper error handling on all new code paths
- No hardcoded values that should be configurable
- No security issues (exposed secrets, injection vectors, missing auth checks)

**On PASS:** Wave complete — proceed to next wave.
**On FAIL:** Fix issues inline, re-verify, re-commit fixes.

**4. Record review verdict in wave notes:**
```
Wave {N} Review: PASS (spec: pass, quality: pass)
```
Or with issues:
```
Wave {N} Review: PASS after fixes (spec: 1 fix, quality: 2 fixes)
```

**Max retries:** 2 per stage. If a stage still fails after 2 retries, flag it in SUMMARY.md under "Wave Review Issues" and continue to the next wave.

**Skip if:** Wave contains only a single documentation or config task.
</wave_code_review>
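
The retry limit in the review gate above can be sketched as a bounded loop. This is one possible reading of "Max retries: 2 per stage", not the PR's implementation: `review` and `fix` are hypothetical callbacks standing in for the agent's review and inline-fix actions, and `runReviewStage` is a hypothetical name.

```typescript
interface StageOutcome {
  passed: boolean; // whether the stage eventually passed
  fixes: number; // how many fix attempts were made
  flagged: boolean; // true when the stage should be listed under "Wave Review Issues"
}

// Run one review stage (spec compliance or code quality) with bounded retries.
function runReviewStage(review: () => boolean, fix: () => void, maxRetries = 2): StageOutcome {
  let fixes = 0;
  let passed = review();
  while (!passed && fixes < maxRetries) {
    fix(); // apply fixes inline, then re-review
    fixes += 1;
    passed = review();
  }
  return { passed, fixes, flagged: !passed };
}
```

A stage that passes on the first review reports zero fixes, which corresponds to the plain `PASS` verdict; a flagged stage does not block the next wave, matching the "continue to next wave" rule.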

<step name="checkpoint_protocol">
On `type="checkpoint:*"`: automate everything possible first. Checkpoints are for verification/decisions only.
