Summary
A skill that, given a set of work items, drives feature implementation end-to-end — offloading development and code review to sub-agents — and only pauses for input on human-in-the-loop (HITL) tasks or on blockers that can't be resolved without changing the planned spec/architecture.
Motivation (current flow)
I use han to implement small-to-medium features on a pet project. After the planning phase, han produces an implementation plan, but the implementation itself is manual: for each work item I hand-invoke tdd and then code-review.
That per-item loop is great for work units with high uncertainty, but it becomes tedious for long features that already have a good specification. Running tdd and code-review also consumes a lot of context, so I have to compact the session every few work items, which costs extra time.
Proposed behavior
A new skill that takes work items and drives implementation, offloading the heavy lifting (development via tdd, review via code-review) to sub-agents, pausing only for:
- HITL tasks, and
- blockers that require my explicit decision because they can't be resolved without changing the planned feature spec, architecture, or similar.
Prior art (I've reproduced this manually)
I've already reproduced the flow by instructing Claude to implement a set of work units while passing development and code review onto sub-agents. Two things were needed to make it reliable:
- I had to explicitly pass the skill names to the sub-agents, otherwise they wouldn't invoke tdd / code-review.
- I had to give explicit per-task model instructions (in the end I just told it to use opus for everything).
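A minimal sketch of what "passing the skill name and model explicitly" could look like when building a sub-agent request. Everything here is hypothetical and for illustration only: SubagentRequest, build_request, the prompt wording, and the example spec path are not part of han or of any Claude API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubagentRequest:
    """Hypothetical container for what the orchestrator hands to a sub-agent."""
    model: str
    prompt: str


def build_request(skill: str, work_item: str, spec_path: str,
                  model: str = "opus") -> SubagentRequest:
    """Build a request that names the skill and the model explicitly.

    The default model mirrors the "use opus for everything" fallback
    described above; the prompt text itself is an assumption.
    """
    prompt = (
        f"Read the feature spec at {spec_path} before doing anything else.\n"
        f"Work item: {work_item}\n"
        f"You must invoke the {skill} skill for this step."
    )
    return SubagentRequest(model=model, prompt=prompt)


if __name__ == "__main__":
    # Example path and work item are made up for the demo.
    req = build_request("tdd", "Add pagination to the list endpoint",
                        "docs/feature-spec.md")
    print(req.model)
    print(req.prompt)
```

Keeping the skill name and model as explicit parameters means neither is left for the sub-agent to infer, which is the failure mode I hit when doing this by hand.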
Design direction
Where I have a direction, and where I don't:
- How work items are supplied: from plan-work-items. In practice I commit the feature spec and instruct the agents to read it before proceeding.
- Per-work-item completion gate: tests green and code-review reports no findings at severity medium or above. The severity threshold should probably be configurable; pinning down the exact mechanism needs a closer look at the code-review skill.
- How a HITL task is recognized: plan-work-items already classifies each task as AFK or HITL; the skill keys off that.
- How a "blocker requiring an explicit decision" is detected: genuinely open. We'll need to try different approaches and see what works.
- Model-selection policy: to be decided later. Likely an optional skill input with an adaptive default. Worth researching the superpowers skill pack, which does a fairly good job of this.
- Failure handling when a sub-agent's tdd/code-review fails or returns findings: either escalate to me, or run a tdd → code-review → tdd → … loop with a hard cap on the iteration count.
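The completion gate, the AFK/HITL check, and the capped retry loop from the list above could fit together roughly as follows. This is a sketch under stated assumptions, not a proposed implementation: the Severity levels, WorkItem, RoundResult, and the run_round callback (standing in for one tdd plus code-review pass done by sub-agents) are all hypothetical, and the real code-review skill may report findings in a different shape.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Callable, List


class Severity(IntEnum):
    """Assumed severity scale; the real code-review skill may differ."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class WorkItem:
    title: str
    kind: str  # "AFK" or "HITL", as classified by plan-work-items


@dataclass
class RoundResult:
    """Outcome of one hypothetical tdd + code-review pass."""
    tests_green: bool
    findings: List[Severity] = field(default_factory=list)


def gate_passes(result: RoundResult,
                threshold: Severity = Severity.MEDIUM) -> bool:
    """Tests are green and no finding is at or above the threshold."""
    return result.tests_green and all(s < threshold for s in result.findings)


def drive_item(item: WorkItem,
               run_round: Callable[[WorkItem], RoundResult],
               max_rounds: int = 3,
               threshold: Severity = Severity.MEDIUM) -> str:
    """Return "pause" for HITL items, "done" once the gate passes,
    or "escalate" when the hard cap on rounds is reached."""
    if item.kind == "HITL":
        return "pause"
    for _ in range(max_rounds):
        if gate_passes(run_round(item), threshold):
            return "done"
    return "escalate"
```

The threshold and the round cap are plain parameters here so that either could later become a skill input. Detecting a blocker that needs an explicit decision is left out on purpose, since that part is still open.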
Implementation considerations
- Claude workflows could orchestrate this automatically. Caveat: agents spawned via workflows can't use the Agent tool, so skills that themselves rely on sub-agents (notably code-review) will need additional consideration.
Suspected areas
- New skill placement: likely a coding-time skill under han-coding/skills/ (alongside tdd/code-review), or an orchestration skill bridging planning and coding.
- The tdd and code-review skills (han-coding/skills/): the per-item steps dispatched to sub-agents.
- The work-item pipeline: plan-work-items / work-items.md and plan-implementation (han-planning/skills/), the upstream that feeds this skill.
- Sub-agent dispatch and the agent roster (han-core/agents/), including the per-task model selection and explicit-skill-name concerns above.
Ask
Would you be interested in a PR contributing such a skill?