feat(p6): skill layer — orchestration patterns for the builder by garniergeorges · Pull Request #83 · garniergeorges/agent-forge

garniergeorges · 2026-04-27T12:52:26Z

Summary

P6 (early). Adds a skill layer that lets the builder load high-level orchestration patterns on demand instead of stopping mid-intent. Ships one built-in skill — `scaffold-and-run` — that fixes the most visible failure mode today : when the user describes BOTH the agent AND a concrete task in the same message, the builder writes the AGENT.md and stops. With this skill loaded, it emits a `forge:write` and a `forge:run` in the same turn.

How it works

Step	Where
Discover SKILL.md files	`packages/core/src/builder/skill-catalog.ts` reads built-ins from the package + user skills from `~/.agent-forge/skills/`
Advertise to the LLM	The system prompt gains an `AVAILABLE SKILLS` / `SKILLS DISPONIBLES` section with name + description + triggers (bodies excluded to keep tokens low)
LLM activates a skill	Emits a fenced `forge:skill` block with the skill name
CLI loads the body	The catalog resolves the body, the action card lands in Mission Control as DONE, and the body is appended as a system message in the same turn
Next turn	The builder sees the full skill instructions and follows them

Files

core

`src/types/skill-md.ts` — Zod schema + parser for SKILL.md
`src/builder/skill-catalog.ts` — loader (built-in + user, user wins on name collision, sorted)
`src/builder/skills/scaffold-and-run.md` — first built-in
`src/builder/system-prompt.ts` — `SkillCatalogEntry` parameter, FR/EN section
`src/builder/stream.ts` — passes `skills` through

cli

`src/builder-actions.ts` — `forge:skill` block parser, `SkillActionExecution` type, `executeAction(skill, { resolveSkill })`
`src/actions/types.ts` — `SkillAction` (auto-running, no permission)
`src/components/MissionControl.tsx` — `SkillCard` (description + green tick when loaded)
`src/components/CardDetail.tsx` — full body view for skill actions
`src/hooks/useChat.ts` — loads the catalog once, threads it to `streamBuilder` and `executeAction`, injects the resolved body as a system message in the same turn
`src/commands.ts` — `/skills` lists what's available

Tests

SKILL.md schema validation (kebab-case, missing frontmatter, unknown action tag)
Catalog loader (built-in present, sorted)
System prompt (no SKILLS section when empty, FR/EN headers when present)
`forge:skill` parser (`name:` prefix, bare line, kebab-case rejection)
`executeAction(skill)` round-trip + missing skill case

Test plan

`bun install && bun test` passes
`bun run forge` boots, `/skills` lists `scaffold-and-run · built-in`
User says : "crée et lance un agent qui audite un projet TypeScript" → builder emits `forge:skill` then a `forge:write` AND a `forge:run` in the SAME turn
Three cards appear in Mission Control : skill (DONE), write (PROPOSED), run (PROPOSED) ; the user approves write and run in order
Tab-cycling reaches the skill card ; Enter shows its full body in detail
check-attribution CI is green

Out of scope

Skill-to-skill chaining (decided out for P6, possibly later)
Hot-reload skills on filesystem change (`/skills reload` command)
Documenting skill authoring in user-facing docs (P7+)

…lder The builder now has access to a catalog of skills : self-contained behaviour modules that orchestrate multiple actions in a single turn to handle recurring intent patterns. First built-in : scaffold-and-run, which fixes the "user describes both creation and execution but the builder stops after writing AGENT.md" pattern by making the LLM emit a forge:write AND a forge:run in the same turn. Architecture : - SKILL.md format with YAML frontmatter (name, description, triggers, actions) and a markdown body containing the instructions. Mirrors AGENT.md to stay familiar. - Catalog loader discovers skills from two sources : built-ins shipped in packages/core/src/builder/skills/, plus user skills under ~/.agent-forge/skills/. User skills override built-ins on name collision. - System prompt only carries the catalog metadata (name + description + triggers) — bodies stay out of the context until the LLM emits a forge:skill block, then the resolved body is injected as a system message for the next turn. - New ParsedAction kind 'skill' with a forge:skill fenced block parser ; tolerant of either `name: <skill>` or a bare line. - SkillAction joins WriteAction/RunAction in the action store. Skills auto-execute (no permission dialog) and surface as their own card in Mission Control with a "loaded into context" hint. CardDetail renders the description plus the full body so the user can see what the skill actually injects. Other : - /skills slash command lists what's available, source-tagged. - useChat resolves the catalog once via useMemo, threads it to streamBuilder and to executeAction's resolveSkill. Tests : - SKILL.md schema (kebab-case name, missing frontmatter, unknown action tag). - Catalog loader discovers the built-in and sorts entries. - System prompt injects the SKILLS section when entries are provided, FR/EN headers, base prompt unchanged when empty. - forge:skill parser (name: prefix, bare line, kebab-case rejection) and executeAction(skill) round-trip via resolver.

Symptom : the user typed a message that matched a skill trigger ("audite un projet typescript") but the builder still skipped forge:skill and went straight to forge:write. The skill catalog was loaded, the system prompt mentioned it, but Mistral would not act on it. Two issues, both fixed here : 1. Position. The SKILLS section was appended AFTER the base prompt's "BE DECISIVE — propose the AGENT.md immediately" rule. Mistral read the strong push to write first, then a soft "you can also use a skill" at the bottom, and ignored the latter. 2. Framing. The original wording said "choose a skill when ...". Too permissive — small models read that as optional. Replaced with a STEP 0 / ÉTAPE 0 framing : an explicit, mandatory pre-flight check that runs BEFORE any other action. If any trigger phrase matches (case-insensitive substring), the LLM MUST emit a forge:skill block as the only action of that turn ; only then does the rest of the protocol apply. The catalog is now placed at the TOP of the system prompt, before the "be decisive" rule, so the order of reading mirrors the order of execution. Tests updated for the new wording. Triggers are now quoted in the catalog rendering ("audite", "teste") so the LLM sees them as literals rather than running prose.

Mistral Small reads the skill catalog and the STEP 0 instruction in the system prompt, but it does not act on it : it sees "audite a typescript project" and goes straight to a forge:write that collapses both the agent definition and the run mission into one giant AGENT.md body. Adding more rules to the prompt didn't move the needle. Plan B, in three pieces : 1. matchSkillForMessage() — case-insensitive substring match against the trigger phrases declared in each SKILL.md. Lives in core, no LLM involvement. 2. runScaffoldAndRun() — a dedicated runner that drives the skill end to end with TWO narrow LLM calls instead of one wide one : - call A : "produce ONLY the AGENT.md content" (generic role, no session-specific steps in the body) - call B : "produce ONLY the prompt to send to the agent" Each call has a tightly scoped system instruction so the model keeps the two artefacts cleanly separated. Output is parsed server-side, AGENT.md name extracted from the frontmatter. 3. useChat.send() — pre-flight before the normal stream : if the matcher finds a skill, dispatch to the runner. The skill card lands in Mission Control as DONE, then a write card and a run card appear as PROPOSED. The user approves them in order via the existing permission dialog. The system prompt no longer carries the STEP 0 / ÉTAPE 0 mandate. Skills are now an internal mechanism the LLM is informed about but never asked to operate. The catalog metadata stays in the prompt as a short tail note so the LLM understands why a skill card might appear in Mission Control. Tests : - matchSkillForMessage : substring match, no-match, multi-skill precedence (first wins), empty trigger ignored. - system prompt : informational note appears when skills are passed, FR/EN variants, base prompt comes first.

The detail screens (Esc-Tab-Enter on a Mission Control card) used to render every action body through highlightPlain — everything came back as undifferentiated grey, which made long AGENT.md / agent run outputs hard to scan. Replaced by per-shape highlighting : - skill detail : Markdown highlighter (headings, lists, inline code, bold, fenced blocks). The skill body is markdown, so this matches. - write detail : if the file has YAML frontmatter (which AGENT.md always does), split frontmatter and body. The frontmatter goes through the YAML highlighter (already existing), the body through the new Markdown one. Falls back to plain YAML for files without frontmatter. - run detail (and the compact run card in Mission Control) : a new highlightAgentRun() that walks the streamed output and recognises : · fenced ```forge:* blocks (open line orange, body via the matching language highlighter — JSON for forge:bash/write/etc.) · [forge:tool] / [/forge:tool] markers wrapping the result of the previous tool call (rendered dim grey so it visually recedes) · regular prose with inline code spans and bold New helpers in syntax.ts : highlightMarkdown(text) highlightAgentRun(text) highlightYamlLine / highlightJsonLine kept exported for the run highlighter to delegate to. The compact card in MissionControl now also uses highlightAgentRun so the streaming output during a long run reads the same as the detail view, just clipped at maxLines.

When several agents run in a session, the panel was stacking 6+ fully-expanded cards and overflowing the terminal. Two changes : 1. Compact mode by default. Each non-focused card now renders as a single line : badge + verb + target. Borders disappear, the terminal stays calm. The focused card expands to its full preview like before. Cards in 'running' status stay expanded too, so a streaming agent run remains visible without having to Tab to it. 2. Bounded viewport. Mission Control now takes a panelHeight prop (computed by App from the terminal rows minus a Welcome floor and a spacer) and slices the action list to fit. Truncated actions show as "↑ N above / ↓ N below" hints in the panel header. Welcome stays glued to the bottom with flexShrink=0, so the panel is what gives way on small terminals. useCardFocus extended : - scrollTop : action-index offset, advanced via PgUp/PgDn ; - auto-focus the last new arrival when nothing is focused (the user immediately sees what the builder just produced) ; - auto-scroll lower bound : focusing an action above scrollTop bumps scrollTop down to keep it visible. App now routes PgUp/PgDn to the Mission Control scroll when there are actions and the prompt is empty (or a card is focused). It keeps falling back to the chat transcript scroll otherwise. Tab, Shift+Tab, Enter, Esc unchanged.

Status badge moves to P6 done. Roadmap table lifts P4 and P6 to ✅, points to P5 (hardened sandbox + persistent agents + artifact extraction) as the next milestone. Root README (EN/FR) gains : - Native tools section : six-tool table (Bash, FileWrite, FileRead, FileEdit, Grep, Glob) with their tags and limits ; a short note explaining the choice of a text-structured forge:* protocol over OpenAI tool_calls. - Skills section : SKILL.md format, two sources (built-in and ~/.agent-forge/skills/), server-side matcher, two-call runner, scaffold-and-run as the first built-in. - Mission Control keyboard cheatsheet : Tab / Enter / Esc / PgUp/PgDn / Ctrl+E. - /skills slash command listed. - Architecture diagram updated : skill catalog + runner on the host side, /workspace mount + tool loop on the container side, persistence of the workspace dir after exit. - Repo structure shows packages/core/src/builder/skills/, runtime/src/tool-protocol.ts, and the runtime/ subdir under tools-core. Sub-package READMEs realigned : - packages/cli : compact / expanded card mode, scrollable viewport, focus + auto-scroll, detail view, full keyboard map, dispatch skill server-side mention. - packages/core : skill catalog / matcher / runner files listed, scaffold-and-run noted as built-in. - packages/runtime : multi-turn tool loop documented, six forge:* tags, [forge:tool] markers on stdout, FORGE_MAX_TOKENS env var. - packages/tools-core : separate "host tools" and "runtime tools" sections ; six runtime tools with their constraints ; test layout listed.

garniergeorges added 6 commits April 27, 2026 14:51

garniergeorges merged commit 9d22fca into dev Apr 27, 2026
1 check passed

garniergeorges deleted the feat/p6-skills branch April 27, 2026 16:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(p6): skill layer — orchestration patterns for the builder#83

feat(p6): skill layer — orchestration patterns for the builder#83
garniergeorges merged 6 commits into
devfrom
feat/p6-skills

garniergeorges commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

garniergeorges commented Apr 27, 2026

Summary

How it works

Files

core

cli

Tests

Test plan

Out of scope

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant