feat(cli): skill factory — openkb skill new + /skill new + auto marketplace.json by KylinMountain · Pull Request #57 · VectifyAI/OpenKB

KylinMountain · 2026-05-18T05:40:00Z

Summary

New openkb skill new <name> "<prompt>" compiles the wiki into an
Anthropic-Skills directory at <kb>/output/skills/<name>/.
New /skill new slash command inside openkb chat — same primitive,
conversational front end, lets users iterate via natural-language
follow-ups (chat agent now has Write access to <kb>/output/** +
<kb>/wiki/explorations/**).
Auto-regenerated <kb>/.claude-plugin/marketplace.json lists all
compiled skills; pushing the KB to GitHub makes npx skills@latest add <owner>/<repo> work end-to-end.
Generator primitive (openkb/generator.py) is architected so future
ppt / podcast / report targets slot in without restructuring.
Community contribution path documented (CONTRIBUTING.md +
skill_submission PR template) — no new CLI surface needed for
submission.

Architecture

Reuses the existing openai-agents SDK + LiteLLM stack from openkb.agent.query.
The skill-compile agent is constructed via build_skill_compile_agent with
a system prompt from openkb/prompts/skill_compile.md interpolated with
the user's intent + skill name + wiki schema. Tools are scoped: read wiki,
query wiki, write only within <kb>/output/skills/<name>/. Marketplace.json
regenerates synchronously after the agent finishes (no LLM call).

Test plan

uv run pytest — full suite green (430+ tests)
CLI command name validation, KB detection, wiki-content gate,
overwrite logic, marketplace.json regen all covered by unit tests
Chat /skill new covered by 3 dedicated tests + write_kb_file
allow-list covered by 9 dedicated tests
Manual acceptance: run openkb skill new <name> "<intent>" -y
against a real KB with compiled wiki content, verify the resulting
output/skills/<name>/SKILL.md activates appropriately in Claude
Code after cp -r output/skills/<name> ~/.claude/skills/. Spec
criteria from docs/superpowers/specs/2026-05-18-skill-factory-design.md
§11 success criteria.
Iteration: in openkb chat, run /skill new, then ask the
agent to tweak the description or a references/ file; verify the
file actually changes on disk.

Spec: docs/superpowers/specs/2026-05-18-skill-factory-design.md (gitignored locally; design notes in PR review)

The previous c.islower() check accepted Unicode lowercase letters like 'é' and 'ü', contradicting the [a-z0-9-] docstring promise. Names are used as filesystem directories and YAML frontmatter, so the slug must stay ASCII. Add explicit a-z / 0-9 range check + 2 regression tests.

The plan-verbatim implementation omitted the 'owner' field at the top level. Some marketplace consumers (Claude Code's /plugin marketplace add) expect 'owner' for listing/attribution. Derive from git config; fall back to 'openkb-user' if git isn't configured. Mirror as plugin-level 'author' to match the existing repo convention.

The plan called for verifying that wiki/concepts/ or wiki/summaries/ has actual files before allowing skill compilation. Earlier impl loosened this to 'any file in wiki/', which silently accepted freshly-init'd KBs (openkb init pre-creates empty concepts/ + summaries/ dirs). Restore the strict check + populate test fixture + add regression test for the freshly-init'd case.

- /skill new: catch shlex.split ValueError on unclosed quotes so a typo doesn't crash the chat REPL - write_kb_file: reject bare directory paths (e.g. 'output') that would otherwise raise IsADirectoryError on write_text - chat.py: drop stale build_query_agent import (chat now uses build_chat_agent) - test_chat_slash_commands.py: update patch target from build_query_agent -> build_chat_agent so the test exercises the right symbol - Add tests/test_write_kb_file.py covering allow-list, traversal, bare-directory rejection

C1: /skill new in chat had only name validation — no wiki-exists check, no wiki-content check, no overwrite guard. The CLI had all three. Extract the gates into _preflight_skill_new and call from both. Add explicit 'remove existing skill first' message in chat (no -y equivalent there). C2: System prompt advertised tool names (list_wiki_dir, read_wiki_file, write_skill_file) that didn't match what was registered with @function_tool (list_wiki, read_wiki, write_skill). LLM saw the registered names; prompt references would confuse it. Rename the wrappers to match. I1: query_wiki was a sync @function_tool calling asyncio.run() on run_query — works only because openai-agents SDK runs sync tools on worker threads. Convert to async @function_tool so the runner awaits it in the same loop, eliminating the nested-asyncio fragility. Add 2 regression tests for the chat safety gates.

The previous impl used the KB directory name as both the marketplace 'name' and the plugin 'name', and stitched together a metadata description by truncating the first skill's SKILL.md description at 200 chars (often mid-word). Lock the convention to match skills/openkb/.claude-plugin/marketplace.json from the official skill: - marketplace name: 'vectify' (always) - plugin name: 'openkb' (always) - description: fixed string, no SKILL.md content injection, no truncation Different KBs are distinguished by <owner>/<repo> URL, not manifest name. Users get one canonical install command (/plugin install openkb@vectify) regardless of which KB they're consuming. Also fix _git_owner to pass cwd=kb_dir so 'openkb --kb-dir ... skill new' run from anywhere reads the KB's git config, not the process CWD.

KylinMountain · 2026-05-18T07:12:39Z

Code review

Found 1 issue:

MaxTurnsExceeded is not caught — agent hitting MAX_TURNS = 80 produces an unhandled traceback instead of a friendly error. MaxTurnsExceeded inherits from AgentsException, not RuntimeError, but both call sites (CLI and chat) only catch RuntimeError. Either widen the catch to Exception (or AgentsException + RuntimeError), or have run_skill_compile translate the SDK exception into a RuntimeError with a clear message.

OpenKB/openkb/agent/skill_compiler.py

Lines 124 to 131 in 7467200

    
           # Single user message kicks off the compile. The system prompt already 
        
           # contains the intent — this just nudges the agent to start. 
        
           seed = ( 
        
               f"Compile the skill '{skill_name}'. Follow the system prompt's " 
        
               f"working method. Read the wiki, then write the skill files." 
        
           ) 
        
           await Runner.run(agent, seed, max_turns=MAX_TURNS)

Call sites that catch only RuntimeError:

OpenKB/openkb/cli.py

Lines 1490 to 1495 in 7467200

    
                   kb_dir=kb_dir, 
        
                   model=model, 
        
               ) 
        
               asyncio.run(gen.run()) 
        
           except RuntimeError as exc: 
        
               click.echo(f"[ERROR] {exc}", err=True)

OpenKB/openkb/agent/chat.py

Lines 540 to 546 in 7467200

    
           # Load model from KB config 
        
           from openkb.config import load_config, DEFAULT_CONFIG 
        
           config = load_config(kb_dir / ".openkb" / "config.yaml") 
        
           model = config.get("model", DEFAULT_CONFIG["model"]) 
        
           from openkb.generator import Generator 
        
           _fmt(style, ("class:slash.help", f"Compiling skill '{name}'...\n"))

Generated with Claude Code

_{- If this code review was useful, please react with 👍. Otherwise, react with 👎.}

…ator - Rename openkb/agent/skill_compiler.py → skill_creator.py (and associated symbols + prompt file). The existing openkb/agent/compiler.py owns 'compile' for raw → wiki; this module generates a skill from compiled wiki content, which is closer to 'create' / 'distill'. Disambiguates the verb in one package. - Translate MaxTurnsExceeded into a friendly RuntimeError inside run_skill_create. Both the CLI and chat call sites only catch RuntimeError; the SDK exception leaked a Python traceback before. - Defer the rmtree-on-overwrite until after _setup_llm_key and load_config succeed. Previously an unset API key would wipe the existing skill output with nothing to replace it. - Fix marketplace.py module docstring: don't claim chat-side SKILL.md edits regenerate the manifest (they don't). - Drop unused yes_flag from _preflight_skill_new + rewrite its docstring to match what the function actually does. - Clean up github-code-quality bot findings: unused pytest imports in 2 test files, unused 'manifest' local in test_marketplace.py (replaced with the assertion the test intended), redundant in-function stdlib imports in openkb/cli.py. Add 2 regression tests: - test_run_skill_create_translates_max_turns_to_runtime_error - test_skill_new_keeps_existing_skill_when_key_setup_fails

Reframe the project mental model: the wiki is the substrate, and several primitives (query, chat, skill new) generate output from it. Add 'Drop in a book. Out comes a digital expert.' slogan to the Skill Factory subsection. - Features list split into 'Wiki foundation' (compile + maintain) and 'Generators' (query / chat / Skill Factory) - Quick Start adds step 6 — distill a skill - Architecture diagram extended to show the wiki branching into query/chat + Skill Factory + future generators (ppt/podcast/…) - Usage section regrouped under 🧱 Wiki Foundation and ✨ Generators - Skill Factory subsection promoted with the slogan as its heading and a concrete example folder tree - Chat slash command list updated to include /skill new

Borrows from Anthropic's skill-creator: each time 'openkb skill new -y' would overwrite an existing skill, the old version is copied to <kb>/output/skills/<name>-workspace/iteration-N/ instead of being destroyed. Iteration numbers monotonically increase. Each iteration carries a diff.md showing description + reference-file delta vs the previous version. New commands: openkb skill history <name> list past iterations openkb skill rollback <name> restore latest iteration openkb skill rollback --to N restore specific iteration Tests cover: iteration numbering, restore-from-N, restore-from-latest, diff content (description / added refs / removed refs).

Borrows from Codex skill-creator's quick_validate.py: catches the common failure modes that would make a skill unloadable before distribution — missing/malformed frontmatter, name/dir mismatch, oversized files, broken references/* wikilinks, non-stdlib imports in scripts/* (strict mode). New CLI: openkb skill validate validate all compiled skills openkb skill validate <name> validate one openkb skill validate --strict treat warnings as failures openkb skill new auto-runs validation after compile and surfaces errors + warnings so the user knows whether the freshly-compiled skill is well-formed. Doesn't block marketplace.json regeneration — the files are on disk and the user can fix or rollback.

Borrows from Anthropic skill-creator's evaluation loop, simplified for v0.3: measure whether a skill's description: field actually fires when it should and stays quiet when it shouldn't. The description is the activation signal other agents read; a vague description silently fails to load when it ought to. Flow: 1. LLM generates N should-trigger + N should-not prompts from the description only 2. Grader LLM scores each: 'should this description activate this skill for this question?' 3. Compare to ground truth, print pass rate + misses New CLI: openkb skill eval <name> run eval (10+10 default) openkb skill eval <name> --save persist prompts to disk openkb skill eval <name> --eval-set X reuse saved prompts openkb skill eval <name> --count N override prompt count Tests mock Runner.run for both generator and grader — no real LLM calls in CI. Saved eval sets live at .openkb/eval-sets/<name>.json for reproducibility.

Skill Factory section now lists the 4 quality-gate commands borrowed from Codex (validate) and Anthropic skill-creator (eval + iteration): openkb skill validate structural lint (frontmatter, sizes, refs) openkb skill eval trigger-accuracy test of the description openkb skill history list past iterations openkb skill rollback restore a previous iteration The slogan promised a 'digital expert' — these commands are what makes the output worth that label.

quanqigu · 2026-05-18T08:30:57Z

Code review (round 2, on commits `7467200`..`a5fe567`)

Found 1 issue:

skill eval re-introduces the MaxTurnsExceeded/JSONDecodeError traceback leak that round 1 caught for skill new — openkb/cli.py:skill_eval only catches RuntimeError, but run_eval calls Runner.run (raises MaxTurnsExceeded, an AgentsException, not RuntimeError) and json.loads on LLM output (raises JSONDecodeError, a ValueError, not RuntimeError). Same bug class fixed in skill_creator.run_skill_create by translating SDK exceptions to RuntimeError; skill_evaluator needs the same shim, or the CLI catch must widen to Exception.

https://github.com/VectifyAI/OpenKB/blob/a5fe567c34a4ed5e1c0c842a6ed35a5d8d6cb01f/openkb/cli.py#L1771-L1781

https://github.com/VectifyAI/OpenKB/blob/a5fe567c34a4ed5e1c0c842a6ed35a5d8d6cb01f/openkb/skill_evaluator.py#L120-L135

Generated with Claude Code

_{- If this code review was useful, please react with 👍. Otherwise, react with 👎.}

Code review round-2 flagged the eval pipeline reintroducing the MaxTurnsExceeded/JSONDecodeError traceback leak that round-1 caught for skill new. Apply the same shim inside skill_evaluator + 4 other carryover items: - Translate MaxTurnsExceeded and json.JSONDecodeError to RuntimeError inside generate_eval_set and grade_one. CLI catch (RuntimeError) now covers both. - Wrap _setup_llm_key in skill_eval with the same try/except/exit pattern as skill_new / query / chat. - Move openkb/skill_evaluator.py -> openkb/agent/skill_evaluator.py. Modules that construct Agent live under openkb/agent/ per repo convention; top-level openkb/ keeps marketplace + generator (no agents SDK). - Validator: reject '<' / '>' in description (Anthropic parser requirement); warn on unknown frontmatter keys (Anthropic spec allows a fixed set). - Drop redundant in-function 'import asyncio' from skill_eval (already at module top). - Drop unused EvalMiss import from tests. - Validator module docstring updated to enumerate all checks. Also delete community contribution scaffolding (CONTRIBUTING.md + .github/PULL_REQUEST_TEMPLATE/skill_submission.md) - premature for the project's current stage; will revisit when real contributors arrive.

All skill-related code now lives in a single openkb/skill/ subpackage (7 modules + __init__). Drop the redundant 'skill_' prefix on filenames since the package qualifies them already. Moves: openkb/generator.py -> openkb/skill/generator.py openkb/marketplace.py -> openkb/skill/marketplace.py openkb/skill_validator.py -> openkb/skill/validator.py openkb/skill_workspace.py -> openkb/skill/workspace.py openkb/agent/skill_creator.py -> openkb/skill/creator.py openkb/agent/skill_tools.py -> openkb/skill/tools.py openkb/agent/skill_evaluator.py -> openkb/skill/evaluator.py generator.py + marketplace.py go under skill/ for now (v0.x only has skills); they're nominally generic primitives but YAGNI -- when a second artifact type (ppt / podcast / report) actually lands, those two will move back out to openkb/<shared>/. No behavioral changes. All imports + test patch targets updated. Test suite stays at 494 passed.

The Features list at the top duplicated everything already covered by Quick Start (step-by-step feature tour) + Usage (Wiki Foundation + Generators tables). It also duplicated the Skill Factory slogan that now lives canonically in the Skill Factory subsection. Replace 17 lines with a one-sentence pointer to Usage. 405 → 389 lines. Slogan now appears exactly once (in the Skill Factory subsection) instead of 5 places. Other duplicates (PageIndex mentions, cross-CLI install instructions, Karpathy comparison table) left for now — they target different audiences (scanners vs deep readers) and aren't worth touching in this pass.

KylinMountain added 14 commits May 18, 2026 12:55

feat(skill): scaffold openkb/prompts/ for static system prompts

06ed99b

feat(skill): path-scoped IO tools for the skill compile agent

4f1db47

feat(skill): kebab-case skill name validation

fd26e26

feat(skill): skill-compile system prompt template

f007a3d

feat(skill): build skill-compile agent + run loop

6bbfb98

feat(skill): regenerate per-KB marketplace.json from output/skills/

9cbc5c6

feat(skill): Generator primitive for output/* artifacts

0fdf0c4

feat(skill): openkb skill new CLI command

3c55703

feat(skill): /skill new slash command + chat write tool extension

44459a1

docs(skill): README + CONTRIBUTING + PR template for skill submissions

c961e9a

github-code-quality Bot found potential problems May 18, 2026

View reviewed changes

Comment thread tests/test_marketplace.py Fixed

Comment thread openkb/cli.py Fixed

Comment thread tests/test_skill_cli.py Fixed

Comment thread tests/test_skill_tools.py Fixed

github-code-quality Bot found potential problems May 18, 2026

View reviewed changes

Comment thread openkb/cli.py Fixed

github-code-quality Bot found potential problems May 18, 2026

View reviewed changes

Comment thread tests/test_marketplace.py Fixed

KylinMountain added 6 commits May 18, 2026 15:33

github-code-quality Bot found potential problems May 18, 2026

View reviewed changes

Comment thread openkb/cli.py Fixed

Comment thread tests/test_skill_evaluator.py Fixed

KylinMountain added 2 commits May 18, 2026 16:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(cli): skill factory — openkb skill new + /skill new + auto marketplace.json#57

feat(cli): skill factory — openkb skill new + /skill new + auto marketplace.json#57
KylinMountain wants to merge 25 commits into
mainfrom
feat/skill-factory

KylinMountain commented May 18, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

KylinMountain commented May 18, 2026

Uh oh!

Uh oh!

Uh oh!

quanqigu commented May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

KylinMountain commented May 18, 2026

Summary

Architecture

Test plan

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

KylinMountain commented May 18, 2026

Code review

Uh oh!

Uh oh!

Uh oh!

quanqigu commented May 18, 2026

Code review (round 2, on commits 7467200..a5fe567)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Code review (round 2, on commits `7467200`..`a5fe567`)