25 commits
06ed99b
feat(skill): scaffold openkb/prompts/ for static system prompts
KylinMountain May 18, 2026
4f1db47
feat(skill): path-scoped IO tools for the skill compile agent
KylinMountain May 18, 2026
fd26e26
feat(skill): kebab-case skill name validation
KylinMountain May 18, 2026
11b52ca
fix(skill): _validate_skill_name now enforces ASCII slug strictly
KylinMountain May 18, 2026
f007a3d
feat(skill): skill-compile system prompt template
KylinMountain May 18, 2026
6bbfb98
feat(skill): build skill-compile agent + run loop
KylinMountain May 18, 2026
9cbc5c6
feat(skill): regenerate per-KB marketplace.json from output/skills/
KylinMountain May 18, 2026
e12dfd2
fix(skill): include owner/author in marketplace.json from git config
KylinMountain May 18, 2026
0fdf0c4
feat(skill): Generator primitive for output/* artifacts
KylinMountain May 18, 2026
3c55703
feat(skill): openkb skill new CLI command
KylinMountain May 18, 2026
e54fe8d
fix(skill): restore strict wiki-content check in skill new
KylinMountain May 18, 2026
44459a1
feat(skill): /skill new slash command + chat write tool extension
KylinMountain May 18, 2026
4e99089
fix(skill): harden /skill new slash + write_kb_file edge cases
KylinMountain May 18, 2026
c961e9a
docs(skill): README + CONTRIBUTING + PR template for skill submissions
KylinMountain May 18, 2026
439d017
fix(skill): align tool names + add safety gates to chat /skill new
KylinMountain May 18, 2026
7467200
fix(skill): lock marketplace naming to openkb@vectify convention
KylinMountain May 18, 2026
5d76d58
refactor(skill): address PR #57 review feedback + rename compiler→cre…
KylinMountain May 18, 2026
dbd78e6
docs(readme): restructure into Wiki Foundation + Generators layers
KylinMountain May 18, 2026
10c19d8
feat(skill): iteration workspace — preserve history, rollback support
KylinMountain May 18, 2026
7066e83
feat(skill): pure-Python structural validator + auto-run on compile
KylinMountain May 18, 2026
6f70d02
feat(skill): trigger-accuracy evaluator (skill eval)
KylinMountain May 18, 2026
a5fe567
docs(readme): document validate / eval / history / rollback commands
KylinMountain May 18, 2026
bc916bc
fix(skill): round-2 review fixes + drop community docs
KylinMountain May 18, 2026
a4e29b1
refactor(skill): consolidate skill modules under openkb/skill/
KylinMountain May 18, 2026
1679c8c
docs(readme): drop duplicate Features section
KylinMountain May 18, 2026
108 changes: 91 additions & 17 deletions README.md
@@ -22,17 +22,7 @@ The idea is based on a [concept](https://x.com/karpathy/status/20398056595256445

Traditional RAG rediscovers knowledge from scratch on every query. Nothing accumulates. OpenKB compiles knowledge once into a persistent wiki, then keeps it current. Cross-references already exist. Contradictions are flagged. Synthesis reflects everything consumed.

### Features

- **Broad format support** — PDF, Word, Markdown, PowerPoint, HTML, Excel, text, and more via markitdown
- **Scale to long documents** — Long and complex documents are handled via [PageIndex](https://github.com/VectifyAI/PageIndex) tree indexing, enabling accurate, vectorless long-context retrieval
- **Native multi-modality** — Retrieves and understands figures, tables, and images, not just text
- **Compiled Wiki** — LLM manages and compiles your documents into summaries, concept pages, and cross-links, all kept in sync
- **Query** — Ask questions (one-off) against your wiki. The LLM navigates your compiled knowledge to answer
- **Interactive Chat** — Multi-turn conversations with persisted sessions you can resume across runs
- **Lint** — Health checks find contradictions, gaps, orphans, and stale content
- **Watch mode** — Drop files into `raw/`, wiki updates automatically
- **Obsidian compatible** — Wiki is plain `.md` files with `[[wikilinks]]`. Open in Obsidian for graph view and browsing
OpenKB has two layers: a **wiki foundation** that compiles and maintains your knowledge, and **generators** (query / chat / Skill Factory) that turn it into useful output. See [Usage](#️-usage) for the full command list.

# 🚀 Getting Started

@@ -80,6 +70,9 @@ openkb query "What are the main findings?"

# 5. Or chat interactively
openkb chat

# 6. Or distill your wiki into a redistributable skill
openkb skill new my-expert "Reason like an expert on <topic-from-your-docs>"
```

### Set up your LLM
@@ -109,7 +102,7 @@ raw/ You drop files here
│ Wiki Compilation (using LLM)
│ │
▼ ▼
wiki/
wiki/ │ ← the foundation
├── index.md Knowledge base overview
├── log.md Operations timeline
├── AGENTS.md Wiki schema (LLM instructions)
@@ -118,6 +111,13 @@ wiki/
├── concepts/ Cross-document synthesis ← the good stuff
├── explorations/ Saved query results
└── reports/ Lint reports
┌──────────────────────┼──────────────────────┐
▼ ▼ ▼
query / chat Skill Factory (future)
(LLM answers from openkb skill new ppt / podcast /
the wiki) → output/skills/ report / …
+ marketplace.json
```

### Short vs. Long Document Handling
@@ -144,15 +144,15 @@ A single source might touch 10-15 wiki pages. Knowledge accumulates: each docume

# ⚙️ Usage

### Commands
OpenKB commands fall into two layers: the **wiki foundation** (compile + manage your knowledge) and **generators** (turn that wiki into useful output).

## 🧱 Wiki Foundation — compile and maintain

| Command | Description |
|---|---|
| `openkb init` | Initialize a new knowledge base (interactive) |
| <code>openkb&nbsp;add&nbsp;&lt;file_or_dir_or_URL&gt;</code> | Add documents and compile to wiki. URL ingest auto-detects PDF (saved as `.pdf` → PageIndex / markitdown) vs HTML (trafilatura main-content extract → `.md`) |
| <code>openkb&nbsp;remove&nbsp;&lt;doc&gt;</code> | Remove a document and clean up its wiki pages, images, registry, and PageIndex state (use `--dry-run` to preview, `--keep-raw` / `--keep-empty-concepts` to retain artifacts) |
| <code>openkb&nbsp;query&nbsp;"question"</code> | Ask a question over the knowledge base (use `--save` to save the answer to `wiki/explorations/`) |
| `openkb chat` | Start an interactive multi-turn chat (use `--resume`, `--list`, `--delete` to manage sessions) |
| `openkb watch` | Watch `raw/` and auto-compile new files |
| `openkb lint` | Run structural + knowledge health checks |
| `openkb list` | List indexed documents and concepts |
@@ -161,11 +161,26 @@ A single source might touch 10-15 wiki pages. Knowledge accumulates: each docume

<!-- | `openkb lint --fix` | Auto-fix what it can | -->

### Interactive Chat
## ✨ Generators — turn the wiki into output

A "generator" reads from the compiled wiki and produces something usable: an answer, a conversation, a skill folder. The wiki is the substrate; generators are the surfaces.

| Command | Output |
|---|---|
| <code>openkb&nbsp;query&nbsp;"question"</code> | A grounded answer with citations (use `--save` to persist to `wiki/explorations/`) |
| `openkb chat` | Interactive multi-turn session over the wiki (use `--resume`, `--list`, `--delete` to manage sessions) |
| <code>openkb&nbsp;skill&nbsp;new&nbsp;&lt;name&gt;&nbsp;"&lt;intent&gt;"</code> | A redistributable Anthropic Skill at `<kb>/output/skills/<name>/` + auto-updated `marketplace.json` |
| <code>openkb&nbsp;skill&nbsp;validate&nbsp;[name]</code> | Structural lint of compiled skills (frontmatter, file sizes, wikilinks, scripts/ stdlib check with `--strict`). Auto-runs at end of `skill new` |
| <code>openkb&nbsp;skill&nbsp;eval&nbsp;&lt;name&gt;</code> | Trigger-accuracy evaluation — does the `description:` field actually fire? LLM generates eval prompts; grader LLM scores activation. `--save` persists the eval set |
| <code>openkb&nbsp;skill&nbsp;history&nbsp;&lt;name&gt;</code> / <code>openkb&nbsp;skill&nbsp;rollback&nbsp;&lt;name&gt;</code> | Iteration workspace — every overwrite saves the previous version to `output/skills/<name>-workspace/iteration-N/` with a structural diff. Rollback restores any iteration |

`openkb chat` opens an interactive chat session over your wiki knowledge base. Unlike the one-shot `openkb query`, each turn carries the conversation history, so you can dig into a topic without re-typing context.
### Query & Chat — ask the wiki

`openkb query "..."` answers a single question. `openkb chat` is interactive: each turn carries the conversation history, so you can dig into a topic without re-typing context. Both read the same underlying wiki and use the same retrieval primitives (PageIndex for long documents, direct concept reads for short ones).

```bash
openkb query "What does the literature say about attention scaling?"

openkb chat # start a new session
openkb chat --resume # resume the most recent session
openkb chat --resume 20260411 # resume by id (unique prefix works)
@@ -179,11 +194,70 @@ Inside a chat, type `/` to access slash commands (Tab to complete):
- `/status` — show knowledge base status
- `/list` — list all documents
- `/add <path>` — add a document or directory without leaving the chat
- `/skill new <name> "<intent>"` — compile a skill from this chat (see below)
- `/save [name]` — export the transcript to `wiki/explorations/`
- `/clear` — start a fresh session (the current one stays on disk)
- `/lint` — run knowledge base lint
- `/exit` — exit (Ctrl-D also works)

### 🛠 Skill Factory — *Drop in a book. Out comes a digital expert.*

The newest generator. `openkb skill new` distills any subset of your wiki into an [Anthropic Skill](https://docs.claude.com/en/docs/build-with-claude/skills) — a portable folder that **Claude Code, Codex CLI, Gemini CLI, and Cursor** all install and load natively. Drop in a book's worth of papers; out comes a specialist that other agents can call on.

```bash
openkb skill new karpathy-thinking \
"Reason about transformers and attention in Karpathy's style"
```

This produces:

```
<kb>/output/skills/karpathy-thinking/
├── SKILL.md # YAML frontmatter + when-to-use + approach
├── references/ # depth material the agent loads on demand
│ ├── methodology.md
│ └── key-quotes.md
└── (scripts/) # optional, only if intent implies computation
```

…plus an auto-updated `<kb>/.claude-plugin/marketplace.json` so the whole KB is one-line installable.
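
To give a sense of what a structural check on `SKILL.md` might look at, here is a minimal sketch. It assumes the frontmatter must carry non-empty `name` and `description` fields. `check_frontmatter` is a hypothetical helper, not the validator behind `openkb skill validate`, and it uses a naive line scan rather than a YAML parser.

```python
def check_frontmatter(text: str) -> list[str]:
    """Return a list of problems found in a SKILL.md frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing opening '---'"]

    # Find the line that closes the frontmatter block.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return ["missing closing '---'"]

    # Naive scan of top-level "key: value" pairs (assumption: flat frontmatter).
    fields = {}
    for line in lines[1:end]:
        if ":" in line and not line.startswith((" ", "\t", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()

    return [f"missing or empty '{k}'" for k in ("name", "description") if not fields.get(k)]
```

An empty list means the two assumed fields are present; anything else is a human-readable list of problems.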

**Install locally:**

```bash
cp -r output/skills/karpathy-thinking ~/.claude/skills/
```

**Share with others** — push your KB to GitHub, then anyone runs:

```bash
npx skills@latest add <your-org>/<your-repo>
```

**Iterate from chat** — compilation is one-shot, but follow-up edits aren't. Inside `openkb chat`, you can refine without re-running the whole pipeline:

```
/skill new karpathy-thinking "Reason about transformers like Karpathy"
[generation streams]
> description is too generic, make it about transformer implementations specifically
[agent edits SKILL.md frontmatter in place]
```

**Quality gates** — borrowing from [Codex skill-creator](https://github.com/openai/skills) (structural validation) and [Anthropic skill-creator](https://github.com/anthropics/skills/tree/main/skills/skill-creator) (trigger-accuracy evals):

```bash
# Lint structure (auto-runs at end of `skill new`)
openkb skill validate karpathy-thinking
openkb skill validate --strict # treat warnings as failures

# Does the description actually fire when it should?
openkb skill eval karpathy-thinking --save

# History + rollback if a new iteration regresses
openkb skill history karpathy-thinking
openkb skill rollback karpathy-thinking --to 2
```
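
As a rough illustration of the iteration workspace, the sketch below copies the current skill folder into the next `iteration-N` slot before an overwrite. `snapshot_skill` is a hypothetical helper, the numbering scheme is an assumption, and the structural diff that OpenKB records alongside each iteration is omitted.

```python
import shutil
from pathlib import Path
from typing import Optional


def snapshot_skill(kb_dir: Path, name: str) -> Optional[Path]:
    """Copy the current skill folder to the next iteration-N slot, if it exists."""
    skill_dir = kb_dir / "output" / "skills" / name
    if not skill_dir.is_dir():
        return None  # nothing to preserve on the first compile

    workspace = kb_dir / "output" / "skills" / f"{name}-workspace"
    workspace.mkdir(parents=True, exist_ok=True)

    # Assumed numbering: one past the highest existing iteration-N directory.
    taken = [
        int(p.name.split("-", 1)[1])
        for p in workspace.glob("iteration-*")
        if p.name.split("-", 1)[1].isdecimal()
    ]
    dest = workspace / f"iteration-{max(taken, default=0) + 1}"
    shutil.copytree(skill_dir, dest)
    return dest
```

Under this scheme a rollback would amount to copying a chosen `iteration-N` folder back over `output/skills/<name>/`.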

### Configuration

Settings are initialized by `openkb init`, and stored in `.openkb/config.yaml`:
80 changes: 76 additions & 4 deletions openkb/agent/chat.py
@@ -23,7 +23,7 @@
from prompt_toolkit.styles import Style

from openkb.agent.chat_session import ChatSession
from openkb.agent.query import MAX_TURNS, build_query_agent
from openkb.agent.query import MAX_TURNS, build_chat_agent
from openkb.log import append_log


@@ -59,6 +59,7 @@
" /list List all documents in the knowledge base\n"
" /lint Lint the knowledge base\n"
" /add <path> Add a document or directory to the knowledge base\n"
' /skill new <name> "<intent>" Compile a skill from the wiki\n'
" /help Show this"
)

@@ -214,6 +215,7 @@ def _bottom_toolbar(session: ChatSession) -> FormattedText:
("/list", "List all documents"),
("/lint", "Lint the knowledge base"),
("/add", "Add a document or directory"),
("/skill", "Compile a skill (try `/skill new <name> \"intent\"`)"),
]


@@ -494,6 +496,73 @@ async def _run_add(arg: str, kb_dir: Path, style: Style) -> None:
await asyncio.to_thread(add_single_file, target, kb_dir)


async def _handle_slash_skill(arg: str, kb_dir: Path, style: Style) -> None:
    """Dispatch ``/skill new <name> "<intent>"`` and any future skill subcommands."""
    import shlex

    try:
        parts = shlex.split(arg) if arg else []
    except ValueError as exc:
        _fmt(style, ("class:error", f"[ERROR] Could not parse: {exc}\n"))
        return
    if not parts:
        _fmt(style, ("class:error", "Usage: /skill new <name> \"<intent>\"\n"))
        return

    sub = parts[0].lower()
    if sub != "new":
        _fmt(style, ("class:error", f"Unknown skill subcommand: {sub}. Try /skill new.\n"))
        return

    if len(parts) < 3:
        _fmt(style, ("class:error", "Usage: /skill new <name> \"<intent>\"\n"))
        return

    name = parts[1]
    intent = " ".join(parts[2:])

    # Use the same safety gates as the CLI (name validation, wiki dir,
    # wiki content). Chat doesn't have a -y flag, so existing skills
    # block with a clear instruction to delete first.
    from openkb.cli import _preflight_skill_new
    err = _preflight_skill_new(kb_dir, name)
    if err:
        _fmt(style, ("class:error", f"[ERROR] {err}\n"))
        return

    target = kb_dir / "output" / "skills" / name
    if target.exists():
        _fmt(style, ("class:error",
                     f"[ERROR] output/skills/{name}/ already exists. Remove it first "
                     f"with `rm -rf output/skills/{name}` and re-run.\n"))
        return

    # Load model from KB config
    from openkb.config import load_config, DEFAULT_CONFIG
    config = load_config(kb_dir / ".openkb" / "config.yaml")
    model = config.get("model", DEFAULT_CONFIG["model"])

    from openkb.skill.generator import Generator
    _fmt(style, ("class:slash.help", f"Compiling skill '{name}'...\n"))
    try:
        gen = Generator(
            target_type="skill",
            name=name,
            intent=intent,
            kb_dir=kb_dir,
            model=model,
        )
        await gen.run()
    except RuntimeError as exc:
        _fmt(style, ("class:error", f"[ERROR] {exc}\n"))
        return

    _fmt(style, ("class:slash.ok", f"Saved: output/skills/{name}/\n"))
    _fmt(style, ("class:slash.help",
                 f"Iterate: ask follow-up questions in this chat and the agent can "
                 f"edit files under output/skills/{name}/ directly.\n"))
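
The argument handling in `_handle_slash_skill` can be exercised in isolation. The following is a minimal sketch, not the project's code: `parse_skill_new` and `SLUG_RE` are hypothetical names, and the kebab-case rule is an assumption based on the commit titles. The real name check sits behind `_preflight_skill_new`, which this diff does not show.

```python
import re
import shlex

# Assumed kebab-case rule: lowercase ASCII letters/digits joined by single hyphens.
SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def parse_skill_new(arg: str) -> tuple[str, str]:
    """Parse the text after ``/skill`` into ``(name, intent)`` or raise ValueError."""
    parts = shlex.split(arg)  # raises ValueError on an unbalanced quote
    if len(parts) < 3 or parts[0].lower() != "new":
        raise ValueError('usage: /skill new <name> "<intent>"')

    # Everything after the name is joined, so an unquoted intent still works.
    name, intent = parts[1], " ".join(parts[2:])
    if not SLUG_RE.fullmatch(name):
        raise ValueError(f"invalid skill name: {name!r}")
    return name, intent
```

Raising `ValueError` for every failure keeps the sketch short; the handler above instead prints a styled error and returns.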


async def _handle_slash(
cmd: str,
kb_dir: Path,
@@ -557,6 +626,10 @@ async def _handle_slash(
        await _run_add(arg, kb_dir, style)
        return None

    if head == "/skill":
        await _handle_slash_skill(arg, kb_dir, style)
        return None

    _fmt(
        style,
        ("class:error", f"Unknown command: {head}. Try /help.\n"),
@@ -579,8 +652,7 @@ async def run_chat(

config = load_config(kb_dir / ".openkb" / "config.yaml")
language = session.language or config.get("language", "en")
wiki_root = str(kb_dir / "wiki")
agent = build_query_agent(wiki_root, session.model, language=language)
agent = build_chat_agent(kb_dir, session.model, language=language)

_print_header(session, kb_dir, style)
if session.turn_count > 0:
Expand Down Expand Up @@ -620,7 +692,7 @@ async def run_chat(
return
if action == "new_session":
session = ChatSession.new(kb_dir, session.model, session.language)
agent = build_query_agent(wiki_root, session.model, language=language)
agent = build_chat_agent(kb_dir, session.model, language=language)
prompt_session = _make_prompt_session(session, style, use_color, kb_dir)
continue

44 changes: 43 additions & 1 deletion openkb/agent/query.py
@@ -6,7 +6,12 @@
from agents import Agent, Runner, function_tool

from agents import ToolOutputImage, ToolOutputText
from openkb.agent.tools import get_wiki_page_content, read_wiki_file, read_wiki_image
from openkb.agent.tools import (
get_wiki_page_content,
read_wiki_file,
read_wiki_image,
write_kb_file,
)

MAX_TURNS = 50
from openkb.schema import get_agents_md
@@ -91,6 +96,43 @@ def get_image(image_path: str) -> ToolOutputImage | ToolOutputText:
)


def build_chat_agent(
    kb_dir: Path,
    model: str,
    language: str = "en",
) -> Agent:
    """Build the chat agent: query agent + a write tool restricted to
    ``<kb>/wiki/explorations/**`` and ``<kb>/output/**``.

    This is the variant used by the interactive ``openkb chat`` REPL so users
    can iterate on generated artifacts (e.g. ``output/skills/<name>/``) via
    natural-language follow-ups without giving the agent unrestricted write
    access to the wiki.
    """
    wiki_root = str(kb_dir / "wiki")
    kb_root = str(kb_dir)
    base = build_query_agent(wiki_root, model, language=language)

    @function_tool
    def write_file(path: str, content: str) -> str:
        """Write a text file under the KB.

        Allowed paths (relative to KB root):
        * ``wiki/explorations/**``: chat-derived notes.
        * ``output/**``: generator artifacts (skills, etc.).

        Any other path is rejected. Parent directories are created.

        Args:
            path: File path relative to KB root
                (e.g. ``"output/skills/demo/SKILL.md"``).
            content: Full text content to write (overwrites if file exists).
        """
        return write_kb_file(path, content, kb_root)

    return base.clone(tools=[*base.tools, write_file])
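
`write_kb_file` itself is not part of this diff, so the following is only a sketch of how a path-scoped write like the one described in the docstring could be enforced. `scoped_write` and `ALLOWED_PREFIXES` are hypothetical names, and the real helper may differ in its checks and return values.

```python
from pathlib import Path

# Hypothetical allow-list mirroring the docstring: wiki/explorations/** and output/**.
ALLOWED_PREFIXES = (("wiki", "explorations"), ("output",))


def scoped_write(path: str, content: str, kb_root: str) -> str:
    """Write ``content`` to ``path`` (relative to ``kb_root``) if it is in scope."""
    root = Path(kb_root).resolve()
    target = (root / path).resolve()  # collapses ".." and follows symlinks

    # Reject anything that escapes the KB root after resolution.
    try:
        rel = target.relative_to(root)
    except ValueError:
        return f"Rejected: {path} is outside the knowledge base."

    # Require the resolved path to sit strictly inside an allowed directory.
    parts = rel.parts
    if not any(len(parts) > len(p) and parts[: len(p)] == p for p in ALLOWED_PREFIXES):
        return f"Rejected: {path} is not under wiki/explorations/ or output/."

    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return f"Wrote {rel.as_posix()}"
```

Resolving before checking is the important step in this sketch: a path such as `output/../wiki/index.md` begins with an allowed prefix as written, but resolves to a location outside it and is rejected.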


async def run_query(
question: str,
kb_dir: Path,