diff --git a/docs/README.skills.md b/docs/README.skills.md
index 830360abb..7f3362c1e 100644
--- a/docs/README.skills.md
+++ b/docs/README.skills.md
@@ -225,6 +225,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to
| [microsoft-skill-creator](../skills/microsoft-skill-creator/SKILL.md)
`gh skills install github/awesome-copilot microsoft-skill-creator` | Create agent skills for Microsoft technologies using Learn MCP tools. Use when users want to create a skill that teaches agents about any Microsoft technology, library, framework, or service (Azure, .NET, M365, VS Code, Bicep, etc.). Investigates topics deeply, then generates a hybrid skill storing essential knowledge locally while enabling dynamic deeper investigation. | `references/skill-templates.md` |
| [migrating-oracle-to-postgres-stored-procedures](../skills/migrating-oracle-to-postgres-stored-procedures/SKILL.md)
`gh skills install github/awesome-copilot migrating-oracle-to-postgres-stored-procedures` | Migrates Oracle PL/SQL stored procedures to PostgreSQL PL/pgSQL. Translates Oracle-specific syntax, preserves method signatures and type-anchored parameters, leverages orafce where appropriate, and applies COLLATE "C" for Oracle-compatible text sorting. Use when converting Oracle stored procedures or functions to PostgreSQL equivalents during a database migration. | None |
| [minecraft-plugin-development](../skills/minecraft-plugin-development/SKILL.md)
`gh skills install github/awesome-copilot minecraft-plugin-development` | Use this skill when building or modifying Minecraft server plugins for Paper, Spigot, or Bukkit, including plugin.yml setup, commands, listeners, schedulers, player state, team or arena systems, persistent progression, economy or profile data, configuration files, Adventure text, and version-safe API usage. Trigger for requests like "build a Minecraft plugin", "add a Paper command", "fix a Bukkit listener", "create plugin.yml", "implement a minigame mechanic", "add a perk or quest system", or "debug server plugin behavior". | `references/bootstrap-registration.md`
`references/build-test-and-runtime-validation.md`
`references/config-data-and-async.md`
`references/maps-heroes-and-feature-modules.md`
`references/minigame-instance-flow.md`
`references/persistent-progression-and-events.md`
`references/project-patterns.md`
`references/state-sessions-and-phases.md` |
+| [mini-context-graph](../skills/mini-context-graph/SKILL.md)
`gh skills install github/awesome-copilot mini-context-graph` | A persistent, compounding knowledge base combining Karpathy's LLM Wiki pattern
with a structured knowledge graph. Ingest documents once — the LLM writes wiki
pages, extracts entities/relations into the graph, and stores raw content for
evidence retrieval. Knowledge accumulates and cross-references; it is never
re-derived from scratch. | `references/ingestion.md`
`references/lint.md`
`references/ontology.md`
`references/retrieval.md`
`scripts/config.py`
`scripts/contextgraph.py`
`scripts/template_agent_workflow.py`
`scripts/tools`
`skill.md` |
| [mkdocs-translations](../skills/mkdocs-translations/SKILL.md)
`gh skills install github/awesome-copilot mkdocs-translations` | Generate a language translation for a mkdocs documentation stack. | None |
| [model-recommendation](../skills/model-recommendation/SKILL.md)
`gh skills install github/awesome-copilot model-recommendation` | Analyze chatmode or prompt files and recommend optimal AI models based on task complexity, required capabilities, and cost-efficiency | None |
| [msstore-cli](../skills/msstore-cli/SKILL.md)
`gh skills install github/awesome-copilot msstore-cli` | Microsoft Store Developer CLI (msstore) for publishing Windows applications to the Microsoft Store. Use when asked to configure Store credentials, list Store apps, check submission status, publish submissions, manage package flights, set up CI/CD for Store publishing, or integrate with Partner Center. Supports Windows App SDK/WinUI, UWP, .NET MAUI, Flutter, Electron, React Native, and PWA applications. | None |
diff --git a/skills/mini-context-graph/references/ingestion.md b/skills/mini-context-graph/references/ingestion.md
new file mode 100644
index 000000000..28ccbbe23
--- /dev/null
+++ b/skills/mini-context-graph/references/ingestion.md
@@ -0,0 +1,196 @@
+# Ingestion Instructions
+
+This file defines how the agent extracts entities and relations from a raw document.
+
+---
+
+## Step 1: Read the Document
+
+Read the provided text carefully. Identify:
+- **Entities**: noun phrases that refer to real-world objects, systems, components, actors, concepts, or events.
+- **Relations**: verb phrases that describe how one entity affects, contains, causes, uses, or is related to another.
+
+---
+
+## Step 2: Extract Entities
+
+For each entity:
+- Record its **name** (normalized: lowercase, strip leading/trailing whitespace)
+- Assign a **type**: a short label (1–3 words) that categorizes the entity
+
+### Entity Type Examples
+
+| Entity Name | Suggested Type |
+|-------------|---------------|
+| Python interpreter | software |
+| memory leak | issue |
+| operating system | system |
+| database | infrastructure |
+| user | actor |
+| API endpoint | interface |
+| server | infrastructure |
+
+**Rules:**
+- Types must be general enough to reuse across documents
+- Do NOT create unique types per entity (e.g., avoid `python-interpreter-type`)
+- Use `ontology.md` normalization rules to canonicalize types
+
+---
+
+## Step 3: Extract Relations
+
+For each pair of entities with an explicit connection in the text:
+- Record the **source** entity name
+- Record the **target** entity name
+- Record the **relation type**: a verb or verb phrase (normalized: lowercase)
+- Assign a **confidence** score between 0 and 1:
+ - 1.0 = stated explicitly ("A causes B")
+ - 0.8 = strongly implied ("A is linked to B")
+ - 0.6 = weakly implied ("A may affect B")
+ - < 0.6 = do NOT include
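The cutoff above can be applied mechanically once relations are extracted. A minimal sketch of that filter, assuming the 0.6 threshold mirrors `MIN_CONFIDENCE` in `config.py` (the sample relations are hypothetical):

```python
MIN_CONFIDENCE = 0.6  # mirrors scripts/config.py

def filter_relations(relations: list[dict]) -> list[dict]:
    """Keep only relations that meet the confidence cutoff."""
    return [r for r in relations if r.get("confidence", 0.0) >= MIN_CONFIDENCE]

relations = [
    {"source": "a", "target": "b", "type": "causes", "confidence": 1.0},
    {"source": "a", "target": "c", "type": "affects", "confidence": 0.5},
]
kept = filter_relations(relations)
# Only the explicit "causes" relation survives; 0.5 falls below the cutoff.
```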
+
+---
+
+## Step 4: Output Format
+
+Produce a JSON object in this exact format:
+
+```json
+{
+ "entities": [
+ { "name": "entity name", "type": "entity type", "supporting_text": "exact quote mentioning this entity" }
+ ],
+ "relations": [
+ {
+ "source": "source entity name",
+ "target": "target entity name",
+ "type": "relation type",
+ "confidence": 0.9,
+ "supporting_text": "exact quote that justifies this relation"
+ }
+ ]
+}
+```
+
+The `supporting_text` field is **required for provenance**. It must be a verbatim or near-verbatim quote from the document that mentions or supports the entity/relation. This is what links graph nodes and edges back to their source.
+
+---
+
+## Rules
+
+- All names and types must be **lowercase**
+- Only include relations where **both entities** are present in the entities list
+- Do NOT invent entities or relations not supported by the text
+- Prefer **reusing existing entity and relation types** from the ontology over creating new ones
+- One entity can appear in multiple relations (as source or target)
+- Always include `supporting_text` — this enables evidence retrieval and audit trails
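These rules can be checked before anything is written to the graph. A hypothetical validator sketch (not part of the skill's tools, shown only to make the rules concrete):

```python
def validate_extraction(output: dict) -> list[str]:
    """Return a list of rule violations found in an extraction result."""
    errors = []
    names = {e["name"] for e in output.get("entities", [])}
    for e in output.get("entities", []):
        # Names must be lowercase and stripped
        if e["name"] != e["name"].strip().lower():
            errors.append(f"entity name not normalized: {e['name']!r}")
        if not e.get("supporting_text"):
            errors.append(f"entity missing supporting_text: {e['name']!r}")
    for r in output.get("relations", []):
        # Both endpoints must appear in the entities list
        if r["source"] not in names or r["target"] not in names:
            errors.append(f"relation references unknown entity: {r['source']} -> {r['target']}")
        if not r.get("supporting_text"):
            errors.append("relation missing supporting_text")
    return errors

sample = {
    "entities": [
        {"name": "memory leak", "type": "issue", "supporting_text": "memory leaks occur"}
    ],
    "relations": [
        {"source": "memory leak", "target": "system crash", "type": "causes",
         "confidence": 1.0, "supporting_text": "crashes due to memory leaks"}
    ],
}
problems = validate_extraction(sample)
# "system crash" is not in the entities list, so one violation is reported.
```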
+
+---
+
+## Step 5: Write Wiki Pages (Required)
+
+After calling `skill.ingest_with_content(...)`, you MUST write wiki pages:
+
+### 5a. Write a summary page for the document
+
+```python
+from scripts.tools import wiki_store
+
+wiki_store.write_page(
+ category="summary",
+ title=f"{title} Summary",
+ content=f"""---
+title: {title}
+source_document: {doc_id}
+tags: [summary]
+---
+
+# {title}
+
+**Source:** {source}
+
+## Key Claims
+
+{chr(10).join(f'- [[{r["source"].replace(" ", "-")}]] {r["type"]} [[{r["target"].replace(" ", "-")}]] (confidence: {r["confidence"]})' for r in relations)}
+
+## Entities
+
+{chr(10).join(f'- [[{e["name"].replace(" ", "-")}]] ({e["type"]})' for e in entities)}
+
+## Open Questions
+
+- (Add questions from reading the document here)
+""",
+ summary=f"Summary of {title}",
+)
+```
+
+### 5b. Write or update entity pages
+
+For each **new** entity not already in the wiki, write an entity page:
+
+```python
+wiki_store.write_page(
+ category="entity",
+ title=entity_name,
+ content=f"""---
+title: {entity_name}
+type: {entity_type}
+source_document: {doc_id}
+tags: [{entity_type}]
+---
+
+# {entity_name}
+
+(Description from the document or prior knowledge.)
+
+## Relations
+
+(List any wikilinks to related entities extracted from relations.)
+
+## Mentioned in
+
+- [[{doc_id}-summary]]
+""",
+ summary=f"{entity_name}: {entity_type}",
+)
+```
+
+For **existing** entity pages, read the current page and append new information, updated relations, or flag contradictions.
+
+---
+
+## Example
+
+**Input document:**
+```
+System crashes due to memory leaks.
+Memory leaks occur when objects are not released.
+```
+
+**Expected extraction output:**
+```json
+{
+ "entities": [
+ { "name": "system crash", "type": "issue", "supporting_text": "system crashes due to memory leaks" },
+ { "name": "memory leak", "type": "issue", "supporting_text": "memory leaks occur when objects are not released" },
+ { "name": "object", "type": "component", "supporting_text": "objects are not released" }
+ ],
+ "relations": [
+ {
+ "source": "memory leak",
+ "target": "system crash",
+ "type": "causes",
+ "confidence": 1.0,
+ "supporting_text": "System crashes due to memory leaks."
+ },
+ {
+ "source": "object",
+ "target": "memory leak",
+ "type": "contributes to",
+ "confidence": 0.9,
+ "supporting_text": "Memory leaks occur when objects are not released."
+ }
+ ]
+}
+```
diff --git a/skills/mini-context-graph/references/lint.md b/skills/mini-context-graph/references/lint.md
new file mode 100644
index 000000000..593f2fd60
--- /dev/null
+++ b/skills/mini-context-graph/references/lint.md
@@ -0,0 +1,163 @@
+# Lint Instructions
+
+This file defines the wiki health-check workflow.
+
+Run this periodically (or after a large batch of ingests) to keep the wiki
+clean and accurate. The pattern is from Karpathy's LLM Wiki: detect contradictions,
+orphans, broken links, stale claims, and data gaps.
+
+---
+
+## When to Run
+
+- After ingesting 5+ documents
+- When the user asks "check the wiki" or "health check"
+- When answers seem inconsistent or contradictory
+- Before a major synthesis or presentation
+
+---
+
+## Step 1: Run the Automated Health Check
+
+```python
+from scripts.tools import wiki_store
+
+issues = wiki_store.lint_wiki()
+# Returns:
+# {
+# "orphan_pages": [list of slugs in files but not in index],
+# "missing_pages": [list of slugs in index but file deleted],
+# "broken_wikilinks": {slug: [broken link targets]},
+# "isolated_pages": [slugs with no wikilinks at all],
+# }
+```
+
+---
+
+## Step 2: Triage Each Issue Type
+
+### Orphan Pages
+Pages exist on disk but are not in the index. They are invisible to search.
+**Fix**: Add them to the index or delete if stale.
+
+```python
+# To add to index, re-write the page (this auto-updates the index):
+wiki_store.write_page(category="...", title="...", content=existing_content)
+
+# To delete (manual step — confirm with user first):
+# rm wiki/{category}/{slug}.md
+```
+
+### Missing Pages
+In the index but the file was deleted. Dangling references.
+**Fix**: Either recreate the page from knowledge or remove from index.
+
+### Broken Wikilinks
+`[[slug]]` references that point to pages that don't exist.
+**Fix**: Create the missing page, or correct the link.
+
+### Isolated Pages
+Pages with no `[[wikilinks]]` — they are unreachable via link traversal.
+**Fix**: Add links from/to related pages.
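Broken-link detection reduces to scanning page bodies for `[[slug]]` targets and checking them against the set of known pages. A standalone sketch of the idea (`wiki_store.lint_wiki()` is the skill's real entry point; the code below is illustrative only):

```python
import re

WIKILINK = re.compile(r"\[\[([^\]]+)\]\]")

def find_broken_wikilinks(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each page slug to the wikilink targets that do not exist."""
    known = set(pages)
    broken = {}
    for slug, body in pages.items():
        missing = [t for t in WIKILINK.findall(body) if t not in known]
        if missing:
            broken[slug] = missing
    return broken

pages = {
    "memory-leak": "Caused by unreleased objects. See [[system-crash]].",
    "system-crash": "Often traced to a [[memory-leak]] or [[disk-failure]].",
}
broken = find_broken_wikilinks(pages)
# "disk-failure" has no page, so it is reported as broken.
```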
+
+---
+
+## Step 3: Check for Contradictions
+
+Read the wiki index and scan for pages that might contradict each other:
+
+```python
+pages = wiki_store.list_pages()
+# Returns [{slug, category, summary, date}, ...]
+```
+
+Look for:
+- Same entity with conflicting `type` in different pages
+- Same relation with different direction in different pages
+- Newer ingests that update/supersede older claims
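The first check, conflicting entity types, can be approximated by grouping pages by title and flagging titles that carry more than one type. A sketch under the assumption that each entity page record exposes `category`, `title`, and `type` fields (the real `list_pages()` index may differ):

```python
from collections import defaultdict

def find_type_conflicts(pages: list[dict]) -> dict[str, set[str]]:
    """Group entity pages by title and report titles with more than one type."""
    types_by_title = defaultdict(set)
    for page in pages:
        if page.get("category") == "entity" and page.get("type"):
            types_by_title[page["title"]].add(page["type"])
    return {t: kinds for t, kinds in types_by_title.items() if len(kinds) > 1}

pages = [
    {"category": "entity", "title": "redis", "type": "storage"},
    {"category": "entity", "title": "redis", "type": "software"},
    {"category": "entity", "title": "nginx", "type": "software"},
]
conflicts = find_type_conflicts(pages)
# "redis" carries two different types, so it is flagged for human review.
```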
+
+**When you find a contradiction:**
+- Add a `## Contradictions` section to the relevant entity/topic pages:
+ ```markdown
+ ## Contradictions
+ - doc_001 says X; doc_003 says not-X — unresolved
+ ```
+- Flag it in the log:
+ ```python
+ # Handled by wiki_store.write_page which auto-appends to log.md
+ ```
+
+---
+
+## Step 4: Check for Stale Claims
+
+Review pages ingested more than N days ago (use the `date` field from the index).
+Ask: "Has any newer document superseded this claim?"
+
+**When a claim is stale:**
+- Update the page: add a `## Superseded` section or update the body.
+- Mark the old claim with _(superseded by [[newer-doc-summary]])_.
+
+---
+
+## Step 5: Check for Missing Cross-References
+
+For each entity page, check: does it link back to all summary pages that mention it?
+For each summary page, check: does it link to all entity pages it extracted?
+
+**Fix**: Read the page and add missing `[[slug]]` links.
+
+---
+
+## Step 6: Identify Data Gaps
+
+Review entity pages that lack:
+- A proper description (just a stub)
+- Any `## Relations` section
+- Any `## Mentioned in` links
+
+These are candidates for deeper research or new ingests.
+
+---
+
+## Step 7: Log the Lint Pass
+
+```python
+# wiki_store.write_page automatically logs the activity.
+# For a manual lint summary, append to log.md via write_page on a topic:
+wiki_store.write_page(
+ category="topic",
+ title="Lint Pass YYYY-MM-DD",
+ content="# Lint Pass\n\n## Issues Found\n\n...\n\n## Fixed\n\n...",
+ summary="Lint pass results",
+)
+```
+
+---
+
+## Quick Lint Commands
+
+```python
+from scripts.tools import wiki_store
+
+# Full health check
+issues = wiki_store.lint_wiki()
+
+# Get recent history
+log = wiki_store.get_log(last_n=10)
+
+# List all pages
+all_pages = wiki_store.list_pages()
+
+# Search for a concept across wiki
+results = wiki_store.search_wiki("memory leak")
+```
+
+---
+
+## Rules
+
+- NEVER delete pages without user confirmation
+- NEVER auto-resolve a contradiction — flag it for human review
+- File all lint results as a topic page in the wiki (so the history is visible)
+- Prefer adding cross-references over rewriting existing content
diff --git a/skills/mini-context-graph/references/ontology.md b/skills/mini-context-graph/references/ontology.md
new file mode 100644
index 000000000..0b0574c04
--- /dev/null
+++ b/skills/mini-context-graph/references/ontology.md
@@ -0,0 +1,99 @@
+# Ontology Instructions
+
+This file defines the rules for maintaining and evolving the dynamic ontology used by the Context Graph.
+
+---
+
+## Core Principle
+
+The ontology is **NOT fixed**. Types and relations emerge from documents as they are ingested.
+However, the ontology must remain **compact, consistent, and reusable**.
+
+---
+
+## Entity Type Rules
+
+### Normalization
+
+When assigning an entity type, apply these transformations:
+1. Convert to **lowercase**
+2. Strip leading/trailing whitespace
+3. Replace underscores and hyphens with spaces
+4. Merge synonymous types using the mapping table below
+
+### Synonym Mapping (Entity Types)
+
+| Variant | Canonical Type |
+|---------|---------------|
+| component, module, class, function | component |
+| bug, defect, fault, error, failure | issue |
+| server, host, machine, node | infrastructure |
+| user, person, operator, admin, actor | actor |
+| app, application, service, program, software | software |
+| database, datastore, db, storage | storage |
+| api, endpoint, interface, connection | interface |
+| event, incident, occurrence, trigger | event |
+| concept, idea, principle, theory | concept |
+| process, thread, task, job, workflow | process |
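The four normalization steps can be sketched as a pure function. The real canonicalization lives in `ontology_store.normalize_type`; the synonym table here is abbreviated from the one above:

```python
ENTITY_SYNONYMS = {  # abbreviated from the table above
    "bug": "issue", "defect": "issue", "error": "issue", "failure": "issue",
    "server": "infrastructure", "host": "infrastructure", "node": "infrastructure",
    "app": "software", "application": "software", "service": "software",
}

def normalize_entity_type(raw: str) -> str:
    """Lowercase, trim, replace separators with spaces, then map synonyms."""
    t = raw.strip().lower().replace("_", " ").replace("-", " ")
    return ENTITY_SYNONYMS.get(t, t)

print(normalize_entity_type("  Defect "))          # issue
print(normalize_entity_type("App"))                # software
print(normalize_entity_type("python_interpreter")) # python interpreter (no mapping: kept as-is)
```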
+
+### Adding New Types
+
+If an entity does not match any existing type:
+- Create a **new type** if it is genuinely distinct
+- Keep the label short (1–3 words, lowercase)
+- Consider whether an existing type is close enough before creating a new one
+
+### Constraint
+
+- Maximum ~50 distinct entity types across the entire ontology
+- If the limit is approached, merge similar types rather than creating new ones
+
+---
+
+## Relation Type Rules
+
+### Normalization
+
+When assigning a relation type:
+1. Convert to **lowercase**
+2. Strip whitespace
+3. Use verb phrases in **present tense** (e.g., "causes", "contains", "uses")
+4. Merge synonyms using the mapping table below
+
+### Synonym Mapping (Relation Types)
+
+| Variant | Canonical Relation |
+|---------|-------------------|
+| triggers, leads to, results in, produces | causes |
+| is part of, belongs to, lives in, sits in | contains |
+| depends on, requires, needs | depends on |
+| uses, calls, invokes, consumes | uses |
+| affects, impacts, influences | affects |
+| creates, instantiates, spawns | creates |
+| connects to, links to, references | connects to |
+| inherits from, extends, subclasses | extends |
+| reads from, queries, fetches | reads from |
+| writes to, stores in, persists to | writes to |
+
+### Adding New Relations
+
+- Only add new relation types if no existing type accurately describes the relationship
+- Prefer canonical relations over creating new ones
+
+---
+
+## Ontology Update Protocol
+
+When processing extracted entities/relations from `ingestion.md`:
+
+1. For each entity type:
+ - Run through the synonym mapping
+ - Call `ontology_store.normalize_type(type_name)` to get the canonical form
+ - Call `ontology_store.add_type(canonical_type)` to register it
+
+2. For each relation type:
+ - Run through the synonym mapping
+ - Call `ontology_store.normalize_relation(relation_name)` to get the canonical form
+ - Call `ontology_store.add_relation(canonical_relation)` to register it
+
+3. Use the **canonical** type/relation names when creating nodes and edges in the graph.
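Assuming `ontology_store` exposes `normalize_type`/`add_type` and `normalize_relation`/`add_relation` as described, the update protocol is a straight loop. A sketch using an in-memory stand-in for the store (the stub and its tiny synonym maps are illustrative, not the skill's real data):

```python
class StubOntologyStore:
    """In-memory stand-in for ontology_store, for illustration only."""
    def __init__(self):
        self.types: set[str] = set()
        self.relations: set[str] = set()
        self.type_synonyms = {"bug": "issue", "defect": "issue"}
        self.relation_synonyms = {"triggers": "causes", "leads to": "causes"}

    def normalize_type(self, name: str) -> str:
        n = name.strip().lower()
        return self.type_synonyms.get(n, n)

    def add_type(self, name: str) -> None:
        self.types.add(name)

    def normalize_relation(self, name: str) -> str:
        n = name.strip().lower()
        return self.relation_synonyms.get(n, n)

    def add_relation(self, name: str) -> None:
        self.relations.add(name)

store = StubOntologyStore()
for entity_type in ["Bug", "defect", "software"]:
    store.add_type(store.normalize_type(entity_type))
for relation_type in ["triggers", "leads to", "uses"]:
    store.add_relation(store.normalize_relation(relation_type))

# Synonyms collapse: three raw types become two, three raw relations become two.
print(store.types)      # {'issue', 'software'}
print(store.relations)  # {'causes', 'uses'}
```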
diff --git a/skills/mini-context-graph/references/retrieval.md b/skills/mini-context-graph/references/retrieval.md
new file mode 100644
index 000000000..4213df981
--- /dev/null
+++ b/skills/mini-context-graph/references/retrieval.md
@@ -0,0 +1,163 @@
+# Retrieval Instructions
+
+This file defines how the agent answers queries using the two-layer retrieval strategy:
+**wiki-first** (fast path), then **graph traversal with evidence** (deep path).
+
+---
+
+## Overview
+
+Retrieval is a 7-step process:
+
+1. Parse the query
+2. **Check the wiki first** (fast path)
+3. Find seed nodes in the graph
+4. Expand the graph via BFS
+5. Prune noisy nodes
+6. Build the subgraph with provenance
+7. Return structured context
+
+---
+
+## Step 1: Parse the Query
+
+Read the query string and identify:
+- **Key noun phrases**: potential entity names (e.g., "system crash", "memory leak")
+- **Keywords**: individual meaningful words (e.g., "crash", "leak", "memory")
+- Normalize all terms to **lowercase**
+
+Ignore stopwords (e.g., "the", "a", "is", "why", "does", "how", "what").
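Noun-phrase recognition is LLM reasoning, but the keyword and stopword filtering can be sketched deterministically (the stopword set below is illustrative, not exhaustive):

```python
STOPWORDS = {"the", "a", "an", "is", "are", "why", "does", "how", "what", "of", "to"}

def parse_query(query: str) -> list[str]:
    """Lowercase the query, strip punctuation, and drop stopwords."""
    tokens = [w.strip("?.,!") for w in query.lower().split()]
    return [t for t in tokens if t and t not in STOPWORDS]

print(parse_query("Why does the system crash?"))  # ['system', 'crash']
```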
+
+---
+
+## Step 2: Check the Wiki First (Fast Path)
+
+Before touching the graph, search the wiki. The wiki contains compiled knowledge —
+cross-references already resolved, contradictions flagged, syntheses written.
+
+```python
+from scripts.tools import wiki_store
+
+results = wiki_store.search_wiki(query)
+```
+
+For each relevant result, read the page:
+
+```python
+content = wiki_store.read_page_by_slug(result["slug"])
+```
+
+**If the wiki has a sufficient answer:**
+- Synthesize from wiki pages.
+- Cite the source pages (e.g., "According to [[memory-leak]] and [[system-crash]]...").
+- File the answer as a new wiki topic page if it's valuable and not already captured:
+ ```python
+ wiki_store.write_page(category="topic", title="Why System Crashes", content=..., summary=...)
+ ```
+- **Return early** — no graph traversal needed.
+
+**If the wiki answer is incomplete or missing:** proceed to Step 3.
+
+---
+
+## Step 3: Find Seed Nodes
+
+Call `index_store.search(query)` with the original query string.
+
+This returns node IDs matching entity names or keywords.
+
+If no seed nodes are found:
+- Try searching with individual keywords from Step 1.
+- If still no results, return an empty subgraph: "No relevant entities found."
+
+---
+
+## Step 4: Expand the Graph (BFS)
+
+Call `retrieval_engine.retrieve(seed_node_ids, depth=2)`.
+
+BFS from seed nodes:
+- **Depth 1**: direct neighbors
+- **Depth 2**: neighbors of neighbors
+
+Rules:
+- Only traverse edges with confidence ≥ MIN_CONFIDENCE (from config.py)
+- Do NOT traverse beyond depth 2
+- Collect all visited node IDs
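The traversal itself is handled by `retrieval_engine.retrieve`; the sketch below shows the same depth-limited, confidence-filtered BFS in isolation. Treating edges as undirected for traversal is an assumption of this sketch:

```python
from collections import deque

def bfs_expand(seeds, edges, max_depth=2, min_confidence=0.6):
    """Depth-limited BFS over confidence-filtered edges.

    edges: list of (source_id, target_id, confidence) tuples.
    Returns {node_id: depth} for every node reached within max_depth.
    """
    # Build an adjacency map over edges that meet the confidence cutoff
    adjacency = {}
    for src, tgt, conf in edges:
        if conf >= min_confidence:
            adjacency.setdefault(src, set()).add(tgt)
            adjacency.setdefault(tgt, set()).add(src)

    visited = {node: 0 for node in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if visited[node] >= max_depth:
            continue  # never traverse beyond the depth limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visited:
                visited[neighbor] = visited[node] + 1
                queue.append(neighbor)
    return visited

edges = [("leak", "crash", 1.0), ("crash", "restart", 0.8), ("restart", "pager", 0.9)]
depths = bfs_expand(["leak"], edges)
# "pager" sits at depth 3, so it is never reached.
print(depths)  # {'leak': 0, 'crash': 1, 'restart': 2}
```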
+
+---
+
+## Step 5: Prune Nodes
+
+- Limit total nodes to MAX_NODES (from config.py)
+- Prioritize:
+ 1. Seed nodes (always include)
+ 2. Nodes at depth 1
+ 3. Nodes at depth 2 (as space allows)
+- Remove nodes only weakly connected (edge confidence < MIN_CONFIDENCE)
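The priority order above amounts to sorting reached nodes by depth and truncating. A sketch, assuming a `{node_id: depth}` map as produced by the BFS step:

```python
def prune_nodes(depths: dict[str, int], max_nodes: int = 50) -> set[str]:
    """Keep seeds (depth 0) first, then depth-1, then depth-2 nodes, up to max_nodes."""
    ordered = sorted(depths, key=lambda n: depths[n])  # shallower nodes first
    return set(ordered[:max_nodes])

depths = {"leak": 0, "crash": 1, "restart": 2, "oom": 1}
kept = prune_nodes(depths, max_nodes=3)
# The seed and both depth-1 nodes fit; the depth-2 node is dropped.
```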
+
+---
+
+## Step 6: Build the Subgraph with Provenance
+
+For a standard query, call:
+
+```python
+subgraph = skill.query(query)
+# Returns: {"nodes": {node_id: {name, type, source_document, source_chunks}},
+# "edges": [{source, target, type, confidence, source_document, supporting_text, chunk_id}]}
+```
+
+For queries requiring evidence (citations, fact-checking), call:
+
+```python
+result = skill.query_with_evidence(query)
+# Returns:
+# {
+# "query": str,
+# "subgraph": {"nodes": {...}, "edges": [...]},
+# "supporting_documents": [
+# {
+# "doc_id": str,
+# "doc_title": str,
+# "supporting_chunks": [{"chunk_id": str, "text": str}, ...]
+# }
+# ],
+# "evidence_chain": "memory leak --[causes]--> system crash"
+# }
+```
+
+---
+
+## Step 7: Return Structured Context
+
+Return the result with:
+- **Subgraph**: nodes + edges (the graph answer)
+- **Supporting documents**: source chunks that prove each relation
+- **Evidence chain**: human-readable path summary
+- **Wiki references**: links to relevant wiki pages found in Step 2
+
+**If valuable, file the answer back into the wiki:**
+
+```python
+wiki_store.write_page(
+ category="topic",
+ title=query,
+ content=f"# {query}\n\n**Evidence chain:** {result['evidence_chain']}\n\n...",
+ summary="...",
+)
+```
+
+This way, future queries on the same topic find the answer instantly in the wiki.
+
+---
+
+## Rules
+
+- NEVER fabricate nodes or edges not present in the graph
+- NEVER traverse deeper than depth 2
+- ALWAYS check the wiki before the graph (wiki-first)
+- Always include seed nodes in the result, even if they have no edges
+- Prefer edges with higher confidence when pruning
+- File valuable answers back into the wiki as topic pages
+- Return an empty subgraph (not an error) if no relevant nodes are found
diff --git a/skills/mini-context-graph/scripts/config.py b/skills/mini-context-graph/scripts/config.py
new file mode 100644
index 000000000..5c246c8f9
--- /dev/null
+++ b/skills/mini-context-graph/scripts/config.py
@@ -0,0 +1,23 @@
+"""
+config.py — Global configuration constants for the Context Graph Skill.
+
+Data directories are resolved from environment variables so the skill can be
+used from any project without writing data inside the skill package itself.
+
+ MINI_CONTEXT_GRAPH_DATA_DIR — where graph.json, index.json, etc. live
+ MINI_CONTEXT_GRAPH_WIKI_DIR — where wiki pages, index.md, and log.md live
+
+Both default to subdirectories of the current working directory when the env
+vars are not set, so data ends up in the consuming project's directory.
+"""
+
+import os
+from pathlib import Path
+
+_BASE = Path(os.environ.get("MINI_CONTEXT_GRAPH_BASE", str(Path.cwd())))
+DATA_DIR = Path(os.environ.get("MINI_CONTEXT_GRAPH_DATA_DIR", str(_BASE / "data")))
+WIKI_DIR = Path(os.environ.get("MINI_CONTEXT_GRAPH_WIKI_DIR", str(_BASE / "wiki")))
+
+MAX_GRAPH_DEPTH: int = 2
+MIN_CONFIDENCE: float = 0.6
+MAX_NODES: int = 50
diff --git a/skills/mini-context-graph/scripts/contextgraph.py b/skills/mini-context-graph/scripts/contextgraph.py
new file mode 100644
index 000000000..00c716546
--- /dev/null
+++ b/skills/mini-context-graph/scripts/contextgraph.py
@@ -0,0 +1,296 @@
+"""
+contextgraph.py — Main interface for the Context Graph Skill.
+
+This file is orchestration-only. All LLM reasoning lives in the .md files.
+Python here only wires together the deterministic storage and retrieval tools.
+
+Agent usage:
+- ingest(): agent reads ingestion.md + ontology.md, extracts entities/relations,
+ then calls the tool methods directly.
+- query(): agent reads retrieval.md, calls index_store.search + retrieval_engine.retrieve,
+ then calls graph_store.get_subgraph and returns the result.
+"""
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).parent))
+
+import config
+from tools import graph_store, index_store, ontology_store, retrieval_engine, documents_store
+
+
+class ContextGraphSkill:
+
+ def ingest(self, documents: list[str]) -> None:
+ """
+ Orchestration entry point for ingesting documents into the context graph.
+
+ The agent (Copilot) MUST:
+ 1. Read ingestion.md to understand entity/relation extraction rules.
+ 2. Read ontology.md to apply type normalization.
+ 3. For each document, produce a JSON with entities + relations.
+ 4. For each entity:
+ - ontology_store.add_type(entity["type"])
+ - node_id = graph_store.add_node(entity["name"], entity["type"])
+ - index_store.add_entity(entity["name"], node_id)
+ 5. For each relation (if confidence >= MIN_CONFIDENCE):
+ - ontology_store.add_relation(relation["type"])
+ - source_id = graph_store.find_node_by_name(relation["source"])
+ - target_id = graph_store.find_node_by_name(relation["target"])
+ - graph_store.add_edge(source_id, target_id, relation["type"], relation["confidence"])
+
+ This method does NOT call any LLM. It documents the agent contract only.
+ """
+ raise NotImplementedError(
+ "ingest() must be driven by the Copilot agent following ingestion.md. "
+ "Call the tool methods directly after LLM extraction."
+ )
+
+ def query(self, query: str) -> dict:
+ """
+ Orchestration entry point for retrieving a subgraph for a query.
+
+ The agent (Copilot) MUST:
+ 1. Read retrieval.md to understand the retrieval strategy.
+ 2. Call index_store.search(query) to get seed node_ids.
+ 3. Call retrieval_engine.retrieve(seed_ids, depth=MAX_GRAPH_DEPTH) to expand.
+ 4. Call graph_store.get_subgraph(node_ids) to build the result.
+ 5. Return the subgraph dict.
+
+        This method does NOT call any LLM: steps 2-4 are deterministic and are
+        implemented directly below. Returns an empty subgraph when no seed
+        nodes match the query.
+ """
+ seed_ids = index_store.search(query)
+ if not seed_ids:
+ return {"nodes": {}, "edges": []}
+
+ node_ids = retrieval_engine.retrieve(
+ seed_ids,
+ depth=config.MAX_GRAPH_DEPTH,
+ min_confidence=config.MIN_CONFIDENCE,
+ max_nodes=config.MAX_NODES,
+ )
+ return graph_store.get_subgraph(node_ids)
+
+ # ------------------------------------------------------------------
+ # Convenience wrappers — agents may call these directly
+ # ------------------------------------------------------------------
+
+ def add_node(self, name: str, node_type: str) -> str:
+ """Add a node to the graph and index. Returns node_id."""
+ canonical_type = ontology_store.normalize_type(node_type)
+ ontology_store.add_type(canonical_type)
+ node_id = graph_store.add_node(name, canonical_type)
+ index_store.add_entity(name, node_id)
+ return node_id
+
+ def add_edge(
+ self, source_name: str, target_name: str, relation: str, confidence: float
+ ) -> None:
+ """Add an edge between two nodes (by name) if both exist and confidence qualifies."""
+ if confidence < config.MIN_CONFIDENCE:
+ return
+
+ source_id = graph_store.find_node_by_name(source_name)
+ target_id = graph_store.find_node_by_name(target_name)
+ if source_id is None or target_id is None:
+ return
+
+ canonical_relation = ontology_store.normalize_relation(relation)
+ ontology_store.add_relation(canonical_relation)
+ graph_store.add_edge(source_id, target_id, canonical_relation, confidence)
+
+ # ------------------------------------------------------------------
+ # LLM Wiki + RAG methods — store raw content & provenance
+ # ------------------------------------------------------------------
+
+ def ingest_with_content(
+ self,
+ doc_id: str,
+ title: str,
+ source: str,
+ raw_content: str,
+ entities: list[dict],
+ relations: list[dict],
+ ) -> dict:
+ """
+ Full RAG ingestion: stores raw document + chunks, then wires provenance
+ links from each graph node/edge back to source chunks.
+
+ The agent MUST:
+ 1. Read the raw_content.
+ 2. Read ingestion.md and ontology.md for extraction rules.
+ 3. Extract entities and relations (LLM reasoning step).
+ 4. Call this method with the results.
+
+ Args:
+ doc_id: Stable document identifier (e.g. "doc_001").
+ title: Human-readable document title.
+ source: Origin path or URL (immutable, never modified).
+ raw_content: Full text of the document.
+ entities: List of dicts: [{name, type, supporting_text?}, ...]
+ relations: List of dicts: [{source, target, type, confidence,
+ supporting_text?, chunk_hint?}, ...]
+
+ Returns:
+ Summary dict: {doc_id, chunk_count, nodes_added, edges_added}
+ """
+ # Step 1: Store raw document and auto-chunk
+ doc = documents_store.add_document(doc_id, title, source, raw_content)
+ chunks = doc["chunks"]
+
+        def _find_best_chunk(text: str) -> str | None:
+            """Find the chunk whose text most overlaps with the given span."""
+            if not text or not chunks:
+                return None
+            text_lower = text.lower()
+            words_text = set(text_lower.split())  # computed once, reused per chunk
+            best_chunk_id = None
+            best_score = 0
+            for chunk in chunks:
+                if text_lower in chunk["text"].lower():
+                    return chunk["chunk_id"]
+                # Fallback: count overlapping words
+                words_chunk = set(chunk["text"].lower().split())
+                score = len(words_text & words_chunk)
+                if score > best_score:
+                    best_score = score
+                    best_chunk_id = chunk["chunk_id"]
+            return best_chunk_id
+
+ nodes_added = 0
+ # Step 2: Ingest entities with provenance
+ for entity in entities:
+ supporting = entity.get("supporting_text", "")
+ chunk_id = _find_best_chunk(supporting)
+ chunk_ids = [chunk_id] if chunk_id else []
+
+ canonical_type = ontology_store.normalize_type(entity["type"])
+ ontology_store.add_type(canonical_type)
+ node_id = graph_store.add_node(
+ entity["name"],
+ canonical_type,
+ source_document=doc_id,
+ source_chunks=chunk_ids,
+ )
+ index_store.add_entity(entity["name"], node_id)
+ nodes_added += 1
+
+ edges_added = 0
+ # Step 3: Ingest relations with provenance
+ for rel in relations:
+ if rel.get("confidence", 0) < config.MIN_CONFIDENCE:
+ continue
+
+ supporting = rel.get("supporting_text", "")
+ chunk_id = _find_best_chunk(supporting) or rel.get("chunk_hint")
+
+ source_id = graph_store.find_node_by_name(rel["source"])
+ target_id = graph_store.find_node_by_name(rel["target"])
+ if source_id is None or target_id is None:
+ continue
+
+ canonical_relation = ontology_store.normalize_relation(rel["type"])
+ ontology_store.add_relation(canonical_relation)
+ graph_store.add_edge(
+ source_id,
+ target_id,
+ canonical_relation,
+ rel["confidence"],
+ source_document=doc_id,
+ supporting_text=supporting or None,
+ chunk_id=chunk_id,
+ )
+ edges_added += 1
+
+ return {
+ "doc_id": doc_id,
+ "chunk_count": len(chunks),
+ "nodes_added": nodes_added,
+ "edges_added": edges_added,
+ }
+
+ def query_with_evidence(self, query: str) -> dict:
+ """
+ Query the graph and return the subgraph together with supporting
+ source documents and chunks (evidence chain).
+
+ Returns:
+ {
+ "query": str,
+ "subgraph": {"nodes": {...}, "edges": [...]},
+ "supporting_documents": [
+ {
+ "doc_id": str,
+ "doc_title": str,
+ "supporting_chunks": [{"chunk_id": str, "text": str}, ...]
+ }
+ ],
+ "evidence_chain": str # human-readable summary path
+ }
+ """
+ subgraph = self.query(query)
+ if not subgraph["nodes"]:
+ return {
+ "query": query,
+ "subgraph": subgraph,
+ "supporting_documents": [],
+ "evidence_chain": "No matching nodes found.",
+ }
+
+ # Collect all provenance pointers from nodes and edges
+ docs_chunks: dict[str, list[str]] = {} # doc_id -> [chunk_ids]
+
+ for node in subgraph["nodes"].values():
+ doc_id = node.get("source_document")
+ if doc_id:
+ docs_chunks.setdefault(doc_id, [])
+ docs_chunks[doc_id].extend(node.get("source_chunks") or [])
+
+ for edge in subgraph["edges"]:
+ doc_id = edge.get("source_document")
+ if doc_id:
+ docs_chunks.setdefault(doc_id, [])
+ if edge.get("chunk_id"):
+ docs_chunks[doc_id].append(edge["chunk_id"])
+
+ # Resolve chunk texts from documents_store
+ supporting_documents = []
+ for doc_id, chunk_ids in docs_chunks.items():
+ doc = documents_store.get_document(doc_id)
+ if doc is None:
+ continue
+ seen = set()
+ chunks_out = []
+ for cid in chunk_ids:
+ if cid in seen:
+ continue
+ seen.add(cid)
+ chunk = documents_store.get_chunk(cid)
+ if chunk:
+ chunks_out.append({"chunk_id": cid, "text": chunk["text"]})
+ if chunks_out:
+ supporting_documents.append({
+ "doc_id": doc_id,
+ "doc_title": doc["title"],
+ "supporting_chunks": chunks_out,
+ })
+
+ # Build a simple evidence chain string
+ chain_parts = []
+ for edge in subgraph["edges"]:
+ src_node = subgraph["nodes"].get(edge["source"], {})
+ tgt_node = subgraph["nodes"].get(edge["target"], {})
+ src_name = src_node.get("name", edge["source"])
+ tgt_name = tgt_node.get("name", edge["target"])
+ chain_parts.append(f"{src_name} --[{edge['type']}]--> {tgt_name}")
+ evidence_chain = " | ".join(chain_parts) if chain_parts else "No edges in subgraph."
+
+ return {
+ "query": query,
+ "subgraph": subgraph,
+ "supporting_documents": supporting_documents,
+ "evidence_chain": evidence_chain,
+ }
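The evidence-chain string above is a plain fold over the subgraph edges. A minimal standalone sketch of that formatting step, assuming only the dict shapes documented in the `query_with_evidence` docstring (`format_evidence_chain` is an illustrative name, not part of the module):

```python
# Build a human-readable evidence chain from a subgraph dict shaped like
# {"nodes": {node_id: {...}}, "edges": [{"source", "target", "type", ...}]}.
def format_evidence_chain(subgraph: dict) -> str:
    parts = []
    for edge in subgraph["edges"]:
        # Fall back to the raw node id if a node record is missing
        src = subgraph["nodes"].get(edge["source"], {}).get("name", edge["source"])
        tgt = subgraph["nodes"].get(edge["target"], {}).get("name", edge["target"])
        parts.append(f"{src} --[{edge['type']}]--> {tgt}")
    return " | ".join(parts) if parts else "No edges in subgraph."

demo = {
    "nodes": {
        "n1": {"name": "memory leak", "type": "issue"},
        "n2": {"name": "system crash", "type": "issue"},
    },
    "edges": [{"source": "n1", "target": "n2", "type": "causes", "confidence": 1.0}],
}
print(format_evidence_chain(demo))  # memory leak --[causes]--> system crash
```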
diff --git a/skills/mini-context-graph/scripts/template_agent_workflow.py b/skills/mini-context-graph/scripts/template_agent_workflow.py
new file mode 100644
index 000000000..5dd511c16
--- /dev/null
+++ b/skills/mini-context-graph/scripts/template_agent_workflow.py
@@ -0,0 +1,198 @@
+"""
+template_agent_workflow.py — Template agent script for ingesting + querying the context graph.
+
+This script demonstrates the complete workflow an agent should follow:
+1. Read markdown guidance files
+2. Extract entities/relations via LLM reasoning
+3. Call Python methods to persist
+4. Query the graph
+5. Handle errors gracefully
+
+Copy and adapt this template for your agent implementation.
+"""
+
+import json
+import sys
+from pathlib import Path
+
+# Add tools to path
+sys.path.insert(0, str(Path(__file__).parent))
+
+from contextgraph import ContextGraphSkill
+
+
+def ingest_document(skill: ContextGraphSkill, document: str) -> dict:
+ """
+ Step 1: Agent reads ingestion.md and ontology.md
+ Step 2: Agent uses LLM to extract entities and relations
+ Step 3: Call Python methods to persist (mimicked here with static extraction)
+
+ In a real agent, replace the static extraction with LLM calls.
+ """
+ print(f"\n[INGEST] Processing document:\n{document}\n")
+
+ # --- STEP 1 & 2: LLM EXTRACTION PHASE (Guided by ingestion.md + ontology.md) ---
+ # In a real agent, this would use LLM reasoning.
+ # For now, we'll mock an extraction result:
+
+ extraction_result = {
+ "entities": [
+ {"name": "memory leak", "type": "issue"},
+ {"name": "system crash", "type": "issue"},
+ {"name": "object", "type": "component"},
+ ],
+ "relations": [
+ {
+ "source": "memory leak",
+ "target": "system crash",
+ "type": "causes",
+ "confidence": 1.0,
+ },
+ {
+ "source": "object",
+ "target": "memory leak",
+ "type": "contributes to",
+ "confidence": 0.9,
+ },
+ ],
+ }
+
+ print(f"[LLM] Extracted entities + relations:")
+ print(json.dumps(extraction_result, indent=2))
+
+ # --- STEP 3: PERSIST PHASE (Call Python methods) ---
+ errors = []
+ added_nodes = {}
+
+ for entity in extraction_result["entities"]:
+ try:
+ node_id = skill.add_node(entity["name"], entity["type"])
+ added_nodes[entity["name"]] = node_id
+ print(f" ✓ Added node: {entity['name']} (id: {node_id}, type: {entity['type']})")
+ except Exception as e:
+ errors.append(f"Failed to add node {entity['name']}: {e}")
+ print(f" ✗ Error adding node {entity['name']}: {e}")
+
+    edges_added = 0
+    for relation in extraction_result["relations"]:
+        # Validate both endpoints exist
+        if relation["source"] not in added_nodes or relation["target"] not in added_nodes:
+            error_msg = f"Cannot add edge {relation['source']} → {relation['target']}: endpoint node missing"
+            errors.append(error_msg)
+            print(f"  ✗ Skip edge: {error_msg}")
+            continue
+
+        # Validate confidence threshold (keep in sync with config.MIN_CONFIDENCE)
+        if relation["confidence"] < 0.6:
+            error_msg = (
+                f"Edge {relation['source']} → {relation['target']}: "
+                f"confidence {relation['confidence']} < 0.6 (minimum threshold)"
+            )
+            errors.append(error_msg)
+            print(f"  ✗ Skip edge: {error_msg}")
+            continue
+
+        try:
+            skill.add_edge(
+                source_name=relation["source"],
+                target_name=relation["target"],
+                relation=relation["type"],
+                confidence=relation["confidence"],
+            )
+            edges_added += 1
+            print(
+                f"  ✓ Added edge: {relation['source']} "
+                f"--[{relation['type']}]→ {relation['target']} "
+                f"(confidence: {relation['confidence']})"
+            )
+        except Exception as e:
+            errors.append(f"Failed to add edge {relation['source']} → {relation['target']}: {e}")
+            print(f"  ✗ Error adding edge: {e}")
+
+    return {
+        "success": len(errors) == 0,
+        "nodes_added": len(added_nodes),
+        # Count successful add_edge calls directly instead of inferring the
+        # total from error-message substrings, which silently overcounted.
+        "edges_added": edges_added,
+    }
+
+
+def query_graph(skill: ContextGraphSkill, query: str) -> dict:
+ """
+ Query the graph for context to answer the user's question.
+
+ Step 1: Read retrieval.md
+ Step 2: Call skill.query() which internally handles BFS + subgraph extraction
+ Step 3: Return structured context
+ """
+ print(f"\n[QUERY] {query}\n")
+
+ try:
+ subgraph = skill.query(query)
+
+ if not subgraph["nodes"]:
+ print(" ℹ No relevant entities found in graph.")
+ return {
+ "success": True,
+ "query": query,
+ "subgraph": subgraph,
+ "nodes_found": 0,
+ "edges_found": 0,
+ }
+
+ print(f" ✓ Retrieved subgraph with {len(subgraph['nodes'])} nodes, {len(subgraph['edges'])} edges")
+ print(f"\n Nodes:")
+ for node_id, node in subgraph["nodes"].items():
+ print(f" - {node['name']} (type: {node['type']}, id: {node_id})")
+
+ print(f"\n Edges:")
+ for edge in subgraph["edges"]:
+ source_name = subgraph["nodes"][edge["source"]]["name"]
+ target_name = subgraph["nodes"][edge["target"]]["name"]
+ print(
+ f" - {source_name} --[{edge['type']}]→ {target_name} "
+ f"(confidence: {edge['confidence']})"
+ )
+
+ return {
+ "success": True,
+ "query": query,
+ "subgraph": subgraph,
+ "nodes_found": len(subgraph["nodes"]),
+ "edges_found": len(subgraph["edges"]),
+ }
+
+ except Exception as e:
+ error_msg = f"Query failed: {e}"
+ print(f" ✗ {error_msg}")
+ return {"success": False, "query": query, "error": error_msg}
+
+
+def main():
+ """Demo: ingest a document, then query the graph."""
+ skill = ContextGraphSkill()
+
+ # ===== INGESTION =====
+ document = """
+ System crashes due to memory leaks.
+ Memory leaks occur when objects are not released.
+ """
+
+ result = ingest_document(skill, document)
+ print(f"\n[INGEST RESULT] Nodes added: {result['nodes_added']}, " f"Edges added: {result['edges_added']}")
+ if result["errors"]:
+ print(f"Errors: {result['errors']}")
+
+ # ===== RETRIEVAL =====
+ queries = [
+ "Why does the system crash?",
+ "What causes memory leaks?",
+ ]
+
+ for query in queries:
+ result = query_graph(skill, query)
+ if result["success"]:
+ print(f" Nodes found: {result['nodes_found']}, Edges found: {result['edges_found']}")
+ else:
+ print(f" Error: {result['error']}")
+
+
+if __name__ == "__main__":
+ main()
diff --git a/skills/mini-context-graph/scripts/tools/__init__.py b/skills/mini-context-graph/scripts/tools/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/skills/mini-context-graph/scripts/tools/documents_store.py b/skills/mini-context-graph/scripts/tools/documents_store.py
new file mode 100644
index 000000000..af967cd45
--- /dev/null
+++ b/skills/mini-context-graph/scripts/tools/documents_store.py
@@ -0,0 +1,191 @@
+"""
+documents_store.py — Persistent storage for raw documents and chunks (RAG layer).
+
+Inspired by Karpathy's LLM Wiki pattern: raw sources are immutable and stored
+as the ground truth. Chunks are the retrieval unit; provenance links tie graph
+nodes/edges back to specific chunks.
+
+Handles:
+- Storing raw documents with metadata
+- Chunking documents into overlapping text windows
+- Retrieving chunks by id or by keyword search
+- Persisting to data/documents.json
+"""
+from __future__ import annotations
+
+import json
+import os
+import re
+import sys
+from datetime import datetime, timezone
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+import config
+
+_DATA_DIR = Path(os.environ.get("MINI_CONTEXT_GRAPH_DATA_DIR", str(config.DATA_DIR)))
+_DOCS_FILE = _DATA_DIR / "documents.json"
+
+_CHUNK_SIZE = 500 # characters per chunk
+_CHUNK_OVERLAP = 100 # overlap between consecutive chunks
+
+_STOPWORDS = frozenset([
+ "a", "an", "the", "is", "are", "was", "were", "be", "been", "being",
+ "have", "has", "had", "do", "does", "did", "will", "would", "could",
+ "should", "may", "might", "shall", "can", "to", "of", "in", "on",
+ "at", "by", "for", "with", "from", "and", "or", "but", "not", "it",
+ "its", "this", "that", "these", "those", "i", "you", "he", "she",
+ "we", "they", "what", "which", "who", "how", "why", "when", "where",
+])
+
+
+def _load() -> dict:
+ if _DOCS_FILE.exists():
+ with open(_DOCS_FILE, "r") as f:
+ return json.load(f)
+ return {"documents": {}}
+
+
+def _save(store: dict) -> None:
+ _DATA_DIR.mkdir(parents=True, exist_ok=True)
+ with open(_DOCS_FILE, "w") as f:
+ json.dump(store, f, indent=2)
+
+
+def _tokenize(text: str) -> list[str]:
+ tokens = re.findall(r"[a-z0-9]+", text.lower())
+ return [t for t in tokens if t not in _STOPWORDS and len(t) > 1]
+
+
+def _chunk_text(content: str, chunk_size: int = _CHUNK_SIZE, overlap: int = _CHUNK_OVERLAP) -> list[str]:
+    """Split content into overlapping character windows."""
+    if chunk_size <= overlap:
+        raise ValueError("chunk_size must exceed overlap so each window advances")
+    chunks = []
+    start = 0
+    while start < len(content):
+        end = start + chunk_size
+        chunks.append(content[start:end].strip())
+        if end >= len(content):
+            break
+        start += chunk_size - overlap
+    return [c for c in chunks if c]
+
+
+# ---------------------------------------------------------------------------
+# Public API
+# ---------------------------------------------------------------------------
+
+def add_document(
+ doc_id: str,
+ title: str,
+ source: str,
+ content: str,
+) -> dict:
+ """
+ Store a raw document and auto-generate chunks.
+
+ Args:
+ doc_id: Caller-supplied stable identifier (e.g. "doc_001" or a filename).
+ title: Human-readable title.
+ source: Origin path/URL (immutable provenance pointer).
+ content: Full raw text to store and chunk.
+
+ Returns:
+ The stored document dict including generated chunk_ids.
+ """
+ store = _load()
+
+ # Idempotent: return existing doc if already stored
+ if doc_id in store["documents"]:
+ return store["documents"][doc_id]
+
+ raw_chunks = _chunk_text(content)
+ chunks = []
+ for i, text in enumerate(raw_chunks):
+ chunks.append({
+ "chunk_id": f"{doc_id}_chunk_{i:03d}",
+ "index": i,
+ "text": text,
+ })
+
+ doc = {
+ "id": doc_id,
+ "title": title,
+ "source": source,
+ "content": content,
+ "chunks": chunks,
+ "ingestion_date": datetime.now(timezone.utc).isoformat(),
+ }
+ store["documents"][doc_id] = doc
+ _save(store)
+ return doc
+
+
+def get_document(doc_id: str) -> dict | None:
+ """Return the full document record or None if not found."""
+ store = _load()
+ return store["documents"].get(doc_id)
+
+
+def get_chunk(chunk_id: str) -> dict | None:
+ """Return a specific chunk by its chunk_id (searches across all documents)."""
+ store = _load()
+ for doc in store["documents"].values():
+ for chunk in doc["chunks"]:
+ if chunk["chunk_id"] == chunk_id:
+ return chunk
+ return None
+
+
+def get_chunks_for_document(doc_id: str) -> list[dict]:
+ """Return all chunks for a document."""
+ doc = get_document(doc_id)
+ if doc is None:
+ return []
+ return doc["chunks"]
+
+
+def search_chunks(query: str, top_k: int = 5) -> list[dict]:
+ """
+ Keyword search over chunk text. Returns top_k matching chunks sorted by
+ term overlap (simple TF-style scoring, no embeddings required).
+
+ Returns list of dicts with keys: chunk_id, doc_id, score, text.
+ """
+ store = _load()
+ query_tokens = set(_tokenize(query))
+ if not query_tokens:
+ return []
+
+ scored: list[tuple[float, dict]] = []
+ for doc in store["documents"].values():
+ for chunk in doc["chunks"]:
+ chunk_tokens = set(_tokenize(chunk["text"]))
+ overlap = len(query_tokens & chunk_tokens)
+ if overlap > 0:
+ score = overlap / len(query_tokens)
+ scored.append((score, {
+ "chunk_id": chunk["chunk_id"],
+ "doc_id": doc["id"],
+ "doc_title": doc["title"],
+ "score": round(score, 4),
+ "text": chunk["text"],
+ }))
+
+ scored.sort(key=lambda x: x[0], reverse=True)
+ return [item for _, item in scored[:top_k]]
+
+
+def list_documents() -> list[dict]:
+ """Return a summary list of all stored documents (no content, no chunks)."""
+ store = _load()
+ return [
+ {
+ "id": doc["id"],
+ "title": doc["title"],
+ "source": doc["source"],
+ "chunk_count": len(doc["chunks"]),
+ "ingestion_date": doc["ingestion_date"],
+ }
+ for doc in store["documents"].values()
+ ]
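The overlapping-window chunking above can be exercised in isolation. This sketch mirrors `_chunk_text` with the same defaults (500-char windows, 100-char overlap, i.e. a 400-char stride); `chunk_text` here is a standalone copy, not an import:

```python
def chunk_text(content: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    # Fixed-size character windows; each new window starts
    # (chunk_size - overlap) characters after the previous one.
    chunks = []
    start = 0
    while start < len(content):
        end = start + chunk_size
        chunks.append(content[start:end].strip())
        if end >= len(content):
            break
        start += chunk_size - overlap
    return [c for c in chunks if c]

chunks = chunk_text("x" * 1200)
# Windows start at offsets 0, 400, 800 -> three chunks of 500, 500, 400 chars
print([len(c) for c in chunks])  # [500, 500, 400]
```

Note the stride must stay positive (chunk_size > overlap), otherwise the loop never advances.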
diff --git a/skills/mini-context-graph/scripts/tools/graph_store.py b/skills/mini-context-graph/scripts/tools/graph_store.py
new file mode 100644
index 000000000..2412fc1f1
--- /dev/null
+++ b/skills/mini-context-graph/scripts/tools/graph_store.py
@@ -0,0 +1,202 @@
+"""
+graph_store.py — Persistent storage for graph nodes and edges.
+
+Handles:
+- Adding/deduplicating nodes
+- Adding edges with confidence
+- Fetching neighbors
+- Persisting to graph.json
+"""
+from __future__ import annotations
+
+import json
+import os
+import sys
+import uuid
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+import config
+
+_DATA_DIR = Path(os.environ.get("MINI_CONTEXT_GRAPH_DATA_DIR", str(config.DATA_DIR)))
+_GRAPH_FILE = _DATA_DIR / "graph.json"
+
+
+def _load() -> dict:
+ if _GRAPH_FILE.exists():
+ with open(_GRAPH_FILE, "r") as f:
+ return json.load(f)
+ return {"nodes": {}, "edges": []}
+
+
+def _save(graph: dict) -> None:
+ _DATA_DIR.mkdir(parents=True, exist_ok=True)
+ with open(_GRAPH_FILE, "w") as f:
+ json.dump(graph, f, indent=2)
+
+
+def add_node(
+ name: str,
+ node_type: str,
+ source_document: str | None = None,
+ source_chunks: list[str] | None = None,
+) -> str:
+ """
+ Add a node if it doesn't exist. Returns node_id.
+
+ Args:
+ source_document: doc_id from documents_store (provenance pointer).
+ source_chunks: list of chunk_ids that mention this entity.
+ """
+ graph = _load()
+ name_lower = name.strip().lower()
+
+ # Deduplication: search by normalized name
+ for node_id, node in graph["nodes"].items():
+ if node["name"] == name_lower:
+ # Merge provenance if new info provided
+ changed = False
+ if source_document and node.get("source_document") is None:
+ node["source_document"] = source_document
+ changed = True
+            if source_chunks:
+                # Compare as sets: list ordering is nondeterministic, so the old
+                # list-vs-list comparison could report spurious changes.
+                existing = set(node.get("source_chunks") or [])
+                merged = existing | set(source_chunks)
+                if merged != existing:
+                    node["source_chunks"] = sorted(merged)
+                    changed = True
+ if changed:
+ _save(graph)
+ return node_id
+
+ node_id = str(uuid.uuid4())[:8]
+ graph["nodes"][node_id] = {
+ "name": name_lower,
+ "type": node_type.strip().lower(),
+ "source_document": source_document,
+ "source_chunks": source_chunks or [],
+ }
+ _save(graph)
+ return node_id
+
+
+def add_edge(
+ source_id: str,
+ target_id: str,
+ relation: str,
+ confidence: float,
+ source_document: str | None = None,
+ supporting_text: str | None = None,
+ chunk_id: str | None = None,
+) -> None:
+ """
+ Add a directed edge between two nodes.
+
+ Args:
+ source_document: doc_id from documents_store (provenance pointer).
+ supporting_text: The exact text span that supports this relation.
+ chunk_id: The specific chunk_id the supporting text came from.
+ """
+ graph = _load()
+
+ # Deduplicate edges by source + target + relation
+ relation_lower = relation.strip().lower()
+ for edge in graph["edges"]:
+ if (
+ edge["source"] == source_id
+ and edge["target"] == target_id
+ and edge["type"] == relation_lower
+ ):
+ changed = False
+ if confidence > edge["confidence"]:
+ edge["confidence"] = confidence
+ changed = True
+ if source_document and edge.get("source_document") is None:
+ edge["source_document"] = source_document
+ changed = True
+ if supporting_text and edge.get("supporting_text") is None:
+ edge["supporting_text"] = supporting_text
+ changed = True
+ if chunk_id and edge.get("chunk_id") is None:
+ edge["chunk_id"] = chunk_id
+ changed = True
+ if changed:
+ _save(graph)
+ return
+
+ graph["edges"].append({
+ "source": source_id,
+ "target": target_id,
+ "type": relation_lower,
+ "confidence": confidence,
+ "source_document": source_document,
+ "supporting_text": supporting_text,
+ "chunk_id": chunk_id,
+ })
+ _save(graph)
+
+
+def get_neighbors(node_id: str, min_confidence: float = 0.0) -> list[str]:
+ """Return node_ids of all neighbors reachable from node_id."""
+ graph = _load()
+ neighbors = []
+ for edge in graph["edges"]:
+ if edge["confidence"] < min_confidence:
+ continue
+ if edge["source"] == node_id:
+ neighbors.append(edge["target"])
+ elif edge["target"] == node_id:
+ neighbors.append(edge["source"])
+ return list(set(neighbors))
+
+
+def get_node(node_id: str) -> dict | None:
+ """Fetch a single node by ID."""
+ graph = _load()
+ return graph["nodes"].get(node_id)
+
+
+def get_subgraph(node_ids: list[str]) -> dict:
+ """Return nodes and edges induced by the given node_ids."""
+ graph = _load()
+ node_id_set = set(node_ids)
+
+ nodes = {nid: graph["nodes"][nid] for nid in node_ids if nid in graph["nodes"]}
+ edges = [
+ e
+ for e in graph["edges"]
+ if e["source"] in node_id_set and e["target"] in node_id_set
+ ]
+ return {"nodes": nodes, "edges": edges}
+
+
+def find_node_by_name(name: str) -> str | None:
+ """Return node_id for a given normalized name, or None."""
+ graph = _load()
+ name_lower = name.strip().lower()
+ for node_id, node in graph["nodes"].items():
+ if node["name"] == name_lower:
+ return node_id
+ return None
+
+
+def link_node_to_source(node_id: str, doc_id: str, chunk_ids: list[str]) -> None:
+ """Attach provenance (doc_id + chunk_ids) to an existing node."""
+ graph = _load()
+ if node_id not in graph["nodes"]:
+ return
+ node = graph["nodes"][node_id]
+ node["source_document"] = doc_id
+ existing = set(node.get("source_chunks") or [])
+ node["source_chunks"] = list(existing | set(chunk_ids))
+ _save(graph)
+
+
+def get_node_sources(node_id: str) -> dict:
+ """Return provenance info (source_document + source_chunks) for a node."""
+ graph = _load()
+ node = graph["nodes"].get(node_id, {})
+ return {
+ "source_document": node.get("source_document"),
+ "source_chunks": node.get("source_chunks", []),
+ }
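Node deduplication in `add_node` keys on the normalized (stripped, lowercased) name, so repeated mentions of the same entity collapse onto one node. An in-memory sketch of just that rule, without the JSON persistence:

```python
import uuid

def add_node(graph: dict, name: str, node_type: str) -> str:
    # Same dedup rule as graph_store.add_node: one node per normalized name
    name_lower = name.strip().lower()
    for node_id, node in graph["nodes"].items():
        if node["name"] == name_lower:
            return node_id  # reuse the existing node
    node_id = uuid.uuid4().hex[:8]
    graph["nodes"][node_id] = {"name": name_lower, "type": node_type.strip().lower()}
    return node_id

g = {"nodes": {}}
first = add_node(g, "Memory Leak", "Issue")
second = add_node(g, "  memory leak ", "issue")
print(first == second, len(g["nodes"]))  # True 1
```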
diff --git a/skills/mini-context-graph/scripts/tools/index_store.py b/skills/mini-context-graph/scripts/tools/index_store.py
new file mode 100644
index 000000000..2b6c39d1e
--- /dev/null
+++ b/skills/mini-context-graph/scripts/tools/index_store.py
@@ -0,0 +1,90 @@
+"""
+index_store.py — Maintains entity and keyword indexes for fast lookup.
+
+Handles:
+- Entity index: name → [node_ids]
+- Keyword index: token → [node_ids]
+- Persist to index.json
+"""
+from __future__ import annotations
+
+import json
+import os
+import re
+import sys
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+import config
+
+_DATA_DIR = Path(os.environ.get("MINI_CONTEXT_GRAPH_DATA_DIR", str(config.DATA_DIR)))
+_INDEX_FILE = _DATA_DIR / "index.json"
+
+_STOPWORDS = frozenset(
+ [
+ "a", "an", "the", "is", "are", "was", "were", "be", "been", "being",
+ "have", "has", "had", "do", "does", "did", "will", "would", "could",
+ "should", "may", "might", "shall", "can", "to", "of", "in", "on",
+ "at", "by", "for", "with", "from", "and", "or", "but", "not", "it",
+ "its", "this", "that", "these", "those", "i", "you", "he", "she",
+ "we", "they", "what", "which", "who", "how", "why", "when", "where",
+ ]
+)
+
+
+def _load() -> dict:
+ if _INDEX_FILE.exists():
+ with open(_INDEX_FILE, "r") as f:
+ return json.load(f)
+ return {"entity_index": {}, "keyword_index": {}}
+
+
+def _save(index: dict) -> None:
+ _DATA_DIR.mkdir(parents=True, exist_ok=True)
+ with open(_INDEX_FILE, "w") as f:
+ json.dump(index, f, indent=2)
+
+
+def _tokenize(text: str) -> list[str]:
+ """Split text into lowercase tokens, removing stopwords and short tokens."""
+ tokens = re.findall(r"[a-z0-9]+", text.lower())
+ return [t for t in tokens if t not in _STOPWORDS and len(t) > 1]
+
+
+def add_entity(name: str, node_id: str) -> None:
+ """Register an entity name → node_id in both entity and keyword indexes."""
+ index = _load()
+ name_lower = name.strip().lower()
+
+ # Entity index
+ if name_lower not in index["entity_index"]:
+ index["entity_index"][name_lower] = []
+ if node_id not in index["entity_index"][name_lower]:
+ index["entity_index"][name_lower].append(node_id)
+
+ # Keyword index
+ for token in _tokenize(name_lower):
+ if token not in index["keyword_index"]:
+ index["keyword_index"][token] = []
+ if node_id not in index["keyword_index"][token]:
+ index["keyword_index"][token].append(node_id)
+
+ _save(index)
+
+
+def search(query: str) -> list[str]:
+ """Search for node_ids matching the query via entity name or keywords."""
+ index = _load()
+ query_lower = query.strip().lower()
+ matched_ids: set[str] = set()
+
+ # Exact entity name match
+ if query_lower in index["entity_index"]:
+ matched_ids.update(index["entity_index"][query_lower])
+
+ # Keyword match
+ for token in _tokenize(query_lower):
+ if token in index["keyword_index"]:
+ matched_ids.update(index["keyword_index"][token])
+
+ return list(matched_ids)
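The two-index lookup above (exact entity name plus per-token keywords) can be sketched without the JSON layer. This uses a reduced stopword set for brevity; the module's full list applies in practice:

```python
import re

STOPWORDS = {"the", "is", "a", "why", "does", "what"}

def tokenize(text: str) -> list[str]:
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 1]

entity_index: dict[str, list[str]] = {}   # name -> [node_ids]
keyword_index: dict[str, list[str]] = {}  # token -> [node_ids]

def add_entity(name: str, node_id: str) -> None:
    name_lower = name.strip().lower()
    entity_index.setdefault(name_lower, [])
    if node_id not in entity_index[name_lower]:
        entity_index[name_lower].append(node_id)
    for token in tokenize(name_lower):
        keyword_index.setdefault(token, [])
        if node_id not in keyword_index[token]:
            keyword_index[token].append(node_id)

def search(query: str) -> set[str]:
    matched = set(entity_index.get(query.strip().lower(), []))
    for token in tokenize(query):
        matched.update(keyword_index.get(token, []))
    return matched

add_entity("memory leak", "n1")
add_entity("system crash", "n2")
print(search("Why does the system crash?"))  # {'n2'}
```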
diff --git a/skills/mini-context-graph/scripts/tools/ontology_store.py b/skills/mini-context-graph/scripts/tools/ontology_store.py
new file mode 100644
index 000000000..cccdfca74
--- /dev/null
+++ b/skills/mini-context-graph/scripts/tools/ontology_store.py
@@ -0,0 +1,175 @@
+"""
+ontology_store.py — Tracks entity types and relation types.
+
+Handles:
+- Registering types and relations with usage counts
+- Normalizing types and relations via synonym mapping
+- Persisting to ontology.json
+
+NOTE: No LLM logic here. Normalization is rule-based (lowercase + synonym map).
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import sys
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+import config
+
+_DATA_DIR = Path(os.environ.get("MINI_CONTEXT_GRAPH_DATA_DIR", str(config.DATA_DIR)))
+_ONTOLOGY_FILE = _DATA_DIR / "ontology.json"
+
+# Synonym maps — lowercase variants map to canonical forms
+_ENTITY_TYPE_MAP: dict[str, str] = {
+ "component": "component",
+ "module": "component",
+ "class": "component",
+ "function": "component",
+ "method": "component",
+ "bug": "issue",
+ "defect": "issue",
+ "fault": "issue",
+ "error": "issue",
+ "failure": "issue",
+ "problem": "issue",
+ "crash": "issue",
+ "server": "infrastructure",
+ "host": "infrastructure",
+ "machine": "infrastructure",
+ "node": "infrastructure",
+ "user": "actor",
+ "person": "actor",
+ "operator": "actor",
+ "admin": "actor",
+ "administrator": "actor",
+ "actor": "actor",
+ "app": "software",
+ "application": "software",
+ "service": "software",
+ "program": "software",
+ "software": "software",
+ "database": "storage",
+ "datastore": "storage",
+ "db": "storage",
+ "storage": "storage",
+ "api": "interface",
+ "endpoint": "interface",
+ "interface": "interface",
+ "connection": "interface",
+ "event": "event",
+ "incident": "event",
+ "occurrence": "event",
+ "trigger": "event",
+ "concept": "concept",
+ "idea": "concept",
+ "principle": "concept",
+ "theory": "concept",
+ "process": "process",
+ "thread": "process",
+ "task": "process",
+ "job": "process",
+ "workflow": "process",
+ "object": "component",
+ "resource": "component",
+ "memory": "resource",
+ "cpu": "resource",
+ "system": "system",
+ "platform": "system",
+ "framework": "system",
+ "library": "software",
+ "package": "software",
+}
+
+_RELATION_TYPE_MAP: dict[str, str] = {
+ "causes": "causes",
+ "triggers": "causes",
+ "leads to": "causes",
+ "results in": "causes",
+ "produces": "causes",
+ "is part of": "contains",
+ "belongs to": "contains",
+ "lives in": "contains",
+ "sits in": "contains",
+ "contains": "contains",
+ "depends on": "depends on",
+ "requires": "depends on",
+ "needs": "depends on",
+ "uses": "uses",
+ "calls": "uses",
+ "invokes": "uses",
+ "consumes": "uses",
+ "affects": "affects",
+ "impacts": "affects",
+ "influences": "affects",
+ "creates": "creates",
+ "instantiates": "creates",
+ "spawns": "creates",
+ "connects to": "connects to",
+ "links to": "connects to",
+ "references": "connects to",
+ "inherits from": "extends",
+ "extends": "extends",
+ "subclasses": "extends",
+ "reads from": "reads from",
+ "queries": "reads from",
+ "fetches": "reads from",
+ "writes to": "writes to",
+ "stores in": "writes to",
+ "persists to": "writes to",
+ "contributes to": "contributes to",
+ "allocated by": "allocated by",
+ "released by": "released by",
+ "not released": "not released",
+}
+
+
+def _load() -> dict:
+ if _ONTOLOGY_FILE.exists():
+ with open(_ONTOLOGY_FILE, "r") as f:
+ return json.load(f)
+ return {"entity_types": {}, "relation_types": {}}
+
+
+def _save(ontology: dict) -> None:
+ _DATA_DIR.mkdir(parents=True, exist_ok=True)
+ with open(_ONTOLOGY_FILE, "w") as f:
+ json.dump(ontology, f, indent=2)
+
+
+def normalize_type(type_name: str) -> str:
+ """Return the canonical form of an entity type."""
+ key = type_name.strip().lower().replace("-", " ").replace("_", " ")
+ return _ENTITY_TYPE_MAP.get(key, key)
+
+
+def normalize_relation(relation_name: str) -> str:
+ """Return the canonical form of a relation type."""
+ key = relation_name.strip().lower().replace("-", " ").replace("_", " ")
+ return _RELATION_TYPE_MAP.get(key, key)
+
+
+def add_type(type_name: str) -> None:
+ """Register an entity type, incrementing its usage count."""
+ ontology = _load()
+ canonical = normalize_type(type_name)
+ ontology["entity_types"][canonical] = ontology["entity_types"].get(canonical, 0) + 1
+ _save(ontology)
+
+
+def add_relation(relation_name: str) -> None:
+ """Register a relation type, incrementing its usage count."""
+ ontology = _load()
+ canonical = normalize_relation(relation_name)
+ ontology["relation_types"][canonical] = ontology["relation_types"].get(canonical, 0) + 1
+ _save(ontology)
+
+
+def get_all_types() -> dict[str, int]:
+ """Return all registered entity types with counts."""
+ return _load()["entity_types"]
+
+
+def get_all_relations() -> dict[str, int]:
+ """Return all registered relation types with counts."""
+ return _load()["relation_types"]
diff --git a/skills/mini-context-graph/scripts/tools/retrieval_engine.py b/skills/mini-context-graph/scripts/tools/retrieval_engine.py
new file mode 100644
index 000000000..ca66f7436
--- /dev/null
+++ b/skills/mini-context-graph/scripts/tools/retrieval_engine.py
@@ -0,0 +1,58 @@
+"""
+retrieval_engine.py — BFS-based graph traversal for context retrieval.
+
+Input: seed node_ids + depth
+Output: list of node_ids within traversal depth filtered by min_confidence
+"""
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+from collections import deque
+
+# Allow imports from parent package
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from tools import graph_store
+import config
+
+
+def retrieve(
+ seed_node_ids: list[str],
+ depth: int = config.MAX_GRAPH_DEPTH,
+ min_confidence: float = config.MIN_CONFIDENCE,
+ max_nodes: int = config.MAX_NODES,
+) -> list[str]:
+ """
+ BFS from seed nodes up to `depth` hops.
+
+ Returns a list of node_ids (including seeds) within the traversal,
+ filtered by min_confidence on edges and capped at max_nodes.
+ """
+ visited: set[str] = set()
+ # Queue items: (node_id, current_depth)
+ queue: deque[tuple[str, int]] = deque()
+
+ for seed in seed_node_ids:
+ if seed not in visited:
+ visited.add(seed)
+ queue.append((seed, 0))
+
+ while queue:
+ if len(visited) >= max_nodes:
+ break
+
+ node_id, current_depth = queue.popleft()
+
+ if current_depth >= depth:
+ continue
+
+ neighbors = graph_store.get_neighbors(node_id, min_confidence=min_confidence)
+ for neighbor in neighbors:
+ if neighbor not in visited:
+ visited.add(neighbor)
+ queue.append((neighbor, current_depth + 1))
+ if len(visited) >= max_nodes:
+ break
+
+ return list(visited)
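The traversal above is a standard bounded BFS. The same logic can be tested without an on-disk graph by injecting the neighbor lookup as a function (the `neighbors` callable stands in for `graph_store.get_neighbors`):

```python
from collections import deque
from typing import Callable

def retrieve(seeds: list[str], neighbors: Callable[[str], list[str]],
             depth: int = 2, max_nodes: int = 50) -> set[str]:
    # BFS from the seeds, expanding up to `depth` hops and at most max_nodes.
    visited = set(seeds)
    queue = deque((s, 0) for s in seeds)
    while queue and len(visited) < max_nodes:
        node_id, d = queue.popleft()
        if d >= depth:
            continue  # stop expanding past the hop limit
        for n in neighbors(node_id):
            if n not in visited:
                visited.add(n)
                queue.append((n, d + 1))
    return visited

adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(sorted(retrieve(["a"], lambda n: adj.get(n, []))))  # ['a', 'b', 'c']
```

With the default depth of 2, "d" (three hops from "a") is excluded; raising `depth` to 3 includes it.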
diff --git a/skills/mini-context-graph/scripts/tools/wiki_store.py b/skills/mini-context-graph/scripts/tools/wiki_store.py
new file mode 100644
index 000000000..eccf2f749
--- /dev/null
+++ b/skills/mini-context-graph/scripts/tools/wiki_store.py
@@ -0,0 +1,294 @@
+"""
+wiki_store.py — Manages the persistent wiki layer.
+
+Inspired by Karpathy's LLM Wiki pattern: the wiki is a directory of LLM-generated
+markdown pages that the agent writes and maintains. This module provides the
+deterministic file I/O and index/log management so the agent can focus on
+reasoning, not bookkeeping.
+
+Wiki structure (relative to project root):
+ wiki/
+ index.md ← content-oriented catalog of all pages
+ log.md ← chronological append-only operation log
+ entities/ ← one page per entity (person, concept, system, etc.)
+ summaries/ ← source document summary pages
+ topics/ ← cross-cutting synthesis and topic pages
+
+The agent WRITES pages; this module handles the filesystem + index + log.
+"""
+from __future__ import annotations
+
+import os
+import re
+import sys
+from datetime import datetime, timezone
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+import config
+
+_WIKI_DIR = Path(os.environ.get("MINI_CONTEXT_GRAPH_WIKI_DIR", str(config.WIKI_DIR)))
+_INDEX_FILE = _WIKI_DIR / "index.md"
+_LOG_FILE = _WIKI_DIR / "log.md"
+
+_CATEGORY_DIRS = {
+ "entity": _WIKI_DIR / "entities",
+ "summary": _WIKI_DIR / "summaries",
+ "topic": _WIKI_DIR / "topics",
+}
+
+# ---------------------------------------------------------------------------
+# Internal helpers
+# ---------------------------------------------------------------------------
+
+def _ensure_dirs() -> None:
+ _WIKI_DIR.mkdir(parents=True, exist_ok=True)
+ for d in _CATEGORY_DIRS.values():
+ d.mkdir(parents=True, exist_ok=True)
+
+
+def _now_iso() -> str:
+ return datetime.now(timezone.utc).strftime("%Y-%m-%d")
+
+
+def _slug(title: str) -> str:
+ """Convert a title to a filesystem-safe slug."""
+ slug = title.lower().strip()
+ slug = re.sub(r"[^a-z0-9]+", "-", slug)
+ return slug.strip("-")
+
+
+def _page_path(category: str, slug: str) -> Path:
+ base = _CATEGORY_DIRS.get(category, _WIKI_DIR)
+ return base / f"{slug}.md"
+
+
+# ---------------------------------------------------------------------------
+# Index management
+# ---------------------------------------------------------------------------
+
+def _load_index() -> list[dict]:
+ """Parse index.md into a list of entry dicts."""
+ if not _INDEX_FILE.exists():
+ return []
+ entries = []
+ for line in _INDEX_FILE.read_text().splitlines():
+        # Expected table row: | [[slug]] | category | summary | date |
+        if line.startswith("| [["):
+            # Keep empty cells so a blank summary cannot shift the columns;
+            # split("|") yields empty strings around the leading/trailing pipes.
+            cells = [p.strip() for p in line.split("|")][1:-1]
+            if len(cells) >= 4:
+                slug = re.sub(r"\[\[|\]\]", "", cells[0])
+                entries.append({
+                    "slug": slug,
+                    "category": cells[1],
+                    "summary": cells[2],
+                    "date": cells[3],
+                })
+ return entries
+
+
+def _save_index(entries: list[dict]) -> None:
+ """Rewrite index.md from the entries list."""
+ _ensure_dirs()
+ lines = [
+ "# Wiki Index\n",
+ "_Auto-managed by wiki_store. Do not edit the table manually._\n\n",
+ "| Page | Category | Summary | Date |\n",
+ "|------|----------|---------|------|\n",
+ ]
+    for e in entries:
+        # Replace pipes so a summary containing "|" cannot break the table row.
+        summary = e["summary"].replace("|", "/")
+        lines.append(
+            f"| [[{e['slug']}]] | {e['category']} | {summary} | {e['date']} |\n"
+        )
+ _INDEX_FILE.write_text("".join(lines))
+
+
+def _append_log(operation: str, detail: str) -> None:
+    """Append a timestamped entry to log.md."""
+    _ensure_dirs()
+    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
+    entry = f"\n## [{timestamp}] {operation} | {detail}\n"
+    with open(_LOG_FILE, "a") as f:
+        f.write(entry)
+
+
+# ---------------------------------------------------------------------------
+# Public API
+# ---------------------------------------------------------------------------
+
+def write_page(
+ category: str,
+ title: str,
+ content: str,
+ summary: str = "",
+) -> str:
+ """
+ Write (or overwrite) a wiki page.
+
+    The agent provides the full markdown content. This function:
+    - writes the .md file to the appropriate category subfolder,
+    - updates index.md with a one-line entry,
+    - appends an entry to log.md.
+
+ Args:
+ category: One of "entity", "summary", "topic".
+ title: Human-readable page title (used for slug + index).
+ content: Full markdown content the agent wrote.
+ summary: One-line summary for the index (optional; auto-extracted if empty).
+
+ Returns:
+ Relative path from wiki root (e.g. "entities/memory-leak.md").
+ """
+ _ensure_dirs()
+ slug = _slug(title)
+ path = _page_path(category, slug)
+
+ # Auto-extract first non-heading, non-empty line as summary if not provided
+ if not summary:
+ for line in content.splitlines():
+ stripped = line.strip()
+ if stripped and not stripped.startswith("#"):
+ summary = stripped[:100]
+ break
+
+ path.write_text(content)
+
+ # Update index
+ entries = _load_index()
+ existing = next((e for e in entries if e["slug"] == slug), None)
+ if existing:
+ existing["summary"] = summary
+ existing["date"] = _now_iso()
+ else:
+ entries.append({
+ "slug": slug,
+ "category": category,
+ "summary": summary,
+ "date": _now_iso(),
+ })
+ _save_index(entries)
+ _append_log("write", title)
+
+ return str(path.relative_to(_WIKI_DIR))
+
+
+def read_page(category: str, title: str) -> str | None:
+ """Read a wiki page's content. Returns None if not found."""
+ slug = _slug(title)
+ path = _page_path(category, slug)
+ if not path.exists():
+ return None
+ return path.read_text()
+
+
+def read_page_by_slug(slug: str) -> str | None:
+ """Read a wiki page by slug, searching across all categories."""
+ for d in list(_CATEGORY_DIRS.values()) + [_WIKI_DIR]:
+ path = d / f"{slug}.md"
+ if path.exists():
+ return path.read_text()
+ return None
+
+
+def search_wiki(query: str) -> list[dict]:
+ """
+ Simple keyword search over all wiki pages.
+ Returns list of {slug, category, path, snippet} sorted by relevance.
+ """
+ query_tokens = set(re.findall(r"[a-z0-9]+", query.lower()))
+ if not query_tokens:
+ return []
+
+ results = []
+ for category, base_dir in _CATEGORY_DIRS.items():
+ if not base_dir.exists():
+ continue
+ for page_path in base_dir.glob("*.md"):
+ content = page_path.read_text().lower()
+ content_tokens = set(re.findall(r"[a-z0-9]+", content))
+ overlap = len(query_tokens & content_tokens)
+ if overlap > 0:
+ # Extract a short snippet around first match
+ first_token = next(iter(query_tokens & content_tokens), "")
+ idx = content.find(first_token)
+ snippet = content[max(0, idx - 30):idx + 80].replace("\n", " ").strip()
+ results.append({
+ "slug": page_path.stem,
+ "category": category,
+ "path": str(page_path.relative_to(_WIKI_DIR)),
+ "score": overlap,
+ "snippet": snippet,
+ })
+
+ results.sort(key=lambda x: x["score"], reverse=True)
+ return results
+
+
+def list_pages(category: str | None = None) -> list[dict]:
+ """List all wiki pages, optionally filtered by category."""
+ entries = _load_index()
+ if category:
+ return [e for e in entries if e["category"] == category]
+ return entries
+
+
+def get_log(last_n: int = 20) -> list[str]:
+ """Return the last N log entries from log.md."""
+ if not _LOG_FILE.exists():
+ return []
+ lines = _LOG_FILE.read_text().splitlines()
+    entries = [line for line in lines if line.startswith("## [")]
+ return entries[-last_n:]
+
+
+def lint_wiki() -> dict:
+ """
+ Health-check the wiki as described in Karpathy's LLM Wiki pattern.
+
+ Checks for:
+ - Orphan pages (in directory but not in index)
+ - Missing pages (in index but file deleted)
+ - Broken wikilinks ([[slug]] pointing to non-existent file)
+ - Pages with no wikilinks (isolated pages)
+
+ Returns:
+ {
+ "orphan_pages": [...],
+ "missing_pages": [...],
+ "broken_wikilinks": {slug: [broken_links]},
+ "isolated_pages": [...],
+ }
+ """
+ index_entries = {e["slug"] for e in _load_index()}
+ file_slugs: dict[str, Path] = {}
+ for d in _CATEGORY_DIRS.values():
+ if d.exists():
+ for p in d.glob("*.md"):
+ file_slugs[p.stem] = p
+
+ orphans = [s for s in file_slugs if s not in index_entries]
+ missing = [s for s in index_entries if s not in file_slugs]
+
+ broken_wikilinks: dict[str, list[str]] = {}
+ isolated: list[str] = []
+ all_slugs = set(file_slugs.keys())
+
+ for slug, path in file_slugs.items():
+ content = path.read_text()
+ links = re.findall(r"\[\[([^\]]+)\]\]", content)
+ if not links:
+ isolated.append(slug)
+ broken = [lnk for lnk in links if _slug(lnk) not in all_slugs]
+ if broken:
+ broken_wikilinks[slug] = broken
+
+ return {
+ "orphan_pages": orphans,
+ "missing_pages": missing,
+ "broken_wikilinks": broken_wikilinks,
+ "isolated_pages": isolated,
+ }
diff --git a/skills/mini-context-graph/skill.md b/skills/mini-context-graph/skill.md
new file mode 100644
index 000000000..76c383f81
--- /dev/null
+++ b/skills/mini-context-graph/skill.md
@@ -0,0 +1,194 @@
+---
+name: mini-context-graph
+description: |
+ A persistent, compounding knowledge base combining Karpathy's LLM Wiki pattern
+ with a structured knowledge graph. Ingest documents once — the LLM writes wiki
+ pages, extracts entities/relations into the graph, and stores raw content for
+ evidence retrieval. Knowledge accumulates and cross-references; it is never
+ re-derived from scratch.
+---
+
+# Mini Context Graph Skill
+
+## The Core Idea
+
+Standard RAG re-discovers knowledge from scratch on every query. This skill is different:
+
+1. **Wiki layer** — The LLM writes and maintains persistent markdown pages (summaries, entity pages, topic syntheses). Cross-references are already there. The wiki gets richer with every ingest.
+2. **Graph layer** — Entities and relations are extracted once and stored as a navigable knowledge graph. BFS traversal answers structural queries without re-reading sources.
+3. **Raw source layer** — Original documents are stored immutably with chunks. Provenance links tie every graph node and edge back to the exact text that supports it.
+
+> The LLM writes; the Python tools handle all bookkeeping.
+
+---
+
+## Three Layers
+
+| Layer | Where | What the LLM does | What Python does |
+|-------|-------|-------------------|-----------------|
+| **Raw Sources** | `data/documents.json` | Reads (never modifies) | Stores chunks + metadata |
+| **Wiki** | `wiki/` (markdown) | Writes/updates pages | Manages index.md + log.md |
+| **Graph** | `data/graph.json` | Extracts entities + relations | Persists, deduplicates, traverses |
+
+---
+
+## ⚡ Quick Start for Agents
+
+```python
+from scripts.contextgraph import ContextGraphSkill
+from scripts.tools import wiki_store
+
+skill = ContextGraphSkill()
+
+# ===== INGEST WITH FULL RAG + WIKI =====
+# 1. Read references/ingestion.md and references/ontology.md first
+# 2. Extract entities and relations (LLM reasoning step)
+entities = [
+ {"name": "memory leak", "type": "issue", "supporting_text": "memory leaks cause crashes"},
+ {"name": "system crash", "type": "issue", "supporting_text": "system crashes due to memory leaks"},
+]
+relations = [
+ {"source": "memory leak", "target": "system crash", "type": "causes",
+ "confidence": 1.0, "supporting_text": "System crashes due to memory leaks."},
+]
+
+result = skill.ingest_with_content(
+ doc_id="doc_001",
+ title="System Crash Analysis",
+ source="/docs/incident_report.pdf",
+ raw_content="System crashes due to memory leaks. Memory leaks occur when objects are not released.",
+ entities=entities,
+ relations=relations,
+)
+# result = {"doc_id": "doc_001", "chunk_count": 1, "nodes_added": 2, "edges_added": 1}
+
+# 3. Write a wiki summary page for this document
+wiki_store.write_page(
+ category="summary",
+ title="System Crash Analysis Summary",
+ content="""---
+title: System Crash Analysis
+source_document: doc_001
+tags: [summary, incident]
+---
+
+# System Crash Analysis
+
+**Source:** incident_report.pdf
+
+## Key Claims
+
+- [[memory-leak]] causes [[system-crash]] (confidence: 1.0)
+
+## Entities
+
+- [[memory-leak]] (issue)
+- [[system-crash]] (issue)
+""",
+ summary="Incident report: memory leaks cause system crashes.",
+)
+
+# ===== QUERY WITH EVIDENCE =====
+result = skill.query_with_evidence("Why does the system crash?")
+# Returns: {"query": ..., "subgraph": ..., "supporting_documents": [...], "evidence_chain": ...}
+
+# ===== WIKI SEARCH (read wiki before answering) =====
+pages = wiki_store.search_wiki("memory leak")
+# Returns: [{slug, category, path, snippet}, ...]
+```
+
+---
+
+## Operations
+
+### Ingest
+
+When a user provides a new document:
+
+1. Read `references/ingestion.md` — entity/relation extraction rules.
+2. Read `references/ontology.md` — type normalization rules.
+3. Extract entities and relations using your LLM reasoning.
+4. Call `skill.ingest_with_content(...)` — stores raw content + chunks + graph nodes + provenance.
+5. **Write a wiki summary page** using `wiki_store.write_page(category="summary", ...)`.
+6. **Update entity pages** — for each new/updated entity, write or update `wiki_store.write_page(category="entity", ...)`.
+7. **Update topic pages** if the document touches an existing synthesis topic.
+8. A single document ingest will typically touch 3–10 wiki pages.
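
Step 6 can be sketched as a small renderer: given one extracted entity, build the markdown for its entity page, then hand the result to `wiki_store.write_page`. The helper name and page layout below are illustrative, not part of the skill's API.

```python
# Hypothetical helper for step 6: render an entity page from one extracted
# entity dict. The layout (frontmatter, Evidence section) is an assumption.
def render_entity_page(entity: dict, doc_id: str) -> str:
    name = entity["name"]
    return (
        f"---\ntitle: {name}\ntags: [entity, {entity['type']}]\n---\n\n"
        f"# {name}\n\n"
        f"**Type:** {entity['type']}\n\n"
        f"## Evidence\n\n"
        f'- "{entity["supporting_text"]}" (source doc: {doc_id})\n'
    )

page = render_entity_page(
    {"name": "memory leak", "type": "issue",
     "supporting_text": "memory leaks cause crashes"},
    doc_id="doc_001",
)
# The agent would then persist it:
# wiki_store.write_page(category="entity", title="memory leak", content=page)
```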
+
+### Query
+
+When a user asks a question:
+
+1. **Check the wiki first** — `wiki_store.search_wiki(query)` to find relevant pages. Read them.
+2. If the wiki has a good answer, synthesize from wiki pages (fast path).
+3. If deeper graph traversal is needed, call `skill.query_with_evidence(query)`.
+4. Return the answer with evidence citations from `supporting_documents`.
+5. If the answer is valuable, file it back as a new wiki topic page.
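
The wiki-first decision above can be sketched as a tiny dispatcher. Here `search_fn` and `graph_fn` are stand-ins for `wiki_store.search_wiki` and `skill.query_with_evidence`; the `min_score` threshold is an assumption, not a skill config value.

```python
# Sketch of the wiki-first query flow; search_fn/graph_fn are stand-ins for
# wiki_store.search_wiki and skill.query_with_evidence respectively.
def answer(query: str, search_fn, graph_fn, min_score: int = 2) -> dict:
    pages = search_fn(query)
    # Fast path: the best wiki hit overlaps strongly enough with the query.
    if pages and pages[0].get("score", 0) >= min_score:
        return {"path": "wiki", "pages": pages}
    # Slow path: fall back to graph traversal with evidence.
    return {"path": "graph", "result": graph_fn(query)}

# Usage with stub functions:
result = answer(
    "why does the system crash",
    search_fn=lambda q: [{"slug": "system-crash", "score": 3}],
    graph_fn=lambda q: {},
)
```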
+
+### Lint
+
+Periodically health-check the wiki:
+
+```python
+from scripts.tools import wiki_store
+issues = wiki_store.lint_wiki()
+# Returns: {orphan_pages, missing_pages, broken_wikilinks, isolated_pages}
+```
+
+Ask the LLM to review and fix: broken links, orphan pages, stale claims, missing cross-references. See `references/lint.md` for full lint workflow.
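
One way to turn a lint report into actionable fixes is a small triage pass over the returned dict. `triage` below is illustrative and not part of the skill; it only assumes the four keys that `lint_wiki()` documents.

```python
# Illustrative triage: convert lint_wiki() output into an agent to-do list.
def triage(report: dict) -> list[str]:
    tasks = []
    tasks += [f"add index entry for orphan page [[{s}]]"
              for s in report["orphan_pages"]]
    tasks += [f"drop stale index row for [[{s}]]"
              for s in report["missing_pages"]]
    for slug, links in report["broken_wikilinks"].items():
        tasks += [f"fix broken link [[{lnk}]] in {slug}" for lnk in links]
    tasks += [f"cross-link isolated page [[{s}]]"
              for s in report["isolated_pages"]]
    return tasks

tasks = triage({
    "orphan_pages": ["old-notes"],
    "missing_pages": [],
    "broken_wikilinks": {"summary-1": ["ghost-page"]},
    "isolated_pages": [],
})
```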
+
+---
+
+## Ingestion Constraints
+
+- ❌ Do NOT hallucinate entities not present in the text
+- ❌ Do NOT add relations without explicit textual evidence
+- ❌ Do NOT add edges with confidence < 0.6
+- ✅ Provide `supporting_text` for every entity and relation — this enables provenance
+- ✅ Write a wiki summary page for every ingested document
+- ✅ Update existing entity pages when new information arrives
+- ✅ Flag contradictions in wiki pages when new data conflicts with old claims
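
The constraints above can be enforced mechanically before calling `ingest_with_content`. This validator is a sketch: the threshold mirrors the `MIN_CONFIDENCE` config quoted in the retrieval constraints, and the function name is illustrative.

```python
MIN_CONFIDENCE = 0.6  # mirrors the MIN_CONFIDENCE config value; assumption

def validate_extraction(entities: list[dict], relations: list[dict]):
    """Drop items that violate the ingestion constraints (illustrative)."""
    # Every entity must carry supporting_text for provenance.
    ok_entities = [e for e in entities if e.get("supporting_text")]
    names = {e["name"] for e in ok_entities}
    ok_relations = [
        r for r in relations
        if r.get("supporting_text")
        and r.get("confidence", 0.0) >= MIN_CONFIDENCE
        # A relation may only connect entities that survived validation.
        and r["source"] in names and r["target"] in names
    ]
    return ok_entities, ok_relations

ents = [{"name": "memory leak", "type": "issue",
         "supporting_text": "memory leaks cause crashes"}]
rels = [{"source": "memory leak", "target": "system crash", "type": "causes",
         "confidence": 0.9, "supporting_text": "crashes due to memory leaks"}]
validate_extraction(ents, rels)  # relation dropped: "system crash" not extracted
```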
+
+---
+
+## Retrieval Constraints
+
+- 🔒 Traversal depth MUST NOT exceed 2 (config: MAX_GRAPH_DEPTH)
+- 🔒 Only edges with confidence ≥ 0.6 (config: MIN_CONFIDENCE)
+- 🔒 Maximum 50 nodes returned (config: MAX_NODES)
+- ❌ Do NOT fabricate nodes or edges not in the graph
+
+---
+
+## Full Python API Reference
+
+| Method | Purpose | When to Use |
+|--------|---------|-------------|
+| `skill.ingest_with_content(doc_id, title, source, raw_content, entities, relations)` | Full RAG ingest: raw docs + graph + provenance | Every new document |
+| `skill.add_node(name, node_type)` | Add single entity (no provenance) | Quick additions without a source doc |
+| `skill.add_edge(source_name, target_name, relation, confidence)` | Add single relation | Quick additions without a source doc |
+| `skill.query(query)` | Graph-only retrieval → subgraph | Structural queries |
+| `skill.query_with_evidence(query)` | Graph + provenance → subgraph + source chunks | Queries requiring citations |
+| `wiki_store.write_page(category, title, content, summary)` | Write/update a wiki page | After every ingest; after answering queries |
+| `wiki_store.read_page(category, title)` | Read a wiki page | Before answering; for cross-referencing |
+| `wiki_store.search_wiki(query)` | Keyword search across wiki | Fast path before graph traversal |
+| `wiki_store.list_pages(category)` | List all wiki pages | Getting an overview |
+| `wiki_store.get_log(last_n)` | Read recent operations | Understanding wiki history |
+| `wiki_store.lint_wiki()` | Health check | Periodic maintenance |
+| `documents_store.list_documents()` | List all ingested raw sources | Audit / provenance checking |
+| `documents_store.search_chunks(query)` | Chunk-level search | Finding specific evidence |
+
+---
+
+## Design Philosophy
+
+> "The wiki is a persistent, compounding artifact. The cross-references are already there. The synthesis already reflects everything you've read." — Karpathy
+
+| Layer | What Happens | Who Owns It |
+|-------|-----------|-------------|
+| **LLM Reasoning** | Extraction, synthesis, writing wiki pages | Agent (.md guidance files) |
+| **Wiki Persistence** | Index, log, file I/O | `wiki_store.py` |
+| **Graph Persistence** | Dedup, index, BFS traverse | `graph_store.py`, `retrieval_engine.py` |
+| **Raw Source Storage** | Immutable docs + chunks + provenance | `documents_store.py` |
+
+The human curates sources and asks questions. The LLM writes the wiki, extracts the graph, and answers with citations. Python handles all bookkeeping.
+