8 changes: 5 additions & 3 deletions CLAUDE.md
@@ -282,7 +282,7 @@ gitsema index
- **ORM:** Drizzle ORM (`src/core/db/schema.ts`)
- **Add to `.gitignore`:** `.gitsema/`

**Schema overview (current schema v22):**

| Table | Purpose |
|---|---|
@@ -304,8 +304,9 @@ gitsema index
| `blob_clusters` | K-means cluster assignments |
| `cluster_assignments` | Cluster snapshot entries per ref |
| `module_embeddings` | Directory centroid running-mean embeddings (Phase 33) |
| `embed_config` | Recorded embedding provenance (model, dimensions, chunker); `kind` column distinguishes `'embedding'` vs `'narrator'` configs |
| `indexing_checkpoints` | Resume markers for interrupted indexing runs |
| `settings` | Key-value table for persistent settings (e.g. `active_narrator_model_config_id`) |

**FTS5 note:** Blobs indexed before Phase 11 have no FTS5 content. `--hybrid` search only applies to blobs with FTS5 entries. `--include-content` in evolution dumps also depends on FTS5 content. Use `gitsema backfill-fts` to populate FTS5 content for older index entries.

@@ -320,7 +321,8 @@ gitsema index
- v18 → v19: Added `embed_config` table for embedding provenance (Phase 80+)
- v19 → v20: Added `UNIQUE (blob_hash, path)` index on `paths` table (review6 §11.6 / Phase 89)
- v20 → v21: Hashed repo tokens at rest — `token_hash` + `token_prefix` replace plaintext `token` in `repo_tokens` (review7 §4.1)
- v21 → v22: Added `kind` + `params_json` columns to `embed_config`; added `settings` table (narrator model config)
- **Current version: 22**

Schema changes require updating both `src/core/db/schema.ts` and the migration logic in `src/core/db/sqlite.ts`.

61 changes: 61 additions & 0 deletions docs/PLAN.md
@@ -3114,3 +3114,64 @@ embedding provider (Ollama, OpenAI-compatible HTTP, embedeer). This enables:
**Documentation:** `CLAUDE.md` schema overview updated to v21 + migration v20→v21 entry added. `docs/deploy.md` table of contents updated with §11.

**Status:** ✅ complete.

---

## Phase 91 — LLM Narrator/Explainer via chattydeer (DB-backed config)

**Status:** ✅ complete.

**Goals:**
- Add LLM-powered `gitsema narrate` and `gitsema explain` commands.
- Use `@jsilvanus/chattydeer` (split out from embedeer) for LLM narration.
- Store narrator model configs in the DB (embed_config table, `kind='narrator'`).
- Manage narrator models via the existing `gitsema models` system.
- HTTP parity: `POST /api/v1/narrate`, `POST /api/v1/explain`.
- MCP parity: `narrate_repo`, `explain_issue_or_error`.
- Safe-by-default: no remote calls unless configured; redaction before every LLM call.
- Auditable output with commit hash citations.

**Schema changes (v22):**
- `embed_config.kind TEXT DEFAULT 'embedding'` — distinguishes embedding vs narrator configs.
- `embed_config.params_json TEXT` — narrator-specific params (httpUrl, apiKey, maxTokens, temperature).
- `settings` table — key-value table for persistent settings (`active_narrator_model_config_id`).
- `CURRENT_SCHEMA_VERSION` bumped to **22**.
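
The migration steps above could be sketched roughly as follows. This is a hypothetical outline, not the actual migration logic (which lives in `src/core/db/sqlite.ts`); the `run` callback stands in for whatever statement executor that module uses, and only the column/table names come from the list above:

```typescript
// Hypothetical sketch of the v21 -> v22 migration described above.
// `run` stands in for the statement executor in src/core/db/sqlite.ts.
function applyV22Migration(run: (sql: string) => void): void {
  // Distinguish embedding configs from narrator configs.
  run(`ALTER TABLE embed_config ADD COLUMN kind TEXT DEFAULT 'embedding'`);
  // Narrator-specific params (httpUrl, apiKey, maxTokens, temperature) as JSON.
  run(`ALTER TABLE embed_config ADD COLUMN params_json TEXT`);
  // Generic key-value settings, e.g. active_narrator_model_config_id.
  run(`CREATE TABLE IF NOT EXISTS settings (
    key TEXT PRIMARY KEY,
    value TEXT
  )`);
}
```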

**New packages:**
- `@jsilvanus/chattydeer@^0.2.0` — added (LLM explainer/narrator, split from embedeer).
- `@jsilvanus/embedeer@^1.3.2` — updated to latest.

**New modules:**
- `src/core/narrator/types.ts` — NarratorProvider interface, NarratorModelConfig, etc.
- `src/core/narrator/redact.ts` — secret-pattern redaction (10 patterns).
- `src/core/narrator/audit.ts` — structured audit logging per narration call.
- `src/core/narrator/chattydeerProvider.ts` — ChattydeerNarratorProvider adapter.
- `src/core/narrator/resolveNarrator.ts` — DB-backed config CRUD + active narrator resolution.
- `src/core/narrator/narrator.ts` — git log parsing, event classification, map-reduce summarisation.
- `src/core/narrator/index.ts` — barrel exports.

**New CLI commands:**
- `gitsema narrate [--since] [--until] [--range] [--focus] [--format] [--max-commits] [--narrator-model-id] [--model]`
- `gitsema explain <topic> [--since] [--until] [--log] [--format] [--narrator-model-id] [--model]`
- `gitsema models narrator-list [--json]`
- `gitsema models narrator-add <name> --http-url <url> [--key] [--max-tokens] [--temperature] [--activate]`
- `gitsema models narrator-activate <name>`
- `gitsema models narrator-remove <name>`

**New HTTP routes (under /api/v1/):**
- `POST /narrate`
- `POST /explain`
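
A client call might look like the following sketch. The request-body field names mirror the CLI flags above, but the actual wire format is an assumption:

```typescript
// Hypothetical request builder for POST /api/v1/narrate. Field names mirror
// the CLI flags above; the real wire format may differ.
interface NarrateRequestBody {
  since?: string;
  until?: string;
  range?: string;
  focus?: string;
  format?: string;
  maxCommits?: number;
}

function buildNarrateRequest(
  body: NarrateRequestBody,
): { method: string; headers: Record<string, string>; body: string } {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  };
}

// Usage against a running gitsema HTTP server (base URL is deployment-specific):
// await fetch(`${baseUrl}/api/v1/narrate`,
//   buildNarrateRequest({ since: '2024-01-01', focus: 'auth' }));
```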

**New MCP tools:**
- `narrate_repo`
- `explain_issue_or_error`

**Tests:**
- `tests/narratorRedact.test.ts` — 19 tests (redaction patterns).
- `tests/narratorConfig.test.ts` — 15 tests (DB-backed config CRUD, active selection).
- `tests/narratorSmoke.test.ts` — 9 tests (CLI handler shape invariants).
- All 787 tests pass.

**Documentation:**
- `docs/plan_LLM.md` — full implementation plan (goals, API surface, pipeline, security, tests, schema).
- `CLAUDE.md` schema overview updated to v22 + migration v21→v22 entry added.
201 changes: 201 additions & 0 deletions docs/chattydeer_contract.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,201 @@
# `@jsilvanus/chattydeer` — Contract for gitsema-guide function calling

> This document describes the additional API surface that gitsema requires from
> `@jsilvanus/chattydeer` to power the `gitsema guide` interactive chat with
> full function-call (tool-call) execution.
> The current chattydeer version (`0.2.0`) satisfies narration/explain.
> Everything in this document is **pending** — it will be implemented in
> chattydeer once this contract is agreed.

---

## Background

`gitsema guide` needs an **agentic loop**:

1. User asks a question.
2. The LLM decides which gitsema tools to call (e.g. `semantic_search`,
   `recent_commits`, `file_evolution`).
3. chattydeer executes the tool calls and feeds results back.
4. Repeat until the LLM returns a final answer.

The current chattydeer `Explainer` API (single-shot text generation) is
sufficient for narrate/explain but cannot support this loop.

---

## Required additions to `@jsilvanus/chattydeer`

### 1. `ChatSession` — multi-turn conversation object

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool'
  content: string
  /** Present when role='tool' — the name of the tool that produced this result. */
  toolName?: string
  /** Present when role='assistant' — the tool calls the LLM wants to make. */
  toolCalls?: ToolCall[]
}

interface ToolCall {
  id: string                          // unique per call (passed back in tool result)
  name: string                        // tool name (e.g. 'semantic_search')
  arguments: Record<string, unknown>  // parsed JSON args
}

interface ChatSession {
  messages: ChatMessage[]
  append(msg: ChatMessage): void
}
```

### 2. `ChatCompletionProvider` — streaming + tool-calling chat

```typescript
interface ToolDefinition {
  name: string
  description: string
  /** JSON Schema object describing the parameters. */
  parameters: Record<string, unknown>
}

interface ChatCompletionRequest {
  session: ChatSession
  tools?: ToolDefinition[]
  /** Max tokens for this completion turn. */
  maxTokens?: number
  /** Temperature (0 = deterministic). */
  temperature?: number
  /** If true, stream token deltas via AsyncIterable<string>. */
  stream?: boolean
}

interface ChatCompletionResponse {
  message: ChatMessage  // the assistant's response (text or tool_calls)
  tokensUsed: number
  finishReason: 'stop' | 'tool_calls' | 'length' | 'error'
}

interface ChatCompletionProvider {
  /** Single completion turn (non-streaming). */
  complete(req: ChatCompletionRequest): Promise<ChatCompletionResponse>
  /** Streaming completion; yields token deltas until finish. */
  stream(req: ChatCompletionRequest): AsyncIterable<{ delta: string; done: boolean }>
  destroy(): Promise<void>
}
```

### 3. `AgentLoop` — agentic tool-execution helper

```typescript
interface AgentLoopOptions {
  provider: ChatCompletionProvider
  tools: ToolDefinition[]
  /** Callback invoked by AgentLoop to execute a tool call. */
  executeTool(call: ToolCall): Promise<string>
  /** Maximum number of LLM → tool → LLM roundtrips (default: 5). */
  maxRoundtrips?: number
  /** Callback for observing intermediate messages (optional). */
  onMessage?: (msg: ChatMessage) => void
}

interface AgentLoopResult {
  answer: string          // final assistant text
  messages: ChatMessage[] // full conversation history
  roundtrips: number      // number of tool-call roundtrips used
}

/** Run an agentic loop until the LLM produces a final answer (no more tool_calls). */
function runAgentLoop(session: ChatSession, opts: AgentLoopOptions): Promise<AgentLoopResult>
```
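
As a sanity check on the semantics above, the loop can be reduced to the following self-contained reference sketch. It uses its own minimal inline types and illustrates the intended contract only; it is not the chattydeer implementation:

```typescript
// Reference sketch of the agent-loop semantics, with minimal inline types.
// Illustrates the contract above; NOT the chattydeer code.
type Role = 'system' | 'user' | 'assistant' | 'tool';

interface MiniToolCall { id: string; name: string; arguments: Record<string, unknown> }
interface MiniMsg { role: Role; content: string; toolName?: string; toolCalls?: MiniToolCall[] }
interface MiniProvider { complete(messages: MiniMsg[]): Promise<MiniMsg> }

async function runLoopSketch(
  messages: MiniMsg[],
  provider: MiniProvider,
  executeTool: (call: MiniToolCall) => Promise<string>,
  maxRoundtrips = 5,
): Promise<{ answer: string; roundtrips: number }> {
  let roundtrips = 0;
  for (;;) {
    const reply = await provider.complete(messages);
    messages.push(reply);
    // No tool calls: the model has produced its final answer.
    if (!reply.toolCalls || reply.toolCalls.length === 0) {
      return { answer: reply.content, roundtrips };
    }
    if (++roundtrips > maxRoundtrips) {
      throw new Error(`agent loop exceeded ${maxRoundtrips} roundtrips`);
    }
    // Execute each requested tool and feed results back as role='tool' turns.
    for (const call of reply.toolCalls) {
      messages.push({ role: 'tool', content: await executeTool(call), toolName: call.name });
    }
  }
}
```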

### 4. Factory function

```typescript
/**
 * Create a ChatCompletionProvider backed by any OpenAI-compatible endpoint.
 *
 * @param httpUrl Base URL (e.g. 'https://api.openai.com')
 * @param model   Model name (e.g. 'gpt-4o-mini')
 * @param apiKey  Optional bearer token
 */
function createChatProvider(
  httpUrl: string,
  model: string,
  apiKey?: string,
): ChatCompletionProvider
```

### 5. OpenAI-compatible `/v1/chat/completions` pass-through (optional)

For `gitsema tools serve` to expose a full OpenAI-compatible HTTP endpoint,
chattydeer should optionally provide an Express middleware / handler:

```typescript
/**
 * Returns an Express RequestHandler that proxies POST /v1/chat/completions
 * requests to the configured provider, with optional tool injection.
 *
 * gitsema uses this to expose its tool registry as OpenAI function calls
 * to any compatible client (e.g. Claude Desktop, Continue.dev).
 */
function createOpenAiChatHandler(
  provider: ChatCompletionProvider,
  tools?: ToolDefinition[],
  executeTool?: (call: ToolCall) => Promise<string>,
): RequestHandler
```

---

## gitsema tool registry (to be exposed as function calls)

The following gitsema internal tools will be registered with the agent loop:

| Tool name | Description | Key parameters |
|-----------------------|-----------------------------------------------|--------------------------|
| `semantic_search` | Vector similarity search over git history | `query`, `topK` |
| `recent_commits` | Fetch N most recent commits | `n` |
| `file_evolution` | Semantic drift of a single file | `path`, `since`, `until` |
| `concept_evolution` | Concept drift across the codebase | `query`, `topK` |
| `repo_stats` | Basic repository statistics | — |
| `narrate_repo` | Return commit evidence for a range | `since`, `until`, `focus`|
| `explain_topic` | Return commits matching a topic | `topic`, `since`, `until`|
| `branch_summary` | Semantic summary of a branch vs main | `branch` |

---

## Redaction requirement

Before any user/tool content is sent to a remote provider, the chattydeer
`AgentLoop` must support a `redactContent` hook:

```typescript
interface AgentLoopOptions {
  // ... (existing fields)
  /** Optional: called on every message before it leaves gitsema; returns the redacted text. */
  redactContent?: (text: string) => string
}
```

gitsema will wire `redactAll` from `src/core/narrator/redact.ts` here.
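
The wiring could look roughly like the sketch below. `maskApiKeys` is a stand-in for illustration only; the real `redactAll` covers ten secret patterns:

```typescript
// Stand-in redactor illustrating the redactContent hook. gitsema's real
// redactAll (src/core/narrator/redact.ts) covers ten secret patterns; this
// sketch masks only one, bearer-style `sk-` API keys.
function maskApiKeys(text: string): string {
  return text.replace(/\bsk-[A-Za-z0-9]{8,}\b/g, '[REDACTED]');
}

// How gitsema would pass it into the loop options:
// const opts: AgentLoopOptions = { /* ... */ redactContent: maskApiKeys };
```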

---

## Versioning

- Targeting chattydeer `>= 0.3.0` for `ChatCompletionProvider` + `AgentLoop`.
- OpenAI pass-through handler targeted for `>= 0.4.0`.
- gitsema will pin `@jsilvanus/chattydeer@^0.3.0` once these are released.

---

## Acceptance criteria for gitsema

- [ ] `gitsema guide "What does the auth module do?"` performs semantic_search, feeds results to LLM, returns answer.
- [ ] `gitsema guide --interactive` supports multi-turn conversation with tool calls.
- [ ] `POST /api/v1/guide/chat` uses the agent loop (same tools, same redaction).
- [ ] `POST /v1/chat/completions` on the gitsema HTTP server is OpenAI-compatible.
- [ ] All tool call arguments and results are redacted before leaving the process.
- [ ] Agent loop terminates within `maxRoundtrips` (default 5).