Local-first persistent memory engine for AI coding agents.
Why: AI agents lose context between sessions. Agentrete remembers your preferences, project decisions, and past pitfalls — automatically recalled in future conversations.
How: Single Rust binary. Embedded SQLite + sqlite-vec KNN + Model2Vec (131MB minilm-256d default). Exposes MCP tools over HTTP or stdio. Cross-platform hooks for Codex CLI, Claude Code, and more.
    # 1. Install hooks + configure AI tools
    npx agentrete setup
    # 2. Start background service
    npx agentrete daemon install
    # 3. Verify
    npx agentrete stats

- Setup auto-detects your AI tools (Codex, Claude, Cursor, etc.), writes MCP config, and installs hooks.
- Daemon registers a systemd/launchd service that auto-starts on boot.
- Stats should show `Memories: 0`, which means the engine is ready. Restart your AI tool and start working.

    git clone git@github.com:dyrnq/agentrete.git
    cd agentrete
    cargo build
    ./target/debug/agentrete daemon install --port 9092
    ./target/debug/agentrete setup

- Restart your AI tool
- Hooks auto-search relevant memories on session start
- Use `memory_save` to persist decisions, patterns, bugs
- Run `memory_stats` to see what's stored

Want semantic search? Add a model to `~/.agentrete/config.toml`:

    [embedding]
    backend = "model2vec"
    [embedding.model2vec]
    model = "minilm-256d"
    dims = 256

Without this, agentrete uses fast BM25 text search, with no model download needed.

| Command | Description |
|---|---|
| `save` | Save a memory |
| `search` | Semantic search (vec0 KNN + FTS5 BM25 → RRF fusion) |
| `list` | List recent memories |
| `stats` | Database statistics |
| `forget` | Delete by ID |
| `wipe` | Delete all memories |
| `init` | Initialize project |
| `doctor` | Run diagnostics |
| `setup` | Auto-detect AI tools and configure MCP + hooks |
| `daemon` | OS-native background service (systemd/launchd) |
| `mcp` | Start MCP server (HTTP or stdio) |
| `scan` | Scan codebase and build knowledge graph |
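
The `search` row mentions RRF fusion of the KNN and BM25 result lists. As a rough illustration of that technique (not agentrete's actual code), the sketch below applies standard Reciprocal Rank Fusion to two ranked ID lists. The function name `rrf_fuse`, the constant `k=60`, and the memory IDs are assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used constant and is
    an assumption here, not a value taken from agentrete.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


knn_hits = ["m3", "m1", "m7"]   # hypothetical vector (KNN) ranking
bm25_hits = ["m1", "m9", "m3"]  # hypothetical BM25 ranking
fused = rrf_fuse([knn_hits, bm25_hits])
print(fused)
```

IDs that appear in both lists accumulate two reciprocal-rank terms, so they tend to outrank IDs found by only one retriever.
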
| Tool | Description |
|---|---|
| `memory_search` | Semantic search (vec0 KNN + FTS5 BM25 → RRF fusion + temporal decay) |
| `memory_save` | Save memory with auto-detect project from git, `dry_run` preview |
| `memory_list` | List recent memories, optionally filtered by type |
| `memory_forget` | Delete by ID |
| `memory_stats` | DB statistics (schema version, type counts, model info, vec0 status) |
| `memory_compact` | Deduplicate (exact or semantic by cosine threshold) + reclaim disk |
| `kg_query` | Knowledge graph query (neighbors, path, subgraph by predicate/direction/project) |
| `kg_scan` | Start background codebase scan with ast-grep (incremental via file hash cache) |
| `kg_scan_status` | Check if a background scan is running |
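
To make "semantic by cosine threshold" in the `memory_compact` row concrete, here is a minimal sketch of threshold-based deduplication. It only illustrates the general idea: the function names, the `0.95` threshold, and the toy 3-dimensional embeddings are assumptions, and the real tool may cluster or pick survivors differently.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def find_duplicates(memories, threshold=0.95):
    """Return IDs whose embedding is within `threshold` of an earlier kept memory.

    `memories` is a list of (id, embedding) pairs in insertion order;
    the first member of each near-duplicate group is kept.
    """
    kept, dupes = [], []
    for mem_id, vec in memories:
        if any(cosine(vec, kept_vec) >= threshold for _, kept_vec in kept):
            dupes.append(mem_id)
        else:
            kept.append((mem_id, vec))
    return dupes


memories = [  # hypothetical rows with toy embeddings
    (1, [1.0, 0.0, 0.0]),
    (2, [0.99, 0.05, 0.0]),  # nearly parallel to memory 1
    (3, [0.0, 1.0, 0.0]),
]
dupes = find_duplicates(memories)
print(dupes)
```

A lower threshold merges more aggressively, so it is worth previewing the candidate IDs before deleting anything.
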
- Embedded: Single binary, no external DB or API required
- Semantic search: 256-1024d vector search via Model2Vec + sqlite-vec KNN, hybrid RRF fusion with FTS5 BM25
- Knowledge graph: SPO triple store (petgraph + SQLite), incremental codebase scan via ast-grep (16 languages) with per-file hash cache, force re-scan support
- Cross-platform: Linux, macOS, Windows — all with native hooks (bash/PowerShell)
- MCP protocol: 2024-11-05, 2025-06-18, 2025-11-25 with version negotiation
- Hooks: 9 Codex events + 2 Claude Code events, all embedded at compile time
- Model distillation: 9 sentence-transformers models distillable to Model2Vec (10-497MB)
- 8 agents supported: Codex CLI, Claude Code, Cursor, Zed, OpenCode, Windsurf, Goose, Gemini CLI
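
The version negotiation mentioned in the MCP protocol bullet can be sketched as follows. This follows the general rule in the MCP lifecycle specification (echo the client's version when supported, otherwise answer with a version the server supports, ideally the latest); how agentrete implements it internally is not shown in this README, and the helper names and `clientInfo` values are placeholders.

```python
SUPPORTED = ["2024-11-05", "2025-06-18", "2025-11-25"]  # versions listed above


def negotiate(requested, supported=SUPPORTED):
    """Echo the client's version if supported, otherwise offer the newest one."""
    if requested in supported:
        return requested
    return max(supported)  # ISO-style dates sort correctly as strings


def initialize_request(version, request_id=1):
    """Build a JSON-RPC `initialize` request as an MCP client would send it."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": version,
            "capabilities": {},
            # Placeholder client identity, purely illustrative.
            "clientInfo": {"name": "example-client", "version": "0.0.0"},
        },
    }


request = initialize_request("2025-06-18")
agreed = negotiate(request["params"]["protocolVersion"])
print(agreed)
```
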

    Codex / Claude Code / Cursor / Zed / ...
            │
            ▼
    agentrete MCP (HTTP :9092 or stdio)
            │
            ├── Memory Engine ── SQLite + FTS5 + vec0 KNN + model2vec
            │       rules/decisions/patterns/bugs (semantic search)
            │
            └── Knowledge Graph ── SQLite kg_triples + petgraph (in-memory)
                    code scan via ast-grep (16 languages)
                    kg_query / kg_scan (with optional watch)

- Session auto-create: new session recorded on MCP `initialize`
- Observations: `memory_save`/`memory_search` calls auto-logged to observations table
- stop.sh hook: reminds AI to save key decisions every 10th session stop
- Save: text + metadata → SQLite (embedding=NULL; the embed worker picks the row up asynchronously)
- Embed: background worker batches pending rows → Model2Vec/remote API → writes embedding + vec0 index
- Search: query → vec0 KNN + FTS5 BM25 (run concurrently) → RRF fusion → temporal decay → ranked results
- Forget: hard delete (row + vec0 entry)
- Scan: `agentrete scan .` or `kg_scan` MCP → ast-grep scans codebase → SPO triples stored in SQLite
- Watch: `kg_scan` with `watch: true` → automatic re-scan on file changes (incremental via hash cache)
- Query: `kg_query` → petgraph in-memory traversal (neighbors, shortest path, filtered by predicate/direction/project)
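
As a rough picture of what a neighbor or shortest-path query over SPO triples involves, the sketch below runs a breadth-first search over a small in-memory triple list. It is plain Python rather than petgraph, and the triples, predicate names, and function names are hypothetical; it is meant only to illustrate the traversal idea behind the Query step.

```python
from collections import deque

triples = [  # hypothetical (subject, predicate, object) rows
    ("main.rs", "imports", "config.rs"),
    ("main.rs", "calls", "run_server"),
    ("run_server", "calls", "handle_request"),
    ("handle_request", "calls", "memory_search"),
]


def neighbors(triples, node, predicate=None):
    """Outgoing neighbors of `node`, optionally restricted to one predicate."""
    return [o for s, p, o in triples
            if s == node and (predicate is None or p == predicate)]


def shortest_path(triples, start, goal, predicate=None):
    """Breadth-first search along outgoing edges; returns a node list or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(triples, path[-1], predicate):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = shortest_path(triples, "main.rs", "memory_search", predicate="calls")
print(path)
```

Filtering by direction would mean also matching on the object side of each triple; that is omitted here to keep the sketch short.
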

Configuration via `~/.agentrete/config.toml` (TOML/YAML/JSON, env override with `AGENTRETE__*` prefix):

    port = 9092
    [embedding]
    backend = "model2vec" # "none" | "model2vec" | "remote"
    [embedding.model2vec]
    model = "minilm-256d"
    dims = 256
    # [embedding.remote]
    # url = "http://localhost:11434"
    # model = "qwen3-embedding:latest"

See `config-reference.toml` for all options.
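
The `AGENTRETE__*` prefix suggests that nested keys are addressed with a double underscore, for example `AGENTRETE__EMBEDDING__BACKEND` for `[embedding] backend`. That mapping is an assumption based on the prefix shape, not something this README spells out. Under that assumption, the overlay could look roughly like this; `apply_env_overrides` is a hypothetical helper, and the dict mirrors the TOML above.

```python
def apply_env_overrides(config, env, prefix="AGENTRETE__"):
    """Overlay prefixed environment variables onto a nested config dict.

    Assumed convention: strip the prefix, lowercase the rest, and split on
    '__' to get the key path. Values stay strings, as they arrive from the
    environment; a real loader would also coerce types.
    """
    for name, value in env.items():
        if not name.startswith(prefix):
            continue
        path = name[len(prefix):].lower().split("__")
        node = config
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return config


base = {  # same values as the TOML example
    "port": 9092,
    "embedding": {
        "backend": "model2vec",
        "model2vec": {"model": "minilm-256d", "dims": 256},
    },
}
env = {"AGENTRETE__EMBEDDING__BACKEND": "none", "HOME": "/home/user"}
merged = apply_env_overrides(base, env)
print(merged["embedding"]["backend"])
```

Variables without the prefix are ignored, so unrelated environment settings never leak into the config.
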