A new context database for reasoning-driven retrieval via tree search.
Fast, context-aware retrieval at scale with up to 70% less token cost.
ConDB (Context Database) is a tree-structured context database that uses LLM-powered, reasoning-based retrieval via tree search instead of vector similarity — no vector DB, no chunking. It accepts PageIndex-compatible document trees, ChatIndex conversation trees, filesystem trees, and custom hierarchical JSON, with no runtime dependency on any of those projects. The LLM reasons over the tree, like a human expert using a table of contents, to locate relevant content.
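The exact node schema is defined by ConDB's adapters; as a rough illustration only (the field names `title`, `summary`, and `children` are assumptions here, not the documented format), a custom hierarchical tree might look like:

```python
# Hypothetical custom hierarchical tree (illustrative; the real node schema
# is defined by ConDB's adapters and may use different field names).
doc_tree = {
    "title": "Annual Report",
    "summary": "Company performance overview for the fiscal year.",
    "children": [
        {
            "title": "Financial Results",
            "summary": "Revenue, margins, and cash flow.",
            "children": [],
        },
        {
            "title": "Risk Factors",
            "summary": "Market, regulatory, and operational risks.",
            "children": [],
        },
    ],
}

# A reasoning-based retriever can walk this like a table of contents:
titles = [child["title"] for child in doc_tree["children"]]
print(titles)  # → ['Financial Results', 'Risk Factors']
```

The point is that the hierarchy itself carries retrieval signal: section titles and summaries tell a reasoning model where to descend, with no embeddings involved.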
- Similarity ≠ relevance — vector search retrieves what looks similar, not what is truly relevant. Similar-looking chunks may differ in intent (low accuracy), while truly relevant information may be expressed in very different language and get missed entirely (low recall). True relevance requires reasoning.
- Chunking breaks semantic continuity — documents must be split into fixed-size segments to fit embedding models, causing context fragmentation that destroys their natural structure and cross-section relationships
- Retrieval is blind to context — embedding models encode the query alone, ignoring conversational history, user intent, and other contextual signals
ConDB replaces this with reasoning-based tree search: the LLM performs node-level relevance classification over a hierarchical index, incorporating full context — making retrieval adaptive, explainable, and traceable.
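That control flow can be sketched without the library. Below, a keyword-matching stub stands in for the LLM's relevance judgment; the function names, tree shape, and beam logic are illustrative assumptions, not ConDB's API:

```python
# Sketch of reasoning-driven tree search. A real system asks an LLM to judge
# node relevance from titles/summaries plus full query context; a keyword
# stub stands in for that call here (illustrative only).
def llm_relevance(node: dict, query: str) -> float:
    text = (node["title"] + " " + node.get("summary", "")).lower()
    return sum(word in text for word in query.lower().split())

def tree_search(root: dict, query: str, beam_size: int = 2) -> list[dict]:
    frontier, hits = [root], []
    while frontier:
        # Score every child of the current frontier, keep the best branches.
        children = [c for node in frontier for c in node.get("children", [])]
        scored = sorted(children, key=lambda c: llm_relevance(c, query), reverse=True)
        frontier = scored[:beam_size]
        hits.extend(c for c in frontier if llm_relevance(c, query) > 0)
    return hits

tree = {
    "title": "Handbook",
    "children": [
        {"title": "Security policy", "summary": "passwords, access control",
         "children": [{"title": "Password rotation",
                       "summary": "rotate every 90 days", "children": []}]},
        {"title": "Office guide", "summary": "desks and parking", "children": []},
    ],
}

results = tree_search(tree, "password rotation policy")
print([n["title"] for n in results])  # → ['Security policy', 'Password rotation']
```

Because the scorer receives the whole node plus the query, it can in principle incorporate conversation history and user intent — exactly what a query-only embedding cannot do.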
- Fast tree search at scale — reasoning-driven tree search with block partitioning and parallel processing, supporting complex, context-aware retrieval over large hierarchical structures
- KV-cache native — the first database designed around LLM KV-cache reuse. By caching intermediate results during tree search, ConDB reduces token usage by up to 70% with no loss in accuracy. The same efficiency gains extend to memory systems for long-context reasoning at scale
- Unified long-context infrastructure — a single system for both static and dynamic long-context workloads
- Structured, persistent knowledge — documents (via PageIndex), file systems, and codebases. Scalable retrieval within large, organized hierarchies.
- Evolving, runtime context — agent memory, long conversations (via ChatIndex), and autoresearch. Systems can continuously update, retrieve, and reason over newly generated information.
- Hierarchical storage — document trees, chat trees, and custom hierarchical JSON in SQLite
- Multiple retrieval strategies — beam search for small trees, block retrieval for large documents
- Multi-provider LLM support — Anthropic (Claude) and OpenAI (GPT) out of the box
- Extensible — plug in custom storage backends, LLM providers, or retrieval strategies
```shell
pip install -r requirements.txt
```

```python
import contextdb

# Open database
db = contextdb.open("my_docs.sqlite")

# Configure LLM
db.set_llm(provider="anthropic", model="claude-sonnet-4-6")

# Store a document tree
tree_id = db.store(document_tree_json, format="document")

# Query with LLM reasoning
result = db.query(tree_id, "What are the key findings?")
print(result.contents)
```

```python
from contextdb import ContextTree

def build_markdown_tree(path: str) -> dict:
    ...

ct = ContextTree("context.sqlite")
tree_id = ct.index_markdown_file("doc.md", tree_builder=build_markdown_tree)

# You can also generate a tree out of process and call:
# tree_id = ct.index_document_tree(document_tree_json)

ct.close()
```

Create a `.env` file with your API keys:

```shell
ANTHROPIC_API_KEY=sk-...
OPENAI_API_KEY=sk-...
```
Model and provider settings live in `contextdb/config/config.yaml`:

```yaml
llm:
  provider: anthropic       # anthropic or openai
  model: claude-sonnet-4-6  # any model the provider supports
  context_limit: 100000
  max_concurrent: 10
retriever:
  beam_size: 3
  max_turns: 5
```

Override at runtime with environment variables:

```shell
LLM_MODEL=claude-opus-4-6 python your_script.py
```

ConDB automatically selects the best retrieval strategy based on tree size:
| Strategy | Best for | How it works |
|---|---|---|
| Beam | Small trees (< 50 nodes) | LLM evaluates and selects promising branches at each depth level |
| Block | Large documents (50+ nodes) | Splits the tree into token-bounded blocks and reasons over each block. KV-cache native — caches intermediate block results to cut token usage by up to 70% |
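The idea behind block partitioning can be sketched roughly as follows. This is not ConDB's actual implementation: the token accounting is a crude word count, and all names and budgets below are illustrative assumptions. Nodes are flattened and packed greedily so each block stays under a token budget, letting one LLM call reason over many nodes at once (and letting a shared prompt prefix be KV-cached across calls):

```python
# Sketch of token-bounded block partitioning (illustrative only).
def flatten(node: dict, path: str = "") -> list[tuple[str, str]]:
    """Flatten a tree into (path-label, summary) pairs, depth-first."""
    label = f"{path}/{node['title']}" if path else node["title"]
    out = [(label, node.get("summary", ""))]
    for child in node.get("children", []):
        out.extend(flatten(child, label))
    return out

def make_blocks(nodes: list[tuple[str, str]], budget: int = 50) -> list[list[tuple[str, str]]]:
    """Greedily pack flattened nodes into blocks under a token budget."""
    blocks, current, used = [], [], 0
    for label, summary in nodes:
        cost = len((label + summary).split())  # crude stand-in for a tokenizer
        if current and used + cost > budget:
            blocks.append(current)
            current, used = [], 0
        current.append((label, summary))
        used += cost
    if current:
        blocks.append(current)
    return blocks

tree = {"title": "Docs", "children": [
    {"title": f"Section {i}", "summary": "overview of topic " + "x " * 10, "children": []}
    for i in range(8)
]}
blocks = make_blocks(flatten(tree), budget=50)
print(len(blocks))  # → 3
```

Each block then becomes one relevance-classification call instead of one call per node, which is where the parallelism and caching savings come from.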
You can also specify a strategy explicitly:
```python
result = db.query(tree_id, "question", strategy="block", beam_size=3)
```

The current filesystem benchmark summary lives in `bench/fs_block_beam_vertical.md`.
Run setup: `fs_query_order=prefix`, `beam_size=3`, `max_turns=10`, 5 filesystem queries on the context7 corpus only.
| Retriever | Avg Time (s) | Avg LLM Calls | Hit@1 | Hit@10 | Total Cost (USD) |
|---|---|---|---|---|---|
| Block | 8.44 | 2.4 | 1.00 | 1.00 | 0.2166 |
| Vertical | 28.18 | 6.8 | 0.40 | 1.00 | 0.2900 |
| Beam | 18.36 | 4.8 | 0.60 | 1.00 | 0.2091 |

| Retriever | Avg Time (s) | Avg LLM Calls | Hit@1 | Hit@10 | Total Cost (USD) |
|---|---|---|---|---|---|
| Block | 8.42 | 3.4 | 1.00 | 1.00 | 0.0643 |
| Vertical | 20.78 | 7.0 | 0.40 | 0.80 | 0.1712 |
| Beam | 17.84 | 4.8 | 0.40 | 1.00 | 0.1335 |
Block is the best default: perfect Hit@1 across both models, lowest cost on Sonnet 4.6 (prompt caching cuts cost by ~60%), and fastest latency. Beam and Vertical are sensitive to model version — Block is the most robust choice.
These numbers are benchmark snapshots, not hard guarantees; exact cost and latency will vary with model choice, provider pricing, prompt-cache behavior, and corpus shape.
```
contextdb/
├── api/
│   ├── condb.py         # ConDB — main entry point
│   └── context_tree.py  # ContextTree — tree indexing + query API
├── core/
│   └── storage.py       # TreeDB (SQLite), StorageProtocol
├── adapter/
│   └── base.py          # DocumentTree, ChatIndex, Generic adapters
├── retriever/
│   ├── base.py          # Retriever protocols
│   └── algorithm/       # Beam, Block retrieval strategies
├── llm.py               # LLMClient (Anthropic, OpenAI)
├── config/              # YAML configs for retrievers
└── prompts/             # Jinja2 prompt templates
```
Custom Storage Backend
```python
from contextdb import StorageProtocol

class MyStorage:
    def get_node(self, tree_id, node_id): ...
    def get_children(self, tree_id, node_id): ...
    # ...implement the remaining StorageProtocol methods

ct = ContextTree(storage=MyStorage())
```

Custom LLM Provider

```python
from contextdb import LLMProtocol

class MyLLM:
    def chat(self, messages, system="", tools=None):
        return {"content": [...], "stop_reason": "..."}

ct = ContextTree("db.sqlite", llm=MyLLM())
```

Run the test suite:

```shell
./run_tests.sh all
```

- PageIndex — vectorless, reasoning-based RAG that builds hierarchical tree indexes from long documents
- ChatIndex — tree indexing for long conversations, enabling reasoning-based retrieval over chat histories
- AgentFS — filesystem for AI agents
Licensed under Apache 2.0.
© 2026 Vectify AI