Minimal memory runtime for AI agents: Hindsight's memory runtime, stripped from ~239,000 lines to ~2,000.
Lucid is a memory runtime for AI agents. It extracts facts from conversations, stores them in a SQLite knowledge base, retrieves the most relevant ones on demand, and synthesises answers from memory using multi-turn reasoning.
Three operations. That's the whole API:
```python
await retain(text, ...)   # extract + store facts
await recall(query, ...)  # retrieve relevant facts
await reflect(query, ...) # synthesise an answer from memory
```

No Postgres. No Redis. No Kafka. No Docker. Just SQLite and an LLM.
```mermaid
flowchart LR
    I["Conversation / text"] --> T["retain()"]
    T --> X["LLM fact extraction"]
    X --> S["SQLiteMemoryStore"]
    Q["Query"] --> R["recall()"]
    S --> R
    E["Embedder / reranker"] --> R
    R --> F["Relevant facts"]
    F --> Y["reflect()"]
    Y --> A["Answer from memory"]
```
Lucid keeps three operations: retain facts, recall the most relevant memory, then reflect over that memory to answer.
| Component | Hindsight | Lucid | Reduction |
|---|---|---|---|
| API surface (FastAPI) | ~8,000 LOC | — | 100% |
| Storage backends (Postgres, Redis, S3) | ~15,000 LOC | ~430 LOC (SQLite) | 97% |
| Cloud control plane | ~20,000 LOC | — | 100% |
| Ingestion pipelines | ~12,000 LOC | 479 LOC | 96% |
| Retrieval engine | ~10,000 LOC | 317 LOC | 97% |
| Synthesis layer | ~8,000 LOC | 310 LOC | 96% |
| Total | ~239,000 LOC | ~2,000 LOC | 99% |
What was cut: REST API, database migrations, multi-tenancy, cloud infrastructure, Kafka ingestion, S3 storage, Postgres/Redis backends, authentication, billing, admin dashboards.
What remains: the memory logic.
```bash
# No pip package yet — copy the folder directly
cp -r lucid/ your-project/lucid/

# Dependencies
pip install httpx   # for the built-in LLM client
```

```
lucid/
├── types.py       # Data model (Fact, Entity, Bank, Budget, etc.)
├── interfaces.py  # Protocol definitions (LLMClient, Embedder, MemoryStore, Reranker)
├── retain.py      # Fact extraction + storage pipeline
├── recall.py      # 4-strategy retrieval with RRF fusion
├── reflect.py     # Multi-turn synthesis over memory
└── store.py       # SQLite-backed MemoryStore
```
```python
import asyncio

from lucid import retain, recall, reflect, SQLiteMemoryStore, Budget


# Minimal implementations of the protocol interfaces
class MyLLM:
    async def complete(self, messages, tools=None, **kwargs):
        # Use OpenAI, Ollama, Anthropic, etc.
        ...


class MyEmbedder:
    async def embed(self, text: str) -> list[float]:
        # Use nomic-embed-text, text-embedding-3-small, etc.
        ...


async def main():
    store = SQLiteMemoryStore("memory.db")
    llm = MyLLM()
    embedder = MyEmbedder()
    bank_id = "user-alice"  # partition per user

    # --- RETAIN ---
    # Extract facts from a conversation turn and store them
    result = await retain(
        text="Alice mentioned she's allergic to shellfish and prefers morning meetings.",
        store=store,
        llm=llm,
        embedder=embedder,
        bank_id=bank_id,
    )
    print(f"Stored {len(result.facts)} facts, resolved {len(result.entities)} entities")

    # --- RECALL ---
    # Retrieve the most relevant facts for a query
    results = await recall(
        query="What are Alice's dietary restrictions?",
        store=store,
        embedder=embedder,
        bank_id=bank_id,
        budget=Budget(max_facts=5, max_tokens=800),
    )
    for fact in results.facts:
        print(f"  [{fact.score:.2f}] {fact.text}")

    # --- REFLECT ---
    # Synthesise an answer from memory (multi-turn reasoning)
    answer = await reflect(
        query="What should I know before scheduling a lunch with Alice?",
        store=store,
        llm=llm,
        embedder=embedder,
        bank_id=bank_id,
    )
    print(answer.response)


asyncio.run(main())
```

retain() extracts structured facts from free text using an LLM, resolves entities, embeds facts for vector search, and persists everything:
```
input text
    ↓
LLM extracts Fact objects (text, kind, type, entities, temporal markers)
    ↓
Resolve entities: match against existing entities in store
    ↓
Link facts to resolved entities (cross-retain continuity)
    ↓
Embed fact text for vector retrieval
    ↓
Save Facts, Entities, and MemoryLinks to SQLite
```
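As a rough illustration of the entity-resolution step, the matching could look like the sketch below. The store methods (`list_entities`, `save_entity`) and the import path are assumptions for illustration, not the actual Lucid API:

```python
# Illustrative entity resolution during retain(); store.list_entities and
# store.save_entity are assumed method names, not necessarily Lucid's real API.
import uuid

from lucid.types import Entity  # assumed import path


async def resolve_entity(store, bank_id: str, name: str, entity_type) -> Entity:
    """Reuse an existing entity with the same name in this bank, or create one."""
    for existing in await store.list_entities(bank_id):
        if existing.name.lower() == name.lower():
            return existing  # keeps facts about "Alice" linked across retain() calls
    entity = Entity(
        id=str(uuid.uuid4()),
        bank_id=bank_id,
        name=name,
        entity_type=entity_type,  # supplied by the extractor
        summary="",
        fact_ids=[],
    )
    await store.save_entity(entity)
    return entity
```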
Fact kinds:
- `EPISODIC` — things that happened ("Alice visited Paris")
- `SEMANTIC` — things that are true ("Alice is allergic to shellfish")
- `PROCEDURAL` — how to do something ("Alice prefers email over Slack")
Fact types:
`preference`, `belief`, `fact`, `instruction`, `event`, `identity`, `relationship`
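As a sketch, the taxonomy maps naturally onto two enums; the real definitions live in types.py and may differ in detail:

```python
# Sketch of the fact taxonomy above; the actual enums in lucid/types.py may differ.
from enum import Enum


class FactKind(str, Enum):
    EPISODIC = "episodic"      # things that happened
    SEMANTIC = "semantic"      # things that are true
    PROCEDURAL = "procedural"  # how to do something


class FactType(str, Enum):
    PREFERENCE = "preference"
    BELIEF = "belief"
    FACT = "fact"
    INSTRUCTION = "instruction"
    EVENT = "event"
    IDENTITY = "identity"
    RELATIONSHIP = "relationship"
```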
recall() fuses four retrieval strategies with Reciprocal Rank Fusion (RRF):
| Strategy | What it does |
|---|---|
| Semantic vector search | Cosine similarity on embedded fact text |
| Keyword search | BM25-style text matching |
| Entity lookup | Facts linked to entities matching query |
| Recency boost | Recent facts ranked higher |
Results are filtered by a Budget (max facts, max tokens) to keep context windows manageable.
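For readers unfamiliar with RRF, here is a minimal sketch of the fusion step: each strategy contributes 1 / (k + rank) per fact, and the summed scores decide the final order (k = 60 is the conventional constant; Lucid's recall.py may weight things differently):

```python
# Illustrative Reciprocal Rank Fusion over ranked lists of fact ids.
# Not necessarily the exact constants or weighting used in recall.py.
from collections import defaultdict


def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Each list contributes 1 / (k + rank) per fact; higher total wins."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, fact_id in enumerate(ranking, start=1):
            scores[fact_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Fuse the four strategies' rankings (example ids)
fused = rrf_fuse([
    ["f3", "f1", "f7"],  # semantic vector search
    ["f1", "f9"],        # keyword search
    ["f7", "f3"],        # entity lookup
    ["f9", "f3", "f1"],  # recency
])
```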
reflect() runs multi-turn agentic synthesis: the LLM reasons over recalled facts to answer a question, calling back into memory for follow-up retrievals when needed:
```
query
    ↓
recall() → top-k facts
    ↓
LLM synthesises answer (with optional tool call: search_memory)
    ↓
If LLM calls search_memory → recall() again with new query
    ↓
Continue until answer is complete or max_turns reached
    ↓
Return ReflectResult(response, facts_used, turns)
```
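A condensed sketch of that loop. The tool schema and the shape of the LLM reply (`tool_calls`, `content`) are assumptions here, not Lucid's actual interfaces:

```python
# Sketch of a reflect()-style loop. The reply shape (reply.tool_calls, reply.content)
# and the search_memory tool schema are illustrative assumptions.
from lucid import recall

SEARCH_MEMORY_TOOL = {
    "name": "search_memory",
    "description": "Retrieve more facts from memory",
    "parameters": {"query": "string"},
}


async def reflect_sketch(query, store, llm, embedder, bank_id, max_turns: int = 3):
    recalled = await recall(query=query, store=store, embedder=embedder, bank_id=bank_id)
    memory = "\n".join(f.text for f in recalled.facts)
    messages = [
        {"role": "system", "content": "Answer using only the memory provided."},
        {"role": "user", "content": f"Memory:\n{memory}\n\nQuestion: {query}"},
    ]
    reply = None
    for _ in range(max_turns):
        reply = await llm.complete(messages, tools=[SEARCH_MEMORY_TOOL])
        if not getattr(reply, "tool_calls", None):
            break  # the model answered directly; stop retrieving
        # Follow-up retrieval requested via the search_memory tool
        follow_up_query = reply.tool_calls[0]["query"]
        more = await recall(query=follow_up_query, store=store, embedder=embedder, bank_id=bank_id)
        messages.append({"role": "user",
                         "content": "Additional memory:\n" + "\n".join(f.text for f in more.facts)})
    return reply.content
```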
```python
@dataclass
class Fact:
    id: str
    bank_id: str              # partition key
    text: str                 # the fact as a natural language sentence
    kind: FactKind            # EPISODIC | SEMANTIC | PROCEDURAL
    fact_type: FactType       # preference | belief | event | instruction | ...
    embedding: list[float]
    created_at: str
    entity_ids: list[str]


@dataclass
class Entity:
    id: str
    bank_id: str
    name: str
    entity_type: EntityType   # person | place | organization | concept | other
    summary: str
    fact_ids: list[str]


@dataclass
class Budget:
    max_facts: int = 10
    max_tokens: int = 2000
```

The SQLite store keeps four tables:

```
facts         -- extracted facts with embeddings
entities      -- resolved named entities
memory_links  -- fact ↔ entity associations
observations  -- raw input log (provenance)
```

WAL mode enabled. Vector similarity computed in Python — suitable for up to ~50k facts. Swap the MemoryStore protocol for a vector DB backend when you need more scale.
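Because similarity is computed in Python rather than inside SQLite, the vector-search strategy is effectively a brute-force scan over stored embeddings. A minimal sketch of what that computation looks like (illustrative; not necessarily the exact code in store.py):

```python
# Illustrative brute-force cosine similarity over stored fact embeddings.
# Linear in the number of facts, which is why ~50k facts is a practical ceiling.
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_embedding: list[float], facts, k: int = 10):
    """Return the k facts whose embeddings are most similar to the query."""
    return sorted(facts, key=lambda f: cosine(query_embedding, f.embedding), reverse=True)[:k]
```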
Every component is a protocol — swap in your preferred implementation:
```python
from lucid.interfaces import Embedder, Reranker, MemoryStore


class ChromaStore(MemoryStore):
    """Use ChromaDB as the backend instead of SQLite."""
    async def save_fact(self, fact): ...
    async def search_by_embedding(self, embedding, bank_id, limit): ...
    # ... implement the other methods


class CohereReranker(Reranker):
    """Re-rank results with Cohere's rerank endpoint."""
    async def rerank(self, query, facts, limit): ...
```

```bash
# Unit tests (no LLM required)
python3 -m pytest tests/ -q --ignore=tests/test_e2e.py

# End-to-end test (requires LLM + embedder)
OPENAI_API_KEY=sk-... python3 -m pytest tests/test_e2e.py -v -s
```

113 tests cover fact extraction, entity resolution, recall strategies, RRF fusion, synthesis, and the SQLite store.
Lucid is a targeted extraction of Hindsight's memory kernel:
- REST API (FastAPI + auth + rate limiting) — not included
- Postgres + Redis backends — replaced with a single SQLite store
- S3 / object storage — removed
- Kafka ingestion pipeline — removed
- Multi-tenancy + billing — removed (use `bank_id` for partitioning)
- Cloud control plane — removed
- Helm charts / Docker Compose — removed
- TypeScript client SDK — removed (Python only)
The fact extraction logic, entity resolution, four-strategy retrieval, RRF fusion, and multi-turn reflection are preserved.
Lucid pairs naturally with:
- NanoLetta — cognitive agent loop (Letta → 1.9k LOC). Wire Lucid in via the memory tool interface.
- Temporal — temporal knowledge graph (Graphiti → 2.8k LOC). Knows when facts changed.
- Python 3.11+
- `httpx` (for the built-in LLM client — omit if you bring your own)
- Any OpenAI-compatible LLM endpoint
- Any embedding function returning `list[float]`
MIT. See LICENSE.
The memory architecture, fact taxonomy, and retrieval design are inspired by Hindsight by Vectorize AI (MIT). Lucid is an independent extraction — not affiliated with Vectorize AI.
Extracted from Hindsight's memory-runtime design. Not tracking upstream; this is a stable standalone extraction, not a rolling fork.