feat: support deferred embedding generation by andinux · Pull Request #12 · sqliteai/sqlite-memory

andinux · 2026-06-10T05:12:25Z

Why

UI applications (e.g. the SQLite Cloud dashboard) need uploads to appear instantly. Today every memory_add_* call computes embeddings inline, so every add waits on the embedding engine, and a configured model is required just to store a file. This PR decouples the two steps: store content immediately, generate embeddings later from a background process, and let the UI track progress.

Implementation choices

No schema change. "Pending" is a derived state: a dbmem_content row with a non-empty stored value and no dbmem_vault rows. We deliberately avoided adding a status/n_chunks column to dbmem_content because that table is the cloudsync-replicated one (cloudsync_init('dbmem_content')), while "has this content been embedded" is per-node local state. A replicated column would propagate node A's "embedded" claim to node B, whose local vault is empty, and would require a coordinated schema migration across replicas. The local-only dbmem_vault table is the correct home for this state, and memory_reindex()'s existing has_vault check already understands it.
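As an illustration of the derived state, here is a minimal sketch using Python's built-in sqlite3 module. Only the table names and the "non-empty value, no vault rows" rule come from the description above; the column names (hash, path, value, seq, embedding) are simplified assumptions and not the extension's actual schema.

```python
import sqlite3

# Simplified stand-ins for the real tables; column names are assumptions.
SCHEMA = """
CREATE TABLE dbmem_content (hash TEXT PRIMARY KEY, path TEXT, value TEXT);
CREATE TABLE dbmem_vault   (hash TEXT, seq INTEGER, embedding BLOB);
"""

# "Pending" = content with a non-empty stored value and no vault rows.
PENDING_SQL = """
SELECT c.path
FROM dbmem_content AS c
WHERE c.value IS NOT NULL AND length(c.value) > 0
  AND NOT EXISTS (SELECT 1 FROM dbmem_vault AS v WHERE v.hash = c.hash)
ORDER BY c.path
"""

def pending_paths(conn):
    """Return the paths of content rows that have no vault rows yet."""
    return [row[0] for row in conn.execute(PENDING_SQL)]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO dbmem_content VALUES (?, ?, ?)",
    [
        ("h1", "docs/api.md", "# API"),      # already embedded (vault row below)
        ("h2", "docs/guide.md", "# Guide"),  # stored but not embedded: pending
        ("h3", "empty.md", ""),              # empty value: never pending
    ],
)
conn.execute("INSERT INTO dbmem_vault VALUES ('h1', 0, x'00000000')")

print(pending_paths(conn))
```

Because the state is computed from the local vault, two replicas holding the same dbmem_content rows can legitimately disagree about what is pending.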

Deferred add is a skip, not a new pipeline. With defer_embeddings=1, dbmem_process_buffer performs everything it does today (dedup, stale-path cleanup, content insert, SAVEPOINT) and only skips the chunk/embed/FTS step. memory_embed_pending() reuses the same row-processing logic as memory_reindex() (one SAVEPOINT per file, hash-consistency rekeying), so a file is always either fully indexed or untouched, and an interrupted worker can simply be restarted.
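The "fully indexed or untouched" guarantee follows from wrapping each file in its own SAVEPOINT. The sketch below shows that pattern in isolation with Python's sqlite3 module. It is not the extension's C code: the vault table layout, the embed_file helper, and the flaky_embed callback are all hypothetical.

```python
import sqlite3

def embed_file(conn, file_hash, chunks, embed):
    """Index one file atomically: either every chunk row is written or none."""
    conn.execute("SAVEPOINT embed_file")
    try:
        for seq, chunk in enumerate(chunks):
            conn.execute(
                "INSERT INTO vault (hash, seq, embedding) VALUES (?, ?, ?)",
                (file_hash, seq, embed(chunk)),
            )
    except Exception:
        # Undo the partial work for this file only, then close the savepoint.
        conn.execute("ROLLBACK TO embed_file")
        conn.execute("RELEASE embed_file")
        return False
    conn.execute("RELEASE embed_file")
    return True

def flaky_embed(chunk):
    """Hypothetical embedding callback that fails on one specific chunk."""
    if chunk == "boom":
        raise RuntimeError("embedding engine unavailable")
    return bytes(4)

# isolation_level=None: no implicit transactions, savepoints are managed by hand.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE vault (hash TEXT, seq INTEGER, embedding BLOB)")

ok = embed_file(conn, "h1", ["a", "b"], flaky_embed)
failed = embed_file(conn, "h2", ["a", "boom"], flaky_embed)
rows = conn.execute("SELECT hash, count(*) FROM vault GROUP BY hash").fetchall()
print(ok, failed, rows)
```

The file that failed midway leaves no rows behind, so under the derived-state rule it is still pending and a restarted worker picks it up again.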

New API surface

memory_set_option('defer_embeddings', 1) — all memory_add_* functions store content without computing embeddings or FTS entries. No embedding model is required for deferred adds. Requires save_content=1 (rejected otherwise, since the content could never be embedded later). Deferred content is invisible to memory_search until embedded.
memory_embed_pending([limit]) — embeds up to limit pending rows (all when omitted), returns the number processed. Designed for background workers looping with a small batch size.
memory_pending_count() — number of rows still waiting for embeddings, for progress reporting.
memory_list_files() — file nodes now include "indexed": true|false so a UI can render per-file badges. Empty files and directory markers always report true.
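A background worker built on these functions might look like the sketch below. To keep it runnable without the extension, memory_embed_pending and memory_pending_count are replaced by stubs registered through sqlite3's create_function. The FakeBacklog class, the drain_pending helper, and the max_batches safety cap are assumptions made for illustration. With the real extension loaded, the stubs would not be registered.

```python
import sqlite3

def drain_pending(conn, batch_size=10, max_batches=1000):
    """Call memory_embed_pending(batch_size) until it reports 0 rows processed."""
    total = 0
    for _ in range(max_batches):  # safety cap so a misbehaving backend cannot spin forever
        (processed,) = conn.execute(
            "SELECT memory_embed_pending(?)", (batch_size,)
        ).fetchone()
        if processed == 0:
            break
        total += processed
    return total

class FakeBacklog:
    """Stand-in for the extension's pending queue (hypothetical, test-only)."""
    def __init__(self, n):
        self.remaining = n

    def embed_pending(self, limit):
        done = min(limit, self.remaining)
        self.remaining -= done
        return done

backlog = FakeBacklog(25)
conn = sqlite3.connect(":memory:")
# Stubs registered under the PR's function names; not the real implementations.
conn.create_function("memory_embed_pending", 1, backlog.embed_pending)
conn.create_function("memory_pending_count", 0, lambda: backlog.remaining)

embedded = drain_pending(conn, batch_size=10)
print(embedded)
```

A small batch size keeps each call short, which lets the UI observe intermediate values of memory_pending_count() between batches.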

Zero-chunk contents

Content whose parsing yields no chunks (e.g. whitespace-only text) now inserts a single sentinel row in dbmem_vault (zeroblob(0) embedding, n_tokens=0) marking it processed. Without it, such rows would look pending forever and the worker loop would never converge. sqlite-vector ≥ 0.9.80 skips NULL/undersized blobs in all scan paths, so search is unaffected. Side fix: memory_reindex() stops re-parsing these rows on every run, and a zero-chunk first add no longer persists dimension=0 to settings (which previously broke memory_search on reopened connections).
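The effect of the sentinel on convergence can be reproduced with stand-in tables. In this sketch the column layout is an assumption. Only the zeroblob(0) embedding and n_tokens=0 values come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the real tables; column names are assumptions.
conn.executescript("""
CREATE TABLE dbmem_content (hash TEXT PRIMARY KEY, value TEXT);
CREATE TABLE dbmem_vault   (hash TEXT, embedding BLOB, n_tokens INTEGER);
INSERT INTO dbmem_content VALUES ('ws', '   ');  -- whitespace-only: parses to zero chunks
""")

def pending_count(conn):
    """Count content rows with a non-empty value and no vault rows."""
    return conn.execute("""
        SELECT count(*) FROM dbmem_content AS c
        WHERE length(c.value) > 0
          AND NOT EXISTS (SELECT 1 FROM dbmem_vault AS v WHERE v.hash = c.hash)
    """).fetchone()[0]

before = pending_count(conn)

# Sentinel row: zero-length embedding, zero tokens, marks the content as processed.
conn.execute("INSERT INTO dbmem_vault VALUES ('ws', zeroblob(0), 0)")

after = pending_count(conn)
sentinel = conn.execute("SELECT length(embedding), n_tokens FROM dbmem_vault").fetchone()
print(before, after, sentinel)
```

Without the sentinel the whitespace-only row would be counted as pending on every pass, so a loop that waits for the count to reach 0 would never finish.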

UI flow

-- once per database
SELECT memory_set_option('defer_embeddings', 1);

-- on upload (instant, no model needed)
SELECT memory_add_content('docs/api.md', :content);
SELECT memory_list_files();          -- file appears with "indexed":false

-- background worker
SELECT memory_embed_pending(10);     -- repeat until it returns 0

-- UI polling
SELECT memory_pending_count();       -- progress = 1 - pending/total; refresh badges via memory_list_files()
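The progress formula in the last comment needs a guard for an empty database. The helper below is a small sketch of one way to do that. It assumes total means the number of stored content rows the UI is tracking, which the description does not define, and the name embedding_progress is hypothetical.

```python
def embedding_progress(pending, total):
    """Return progress in [0.0, 1.0] as 1 - pending/total.

    `total` is assumed to be the number of content rows being tracked;
    an empty database is reported as fully indexed.
    """
    if total <= 0:
        return 1.0
    pending = min(max(pending, 0), total)  # clamp in case the two counts were read at different times
    return 1.0 - pending / total

print(embedding_progress(3, 12))
```

Clamping matters because pending and total come from separate queries, so an upload landing between them could otherwise push the value outside the 0 to 1 range.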

Validations

Done:

Full unit test suite passes: 181/181 (make test DEFINES="-DTEST_SQLITE_EXTENSION")
New tests: deferred add stores content with no vault/FTS rows and needs no model; defer_embeddings + save_content=0 is rejected; memory_embed_pending batches correctly and converges to 0; sentinel row created for zero-chunk content (direct and deferred) with no embedding computed; indexed flag in memory_list_files(); zero-chunk first add does not persist dimension=0
Existing tests updated for the intended behavior changes (8 exact-JSON list_files assertions gained "indexed", whitespace-only test now expects the sentinel)
Extension builds cleanly (dist/memory.dylib)

Todo:

Manual vector_full_scan check with a sentinel row present, using a real sqlite-vector build (the unit-test harness does not load sqlite-vector) — expected: sentinel skipped, no error
End-to-end search-on-reopen check in the integration/sync test setup where the full extension stack is loaded
Confirm deployed sqlite-vector version is ≥ 0.9.80 everywhere the dashboard ships this feature

🤖 Generated with Claude Code

Add a defer_embeddings option that stores content without computing embeddings or FTS entries, so callers (e.g. a dashboard upload) can add files instantly without an embedding model and index them later from a background process.

- memory_set_option('defer_embeddings', 1): memory_add_* functions only store content in dbmem_content; requires save_content=1
- memory_embed_pending([limit]): embeds pending rows in batches, one SAVEPOINT per file, so an interrupted worker can be safely retried; rekeys rows whose stored hash no longer matches the current preserve_duplicate_paths scope
- memory_pending_count(): number of rows awaiting embeddings, for progress reporting
- memory_list_files(): file nodes now include an "indexed" boolean
- content parsing to zero chunks (e.g. whitespace-only) now inserts a zero-length sentinel row in dbmem_vault marking it processed, so it exits the pending state and memory_reindex stops re-parsing it (sqlite-vector >= 0.9.80 skips undersized blobs during scans)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

A parse producing no chunks (e.g. whitespace-only content) reached the dimension persistence block with ctx->dimension still 0, writing dimension=0 to dbmem_settings and latching dimension_saved. Later real embeddings then updated the dimension in memory only, so a reopened connection that only searches saw dimension=0 and reported that no content has been indexed.

Persist the dimension only when at least one real embedding was computed (chunks_added > 0), which guarantees ctx->dimension is set.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
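The latch behaviour described in this commit can be modelled in a few lines. This is a Python model of the logic only, not the C implementation: the Ctx class and finish_add function are hypothetical, while the names dimension, dimension_saved, and chunks_added mirror the commit message.

```python
class Ctx:
    """Hypothetical model of the connection context named in the commit message."""
    def __init__(self):
        self.dimension = 0
        self.dimension_saved = False
        self.settings = {}  # stands in for the dbmem_settings table

def finish_add(ctx, chunk_embeddings):
    """Record embeddings for one add and persist the dimension at most once."""
    chunks_added = 0
    for emb in chunk_embeddings:
        ctx.dimension = len(emb)
        chunks_added += 1
    # The fix: persist only after a real embedding, so dimension is never saved as 0.
    if not ctx.dimension_saved and chunks_added > 0:
        ctx.settings["dimension"] = ctx.dimension
        ctx.dimension_saved = True
    return chunks_added

ctx = Ctx()
finish_add(ctx, [])                 # zero-chunk first add: nothing persisted, latch stays open
finish_add(ctx, [[0.0] * 384])      # first real embedding: dimension persisted
print(ctx.settings)
```

Without the chunks_added > 0 condition, the first call would store 0 and set the latch, and the second call could no longer correct the persisted value.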

andinux and others added 2 commits June 9, 2026 22:45

andinux requested a review from marcobambini June 10, 2026 05:13

marcobambini approved these changes Jun 10, 2026

andinux wants to merge 2 commits into main from feat/deferred-embeddings
