feat: support deferred embedding generation #12
Open
andinux wants to merge 2 commits into
Add a defer_embeddings option that stores content without computing
embeddings or FTS entries, so callers (e.g. a dashboard upload) can add
files instantly without an embedding model and index them later from a
background process.
- memory_set_option('defer_embeddings', 1): memory_add_* functions only
store content in dbmem_content; requires save_content=1
- memory_embed_pending([limit]): embeds pending rows in batches, one
SAVEPOINT per file, so an interrupted worker can be safely retried;
rekeys rows whose stored hash no longer matches the current
preserve_duplicate_paths scope
- memory_pending_count(): number of rows awaiting embeddings, for
progress reporting
- memory_list_files(): file nodes now include an "indexed" boolean
- content parsing to zero chunks (e.g. whitespace-only) now inserts a
zero-length sentinel row in dbmem_vault marking it processed, so it
exits the pending state and memory_reindex stops re-parsing it
(sqlite-vector >= 0.9.80 skips undersized blobs during scans)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
A parse producing no chunks (e.g. whitespace-only content) reached the dimension persistence block with ctx->dimension still 0, writing dimension=0 to dbmem_settings and latching dimension_saved. Later real embeddings then updated the dimension in memory only, so a reopened connection that only searches saw dimension=0 and reported that no content has been indexed.

Persist the dimension only when at least one real embedding was computed (chunks_added > 0), which guarantees ctx->dimension is set.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
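To make the failure mode in this commit message concrete, here is a toy Python model of the latch. It is not the extension's C code: the `Ctx` class, the `settings` dict (standing in for dbmem_settings), and the `guard` flag are all invented for illustration, and only the names dimension, dimension_saved, and chunks_added come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Ctx:
    dimension: int = 0             # stays 0 until a real embedding is computed
    dimension_saved: bool = False  # latch: the dimension is persisted at most once


def process(ctx, settings, embeddings, guard):
    """Toy version of one add; `embeddings` holds one vector per parsed chunk."""
    chunks_added = 0
    for vec in embeddings:
        ctx.dimension = len(vec)
        chunks_added += 1
    if ctx.dimension_saved:
        return  # already latched: later dimensions only live in memory
    if guard and chunks_added == 0:
        return  # the fix: no real embedding yet, so do not persist anything
    settings["dimension"] = ctx.dimension
    ctx.dimension_saved = True


def run(guard):
    """Zero-chunk add first, then a real add; return the persisted dimension."""
    ctx, settings = Ctx(), {}
    process(ctx, settings, [], guard)                 # whitespace-only content
    process(ctx, settings, [[0.1, 0.2, 0.3]], guard)  # real content, 3 dimensions
    return settings.get("dimension")
```

In this model, `run(False)` leaves the persisted dimension at 0 even after a real embedding, while `run(True)` persists it only once a real embedding exists.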
marcobambini
approved these changes
Jun 10, 2026
Why
UI applications (e.g. the SQLite Cloud dashboard) need uploads to appear instantly. Today every memory_add_* call computes embeddings inline, so adding content blocks on the embedding engine and even requires a configured model just to store a file. This PR decouples the two steps: store content immediately, generate embeddings later from a background process, and let the UI track progress.

Implementation choices
No schema change. "Pending" is a derived state: a dbmem_content row with a non-empty stored value and no dbmem_vault rows. We deliberately avoided adding a status/n_chunks column to dbmem_content because that table is the cloudsync-replicated one (cloudsync_init('dbmem_content')), while "has this content been embedded" is per-node local state. A replicated column would propagate node A's "embedded" claim to node B, whose local vault is empty, and would require a coordinated schema migration across replicas. The local-only dbmem_vault table is the correct home for this state, and memory_reindex()'s existing has_vault check already understands it.

Deferred add is a skip, not a new pipeline. With defer_embeddings=1, dbmem_process_buffer performs everything it does today (dedup, stale-path cleanup, content insert, SAVEPOINT) and only skips the chunk/embed/FTS step. memory_embed_pending() reuses the same row-processing logic as memory_reindex() (one SAVEPOINT per file, hash-consistency rekeying), so a file is always either fully indexed or untouched, and an interrupted worker can simply be restarted.

New API surface
- memory_set_option('defer_embeddings', 1): all memory_add_* functions store content without computing embeddings or FTS entries. No embedding model is required for deferred adds. Requires save_content=1 (rejected otherwise, since the content could never be embedded later). Deferred content is invisible to memory_search until embedded.
- memory_embed_pending([limit]): embeds up to limit pending rows (all when omitted) and returns the number processed. Designed for background workers looping with a small batch size.
- memory_pending_count(): number of rows still waiting for embeddings, for progress reporting.
- memory_list_files(): file nodes now include "indexed": true|false so a UI can render per-file badges. Empty files and directory markers always report true.

Zero-chunk contents
Content whose parsing yields no chunks (e.g. whitespace-only text) now inserts a single sentinel row in dbmem_vault (zeroblob(0) embedding, n_tokens=0) marking it processed. Without it, such rows would look pending forever and the worker loop would never converge. sqlite-vector ≥ 0.9.80 skips NULL/undersized blobs in all scan paths, so search is unaffected. Side fix: memory_reindex() stops re-parsing these rows on every run, and a zero-chunk first add no longer persists dimension=0 to settings (which previously broke memory_search on reopened connections).

UI flow
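As a rough illustration of the upload, worker, and badge flow, the following Python sketch simulates it with the standard sqlite3 module and no extension loaded. Everything here is a stand-in under stated assumptions: the two tables are reduced to hypothetical minimal columns (hash, path, and seq are assumed names; only value, embedding, and n_tokens appear in this PR), the chunker and embedder are toys, and the Python functions merely imitate the documented behavior of memory_pending_count(), memory_embed_pending([limit]), and the "indexed" flag of memory_list_files().

```python
import sqlite3

# "Pending" as a derived state: non-empty stored value and no vault rows.
PENDING = """
    length(c.value) > 0
    AND NOT EXISTS (SELECT 1 FROM dbmem_vault v WHERE v.hash = c.hash)
"""


def open_demo_db():
    # isolation_level=None keeps Python in autocommit mode so the
    # SAVEPOINT/RELEASE statements below are fully under our control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE dbmem_content (hash TEXT PRIMARY KEY, path TEXT, value TEXT);
        CREATE TABLE dbmem_vault (hash TEXT, seq INTEGER, embedding BLOB, n_tokens INTEGER);
    """)
    return conn


def add_deferred(conn, hash_, path, value):
    """Upload step: store content only, with no embedding work."""
    conn.execute("INSERT INTO dbmem_content VALUES (?, ?, ?)", (hash_, path, value))


def pending_count(conn):
    sql = f"SELECT count(*) FROM dbmem_content c WHERE {PENDING}"
    return conn.execute(sql).fetchone()[0]


def toy_embed(chunk):
    """Placeholder for a real embedding model: a fixed 4-byte blob."""
    return bytes(4)


def embed_pending(conn, limit=None):
    """Worker step: index up to `limit` pending rows, one SAVEPOINT per file."""
    sql = f"SELECT c.hash, c.value FROM dbmem_content c WHERE {PENDING} ORDER BY c.hash"
    args = ()
    if limit is not None:
        sql, args = sql + " LIMIT ?", (limit,)
    rows = conn.execute(sql, args).fetchall()
    for hash_, value in rows:
        # Toy chunker: blank-line separated paragraphs, whitespace-only ones dropped.
        chunks = [p for p in value.split("\n\n") if p.strip()]
        conn.execute("SAVEPOINT embed_file")
        try:
            for seq, chunk in enumerate(chunks):
                conn.execute("INSERT INTO dbmem_vault VALUES (?, ?, ?, ?)",
                             (hash_, seq, toy_embed(chunk), len(chunk.split())))
            if not chunks:
                # Zero-chunk content: sentinel row so it leaves the pending state.
                conn.execute("INSERT INTO dbmem_vault VALUES (?, 0, zeroblob(0), 0)",
                             (hash_,))
            conn.execute("RELEASE embed_file")
        except Exception:
            # Undo this file only, so it stays untouched and can be retried.
            conn.execute("ROLLBACK TO embed_file")
            conn.execute("RELEASE embed_file")
            raise
    return len(rows)


def list_files(conn):
    """Map each path to its indexed flag, e.g. for per-file UI badges."""
    sql = f"SELECT c.path, NOT ({PENDING}) FROM dbmem_content c ORDER BY c.path"
    return {path: bool(flag) for path, flag in conn.execute(sql)}


def drain(conn, batch_size=2):
    """Background worker loop: small batches until nothing is pending."""
    while pending_count(conn) > 0:
        if embed_pending(conn, batch_size) == 0:
            break  # defensive: do not spin if a batch made no progress
```

A dashboard would call something like `add_deferred` on upload, poll `pending_count` and `list_files` for progress and badges, and leave `drain` to a background process. Against the real extension, the worker would presumably issue the memory_embed_pending and memory_pending_count SQL functions instead of these Python stand-ins.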
Validations
Done:
- (make test DEFINES="-DTEST_SQLITE_EXTENSION")
- defer_embeddings + save_content=0 is rejected; memory_embed_pending batches correctly and converges to 0; sentinel row created for zero-chunk content (direct and deferred) with no embedding computed; indexed flag in memory_list_files(); zero-chunk first add does not persist dimension=0
- (list_files assertions gained "indexed", whitespace-only test now expects the sentinel)
- (dist/memory.dylib)

Todo:
- vector_full_scan check with a sentinel row present, using a real sqlite-vector build (the unit-test harness does not load sqlite-vector); expected: sentinel skipped, no error

🤖 Generated with Claude Code