feat: scheduled ingest with deduplication #97
Open
Shriiii01 wants to merge 108 commits into ekailabs:main from
Conversation
Shriiii01 (Contributor) commented on Mar 3, 2026
- Pass deduplicate through /v1/ingest to store.ingest() and Memory.add()
- Add conversation log (JSONL) on each /v1/chat/completions, no live ingest
- Add scripts/ingest-from-log.mjs: checkpoint-based, rate-limited 1 req/s
- Config: CONVERSATION_LOG_PATH, INGEST_CHECKPOINT_PATH, MEMORY_INGEST_URL
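The checkpointed, rate-limited replay described above can be sketched roughly as follows. This is not the code in scripts/ingest-from-log.mjs: `ingestFromLog`, `SendFn`, and the "checkpoint is a count of processed lines" convention are all assumptions made for illustration, and the log is passed in as an array of lines so the sketch needs no file access.

```typescript
// Hypothetical sketch; names and checkpoint format are assumptions,
// not taken from scripts/ingest-from-log.mjs.
type SendFn = (entry: unknown) => Promise<void>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Replays JSONL lines starting at `checkpoint` (the number of lines already
 * processed) and returns the new checkpoint. It stops at the first line that
 * fails to parse or send, so that line is retried on the next run.
 */
async function ingestFromLog(
  lines: string[],
  checkpoint: number,
  send: SendFn,
  intervalMs: number = 1000, // 1 request per second, as in the PR description
): Promise<number> {
  let next = checkpoint;
  for (let i = checkpoint; i < lines.length; i++) {
    const line = (lines[i] ?? "").trim();
    if (line !== "") {
      try {
        await send(JSON.parse(line));
      } catch {
        return next; // checkpoint stays at the failed line
      }
      await sleep(intervalMs); // simple fixed-delay rate limit
    }
    next = i + 1; // blank lines are skipped but still counted
  }
  return next;
}
```

In a real script, `lines` would presumably come from the file at CONVERSATION_LOG_PATH, the returned value would be written to INGEST_CHECKPOINT_PATH, and `send` would POST to MEMORY_INGEST_URL with the deduplicate flag set.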
- Add OllamaProvider class with OpenAI-compatible API support
- Register Ollama in ProviderRegistry with model selection rules
- Add Ollama configuration to AppConfig (baseUrl, apiKey, enabled)
- Add Ollama to chat_completions_providers_v1.json catalog with 16 popular models
- Add ollama.yaml pricing file (free/local models)
- Update ProviderName type to include 'ollama'
- Add OLLAMA_BASE_URL and OLLAMA_API_KEY to .env.example

Ollama runs models locally and exposes an OpenAI-compatible API at http://localhost:11434/v1 by default. Users can configure a custom base URL via the OLLAMA_BASE_URL environment variable.
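A minimal sketch of how the base URL override and an OpenAI-compatible chat request might fit together. `resolveOllamaConfig` and `buildChatRequest` are hypothetical helpers, not the PR's OllamaProvider; only the default URL, the two environment variable names, and the three config fields come from the commit message. The `enabled: true` default is an assumption.

```typescript
// Hypothetical helpers for illustration; not the PR's OllamaProvider.
interface OllamaConfig {
  baseUrl: string;
  apiKey: string | undefined;
  enabled: boolean;
}

function resolveOllamaConfig(env: Record<string, string | undefined>): OllamaConfig {
  const baseUrl = env["OLLAMA_BASE_URL"] ?? "http://localhost:11434/v1";
  return {
    baseUrl: baseUrl.replace(/\/+$/, ""), // drop trailing slashes
    apiKey: env["OLLAMA_API_KEY"],
    enabled: true, // assumed default; the PR does not say how this is derived
  };
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds an OpenAI-style chat completion request against the configured base URL.
function buildChatRequest(cfg: OllamaConfig, model: string, prompt: string): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (cfg.apiKey) {
    headers["Authorization"] = `Bearer ${cfg.apiKey}`;
  }
  return {
    url: `${cfg.baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
    },
  };
}
```

The result can be passed to `fetch(req.url, req.init)`. Keeping request construction separate from the network call makes the override logic easy to check without a running Ollama instance.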
- Added Ollama to responses_providers_v1.json catalog
- Created OllamaResponsesPassthrough class implementing Responses API
- Registered Ollama in responses-passthrough-registry.ts

Ollama supports the OpenResponses API specification at /v1/responses endpoint, providing future-proof support as /chat/completions may be deprecated.
…ider feat: Add Ollama provider support
# Conflicts:
#	gateway/src/infrastructure/config/app-config.ts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ntegration Add OpenRouter integration with unified launcher and Docker fixes
Changed 'wait -n' to 'wait' to keep the container running indefinitely instead of exiting when the first service exits. This allows all services (gateway, dashboard, memory, openrouter) to continue running. Fixes issue where container would restart repeatedly with exit code 0.
- Document all 4 services and their ports (gateway, dashboard, memory, openrouter)
- Add Docker service control via ENABLE_* environment variables
- Clarify OpenRouter integration service runs on port 4010 (not 4006)
- Add Docker Compose section with service management instructions
- Document Docker restart behavior and service lifecycle
…ntegration Fix Docker fullstack entrypoint and document service configuration
Changed video URL from hZC1Y_dWdhI to sLg9YmYtg64 in:
- README.md
- docs/ROFL_DEPLOYMENT.md

Uses GitBook-compatible embed syntax.
…ntegration Update YouTube demo video URLs and GitBook embeds
The sectorColors/sectorDescriptions maps and MemorySectorSummary type only had 3 sectors (episodic, semantic, procedural) but the memory service also returns reflective memories, causing an undefined .includes() crash in MemoryStrength.
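One way to keep such maps in sync with the sector list is to key them by a union type, so the compiler rejects a map that omits a sector. The sketch below is illustrative only: `MemorySector`, `describeSector`, and the description strings are placeholders, not the dashboard's actual code.

```typescript
// Illustrative sketch; type name, helper, and strings are placeholders.
type MemorySector = "episodic" | "semantic" | "procedural" | "reflective";

// Record<MemorySector, string> fails to compile if any sector key is missing.
const sectorDescriptions: Record<MemorySector, string> = {
  episodic: "placeholder: events and experiences",
  semantic: "placeholder: facts and relations",
  procedural: "placeholder: how-to knowledge",
  reflective: "placeholder: insights about past memories",
};

// Runtime guard for sector names the UI has not been taught yet.
function describeSector(sector: string): string {
  if (Object.prototype.hasOwnProperty.call(sectorDescriptions, sector)) {
    return sectorDescriptions[sector as MemorySector];
  }
  return "Unknown sector";
}
```

The type check catches a missing key at build time, while the runtime guard covers the case where the memory service starts returning a sector before the dashboard is updated.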
Agents can now set a relevancePrompt that gates ingest — an LLM checks if incoming content matches the agent's scope before extraction/embedding. Irrelevant content is rejected early with a reason. Adds GET/PUT single agent endpoints and updates README with new API docs and flow diagram.
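The gate described above might look something like this. It is a sketch under assumptions: `gatedIngest`, `RelevanceCheck`, and the result shape are hypothetical, and the LLM call is injected as a function so no particular model API is implied.

```typescript
// Hypothetical sketch of a relevance gate; not the PR's implementation.
interface AgentScope {
  relevancePrompt?: string;
}

interface RelevanceVerdict {
  relevant: boolean;
  reason: string;
}

// Stand-in for the LLM call that judges content against the agent's scope.
type RelevanceCheck = (prompt: string, content: string) => Promise<RelevanceVerdict>;

type IngestResult = { status: "stored" } | { status: "rejected"; reason: string };

async function gatedIngest(
  agent: AgentScope,
  content: string,
  check: RelevanceCheck,
  store: (content: string) => Promise<void>, // extraction + embedding would happen here
): Promise<IngestResult> {
  if (agent.relevancePrompt) {
    const verdict = await check(agent.relevancePrompt, content);
    if (!verdict.relevant) {
      // Reject before any extraction or embedding work is done.
      return { status: "rejected", reason: verdict.reason };
    }
  }
  await store(content);
  return { status: "stored" };
}
```

Agents without a relevancePrompt skip the check entirely, so the extra LLM call is only paid by agents that opt in.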
Replace silent INSERT OR IGNORE with a strict existence check that throws agent_not_found. Only the default agent is auto-created at init via a private upsertDefaultAgent(). Add routeError helper to router so all catch blocks return 404 on agent_not_found instead of 500.
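A rough sketch of the pattern: a strict existence check that throws, and a single helper that maps that error to a 404. The real `routeError` presumably works with the router's response object. This version returns a plain value instead, and `requireAgent` is a hypothetical name.

```typescript
// Hypothetical sketch; function shapes are assumptions, not the PR's code.
function requireAgent(knownAgents: Set<string>, agentId: string): void {
  // Strict check: unknown agents are an error instead of being silently created.
  if (!knownAgents.has(agentId)) {
    throw new Error("agent_not_found");
  }
}

interface ErrorResponse {
  status: number;
  body: { error: string };
}

// Shared by every catch block so agent_not_found is a 404, everything else a 500.
function routeError(err: unknown): ErrorResponse {
  const message = err instanceof Error ? err.message : String(err);
  if (message === "agent_not_found") {
    return { status: 404, body: { error: message } };
  }
  return { status: 500, body: { error: "internal_error" } };
}
```

Centralizing the mapping keeps individual route handlers from drifting apart on which errors count as client errors.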
- New /agents page: agent cards with soul/relevance prompts, per-agent stats (users, episodic, semantic, procedural), edit/create/delete modals
- api.ts: add soulMd + relevancePrompt to getAgents() type; add createAgent() and updateAgent() calls
- layout.tsx: add global top nav with Memory Vault and Agents links
- memory/page.tsx: offset sticky header to top-11 to clear global nav
Dedup, relevance gate, and agents dashboard
Generate standalone package-lock.json files for memory and integrations/openrouter, and switch all runtime stages from npm install --omit=dev to npm ci --omit=dev. This eliminates npm registry calls during Docker builds, fixing intermittent 403 rate-limit failures in CI.
Add sqlite-vec extension to @ekai/memory, replacing the 200-row linear scan + JS cosine similarity with proper ANN indexing via vec0 virtual tables with cosine distance metric.

- Add sqlite-vec dependency, load extension on construction
- Create vec0 virtual tables (memory_vec, procedural_vec, semantic_vec, reflective_vec) lazily on first embedding
- Insert into vec tables alongside main tables on write path
- Replace getCandidatesForSector 200-row scan with two-step KNN: ANN query on vec table, then filter via main table
- Replace findDuplicate linear scans with vec KNN queries
- Update scoreRowPBWM to accept precomputed similarity
- Make embedding optional on record types (query results no longer carry full embedding arrays)
- Stop selecting embedding in semantic graph traversal queries
- Clean up vec tables on delete operations
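The two-step lookup can be illustrated without a database. In the sketch below, `hits` stands in for the rows a KNN query on a vec table would return (nearest first), and the main table is modeled as a Map. `knnThenFilter`, the row shape, and the per-user filter are assumptions; the conversion `similarity = 1 - distance` relies on cosine distance being defined as one minus cosine similarity.

```typescript
// In-memory sketch of "ANN query, then filter via main table".
// Names and row shape are hypothetical; no sqlite-vec calls are made here.
interface VecHit {
  rowid: number;
  distance: number; // cosine distance from the ANN step
}

interface MemoryRow {
  id: number;
  userId: string;
  content: string;
}

type ScoredRow = MemoryRow & { similarity: number };

function knnThenFilter(
  hits: VecHit[], // assumed sorted by ascending distance
  mainTable: Map<number, MemoryRow>,
  userId: string,
  limit: number,
): ScoredRow[] {
  const results: ScoredRow[] = [];
  for (const hit of hits) {
    const row = mainTable.get(hit.rowid);
    // Step 2: drop hits whose main-table row is missing or out of scope.
    if (!row || row.userId !== userId) continue;
    // Similarity is derived from the precomputed distance, not from the embedding.
    results.push({ ...row, similarity: 1 - hit.distance });
    if (results.length === limit) break;
  }
  return results;
}
```

Because filtering happens after the ANN step, the KNN query generally needs to request more than `limit` neighbors so that enough rows survive the filter.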
Embedding is always present on write and never read on query path (similarity is precomputed by sqlite-vec). Graph traversal methods now return Omit<SemanticMemoryRecord, 'embedding'> since they are structural queries that don't select the embedding column.
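A small type-level sketch of the distinction. Only `SemanticMemoryRecord` and the `Omit<…, 'embedding'>` return type come from the commit message; the record's other fields and the `toGraphNode` helper are assumptions for illustration.

```typescript
// Field names other than `embedding` are assumed for illustration.
interface SemanticMemoryRecord {
  id: string;
  subject: string;
  predicate: string;
  object: string;
  embedding: number[]; // always present on the write path
}

// What structural (graph traversal) queries return: everything but the vector.
type SemanticGraphNode = Omit<SemanticMemoryRecord, "embedding">;

// Strips the embedding from a full record, e.g. before returning it from a traversal.
function toGraphNode(record: SemanticMemoryRecord): SemanticGraphNode {
  const { embedding: _embedding, ...rest } = record;
  return rest;
}
```

Compared with marking `embedding` optional, the `Omit` type lets the compiler flag any traversal code that still tries to read the vector.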
Add sqlite-vec ANN vector search to @ekai/memory
The standalone memory/package-lock.json was stale after sqlite-vec was added to package.json, causing npm ci to fail in the Docker build.
Regenerate memory lockfile to include sqlite-vec dependency
Lifecycle event logger that registers all 13 OpenClaw hooks via api.registerHook() and appends JSONL entries with safe serialization. Published to npm as @ekai/contexto.
Extract event storage from openclaw plugin into @ekai/store workspace with EventWriter (normalization, safe serialization, per-session JSONL files) and EventReader (session listing, reconstruction with tool call pairing and userId attribution). Includes path-traversal protection, chronological event ordering, and 48 tests.
- Fix durationMs: 0 being overwritten by computed value (falsy check → else-if)
- Sync runtime configSchema with manifest (declare dataDir property)
- Reorder root build: install first for clean-env safety, store before dependents
- Remove redundant double-resolve in reconstructSession
- Fix misleading test name for raw ID storage behavior
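The durationMs fix in the first bullet is an instance of a common falsy-check pitfall: `0` is a valid duration but is falsy in JavaScript. The sketch below contrasts the two shapes. `startedAt` and `endedAt` are assumed field names, and neither function is the store's actual code.

```typescript
// Illustrative only; field names other than durationMs are assumptions.
interface RawEvent {
  durationMs?: number;
  startedAt?: number;
  endedAt?: number;
}

function computedDuration(e: RawEvent): number | undefined {
  return e.startedAt !== undefined && e.endedAt !== undefined
    ? e.endedAt - e.startedAt
    : undefined;
}

// Buggy shape: an explicit durationMs of 0 is falsy, so it gets overwritten.
function resolveDurationBuggy(e: RawEvent): number | undefined {
  return e.durationMs || computedDuration(e);
}

// Fixed shape: use the explicit value whenever one is present, else compute.
function resolveDuration(e: RawEvent): number | undefined {
  if (typeof e.durationMs === "number") {
    return e.durationMs;
  } else if (e.startedAt !== undefined && e.endedAt !== undefined) {
    return e.endedAt - e.startedAt;
  }
  return undefined;
}
```

For an event with `durationMs: 0`, `startedAt: 100`, and `endedAt: 250`, the buggy version returns 150 while the fixed version returns 0.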
- Simplify rawAgentId/rawSessionId storage: remove dead !== check since sanitizeId always appends a hash suffix, just check if input is present
- Add _error optional field to StoreEvent and AppendInput interfaces to reflect the serialization-failure fallback that appears in JSONL output
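To see why the `!==` check was dead code, consider a sanitizer that always appends a hash suffix: its output can never equal its input. The sketch below is not the store's `sanitizeId`. The character whitelist, length cap, and FNV-1a-style hash over UTF-16 code units are all assumptions chosen to keep the example dependency-free.

```typescript
// Hypothetical sanitizer; the real sanitizeId may use a different hash and rules.
function fnv1aHex(input: string): string {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // FNV prime, kept as unsigned 32-bit
  }
  return hash.toString(16).padStart(8, "0");
}

function sanitizeId(raw: string): string {
  // Replace anything that could act as a path separator or dot segment.
  const safe = raw.replace(/[^A-Za-z0-9_-]/g, "_").slice(0, 64);
  // The suffix is derived from the raw input, so inputs that collapse to the
  // same safe prefix still get distinct file names.
  return `${safe}-${fnv1aHex(raw)}`;
}
```

Since the sanitized ID always differs from the raw one, storing the raw ID only needs a presence check, which matches the simplification in the first bullet.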
Add OpenClaw plugin and JSONL event store
- Pass deduplicate through /v1/ingest to store.ingest() and Memory.add()
- Add conversation log (JSONL) on each /v1/chat/completions, no live ingest
- Add scripts/ingest-from-log.mjs: checkpoint-based, rate-limited 1 req/s
- Config: CONVERSATION_LOG_PATH, INGEST_CHECKPOINT_PATH, MEMORY_INGEST_URL
Contributor
Author
Fixes #90. Happy to adjust based on feedback.
link to new work https://github.com/ekailabs/contexto
- Resolve README.md: keep archived notice and project description
- Resolve README.md and .env.example
- Keep scheduled ingest env vars, drop gateway from merge result
99e109b to dd08087
Made-with: Cursor