[watcher] add VaultWatcher, scan_vault, and self-write suppression#20

Merged
Obsidian68 merged 1 commit into feat/integration from feat/file-watcher
May 3, 2026

Conversation

@Obsidian68
Owner

Summary

  • VaultWatcher class: watches ENGRAM_VAULT_PATH/engram/ recursively via watchfiles.awatch; handles create/modify/delete events on .md files; skips non-.md and unparseable files; suppresses self-writes via thread-safe register_write/unregister_write guarded by a threading.Lock
  • scan_vault() function: incremental mtime-based vault scan; falls back to a full reindex() on first start (when last_indexed is None); updates the state file with the current timestamp after the scan
  • VaultStore watcher integration: set_watcher() method; _atomic_write() registers self-writes with a 2 * watcher_debounce_ms delay
  • App lifecycle: scan_vault() runs before the watcher starts; VaultWatcher is created and started as a background task if watcher_enabled; the shutdown handler stops the watcher
  • Health endpoint: "watcher" component reports "healthy" | "disabled" | "unhealthy"; disabled components are excluded from the overall health status
  • CLI state file: _write_state() preserves the existing last_indexed across restarts; _remove_state() preserves last_indexed instead of deleting the file
  • Config stubs: watcher_enabled: bool = True, watcher_debounce_ms: int = 2000 (stubs; feat/v1.3-config provides the real fields)
  • Path-to-memory-ID tracking: the _path_to_memory_id dict in VaultWatcher ensures the correct index key for delete events when the filename stem differs from the frontmatter id
  • Version bump: 1.2.0 → 1.3.0
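
For readers unfamiliar with self-write suppression, the idea can be sketched roughly as follows. This is not the code from this PR: the class name SelfWriteRegistry, the ttl_seconds parameter, and the expiry-on-lookup behaviour are assumptions made for illustration. Only the register_write/unregister_write/is_self_write names, the threading.Lock, and the 2 * watcher_debounce_ms window come from the description above.

```python
import threading
import time


class SelfWriteRegistry:
    """Hypothetical helper: remembers paths the application itself is writing."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._expiry: dict[str, float] = {}  # path -> monotonic deadline

    def register_write(self, path: str, ttl_seconds: float) -> None:
        # Mark the path as "ours" until the deadline passes.
        with self._lock:
            self._expiry[path] = time.monotonic() + ttl_seconds

    def unregister_write(self, path: str) -> None:
        with self._lock:
            self._expiry.pop(path, None)

    def is_self_write(self, path: str) -> bool:
        now = time.monotonic()
        with self._lock:
            deadline = self._expiry.get(path)
            if deadline is None:
                return False
            if now > deadline:
                # Assumption: stale registrations are dropped lazily on lookup.
                del self._expiry[path]
                return False
            return True


watcher_debounce_ms = 2000  # default from the config stub
registry = SelfWriteRegistry()
registry.register_write("engram/note.md", ttl_seconds=2 * watcher_debounce_ms / 1000)
```

A watcher loop would then call is_self_write(path) for each event and skip re-indexing when it returns True.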

Known limitations

  • Watcher uses deprecated FastAPI on_event lifecycle hooks (lifespan refactor is future scope)
  • Blocking sync I/O in async watcher loop (file reads in _handle_add_or_update); acceptable for v1.3
  • Config fields are stubs — real implementation comes from feat/v1.3-config
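
One possible way to address the blocking-I/O limitation later is to push file reads onto a worker thread. The sketch below is an assumption about how that could look, not part of this PR; read_text_off_loop is a hypothetical helper name.

```python
import asyncio
from pathlib import Path


async def read_text_off_loop(path: Path) -> str:
    """Read a file in a worker thread so the event loop stays responsive."""
    # asyncio.to_thread (Python 3.9+) runs the blocking call in the default executor.
    return await asyncio.to_thread(path.read_text, encoding="utf-8")
```

Inside an async handler this would replace a direct path.read_text() call with await read_text_off_loop(path).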

Test plan

  • 311 tests pass, 90.37% coverage (≥90% threshold)
  • Ruff lint and format clean
  • Zero TODOs, FIXMEs, HACKs, XXXs
  • Zero dead code
  • Zero hardcoded secrets
  • Version consistency: __init__.py 1.3.0, health.py 1.3.0, pyproject.toml 1.3.0
  • VaultWatcher self-write suppression: register/unregister/is_self_write thread-safe
  • scan_vault incremental reindex with mtime comparison
  • scan_vault full reindex fallback when no last_indexed
  • State file last_indexed preservation across restarts
  • Health endpoint watcher component: healthy/disabled/unhealthy
  • Config defaults and env overrides for watcher_enabled/watcher_debounce_ms
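
The incremental scan covered by the test plan can be illustrated with a minimal sketch. The function name files_changed_since and its return convention (None meaning "do a full reindex") are hypothetical; the actual scan_vault() signature is not shown in this PR description.

```python
from __future__ import annotations

from datetime import datetime, timezone
from pathlib import Path


def files_changed_since(root: Path, last_indexed: datetime | None) -> list[Path] | None:
    """Return .md files modified after last_indexed, or None to request a full reindex."""
    if last_indexed is None:
        # First start: no previous scan recorded, so the caller should reindex everything.
        return None
    cutoff = last_indexed.timestamp()
    # Compare each file's mtime with the recorded timestamp; non-.md files are ignored.
    return sorted(p for p in root.rglob("*.md") if p.stat().st_mtime > cutoff)
```

After processing the returned files, the caller would store the current timestamp (for example datetime.now(timezone.utc)) as the new last_indexed.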

🤖 Generated with Claude Code

Add file watcher that monitors ENGRAM_VAULT_PATH/engram/ for create,
modify, and delete events on .md files. Includes incremental mtime-based
vault scan on startup, self-write suppression to avoid re-indexing Engram's
own writes, and health endpoint watcher component reporting.

- VaultWatcher class using watchfiles.awatch with path-to-memory-ID tracking
- scan_vault() with mtime comparison and full reindex fallback
- VaultStore.set_watcher() and _atomic_write() self-write suppression
- _write_state() preserves last_indexed across restarts
- _remove_state() preserves last_indexed instead of deleting the file
- Health endpoint reports watcher: healthy|disabled|unhealthy
- Config stubs: watcher_enabled, watcher_debounce_ms
- Version bump to 1.3.0

311 tests pass, 90.37% coverage, ruff lint/format clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a67188bcc8


Comment thread src/engram/cli/cli.py
Comment on lines +54 to +55
    if existing and "last_indexed" in existing:
        state_file.write_text(json.dumps({"last_indexed": existing["last_indexed"]}))

P0: Keep runtime keys when preserving state file

_remove_state() now rewrites the state file to only {"last_indexed": ...} when that field exists, but start() and stop() still unconditionally read state["pid"]. After any normal shutdown that preserves last_indexed, the next CLI invocation can crash with KeyError: 'pid' instead of starting/stopping cleanly, effectively blocking restart unless users manually delete the state file.
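
One way to resolve this, sketched under assumptions, is to make the readers tolerate a state file that holds only last_indexed. The helpers remove_runtime_state and read_pid are hypothetical stand-ins; the real _remove_state(), start(), and stop() bodies are not shown in this review.

```python
from __future__ import annotations

import json
from pathlib import Path


def remove_runtime_state(state_file: Path) -> None:
    """Drop runtime keys but keep last_indexed (hypothetical variant of _remove_state)."""
    try:
        existing = json.loads(state_file.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        existing = {}
    if isinstance(existing, dict) and "last_indexed" in existing:
        state_file.write_text(json.dumps({"last_indexed": existing["last_indexed"]}))
    else:
        state_file.unlink(missing_ok=True)


def read_pid(state_file: Path) -> int | None:
    """Return the recorded pid, or None when the file records no running process."""
    try:
        state = json.loads(state_file.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    if not isinstance(state, dict):
        return None
    pid = state.get("pid")  # .get avoids KeyError when only last_indexed remains
    return pid if isinstance(pid, int) else None
```

With this shape, start() and stop() could treat a None pid as "not running" instead of crashing.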


    def _handle_delete(self, path: Path) -> None:
        path_str = str(path)
        with self._lock:
            memory_id = self._path_to_memory_id.pop(path_str, path.stem)

P1: Delete indexed memory by tracked ID after restart

_handle_delete() falls back to path.stem when _path_to_memory_id has no entry, but that map is only filled by _handle_add_or_update() and is empty on process start. If an existing file was indexed under a different frontmatter id (supported by this feature), deleting it before a modify/add event in this process removes the wrong key and leaves stale rows in the search index until a full reindex.
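
A possible mitigation is to seed the path-to-ID map during the startup scan, so that deletes arriving before any add/modify event still resolve to the indexed key. The sketch below is an assumption: read_frontmatter_id and build_path_index are hypothetical helpers, and the simple "id:" line parser stands in for whatever frontmatter parser the project actually uses.

```python
from __future__ import annotations

from pathlib import Path


def read_frontmatter_id(path: Path) -> str | None:
    """Return the id from a leading '---' frontmatter block, if present."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "id":
            return value.strip().strip("\"'") or None
    return None


def build_path_index(root: Path) -> dict[str, str]:
    """Map each .md path to its frontmatter id, falling back to the filename stem."""
    index: dict[str, str] = {}
    for path in root.rglob("*.md"):
        index[str(path)] = read_frontmatter_id(path) or path.stem
    return index
```

The watcher could load the result into _path_to_memory_id before it starts consuming events.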


@Obsidian68 Obsidian68 merged commit a85a16a into feat/integration May 3, 2026