0.8.0 — Slice 5 (G1 structured SearchHit + FTS5 tokenizer) + Slices 0 close + HITL decisions#83
Merged
HITL-approved 0.8.0 implementation plan and its supporting corpus, committed to give the Slice 0 agent a clean baseline (cold-start §12.2 + worktree baseline §1 both need these on main).

Corpus:
- dev/plans/0.8.0-implementation.md: the approved 9-slice plan (mod-5 numbering + reserved gaps; slice-orchestrator model; per-slice design→TDD→codex→fix-N discipline; cross-cutting X1 SDK parity+functional harnesses / X2 mkdocs build / X3 per-slice docs + dev/DOC-INDEX.md).
- dev/design/0.8.0-agent-memory-fit.md: gap ladder G0–G12 + consumer fit (§9).
- dev/design/0.8.0-v05-feature-triage.md: v0.5.x add/defer/drop triage.
- dev/design/agent-memory-impl-strategy.md: per-gap leverage build plan.
- dev/adr/ADR-0.8.0-supersede-five-verb-surface-cap.md (new) + ADR-0.8.0-agent-memory-retrieval-and-identity.md (scope reclass).
- dev/roadmap/0.8.0.md: v0.5.x revival scope.
- dev/profiling/: ingest+retrieval stack profiling component guide.
- dev/plans/prompts/0.8.0-PROF-*.md: ingest/retrieval profiler slice prompts.
- dev/plans/runs/*.wf.js: the planning workflow scripts (resumable).

Slice 0 (separate agent) executes next; this commit does not start execution.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…concile mkdocs nav, author substrate ADR, advance supersession ADR; advance pointer to Slice 5

Slice 0 [design-adr], no worktree, no code. Subagents 0.a ∥ 0.b (both spawned as subagent_type: implementer, no fallback) returned PASS against their bars.
- 0.a: NEW dev/adr/ADR-0.8.0-canonical-identity-substrate.md: four substrate decisions settled (additive column shape; invalidate-not-delete; edges carry temporal cols = Q4 yes; op-store cascade); verbatim Slice-15 schema delta (logical_id+superseded_at on canonical_nodes AND canonical_edges; partial unique-active index; folded G4/G5 indexes; MIGRATION-ACCRETION-EXEMPTION; SCHEMA_VERSION 10->11); in-place additive migration policy; write_cursor-as-row-id deviation FLAGGED for HITL; shadow vec0/FTS5 reconciliation named as reserved Slice 16.
- 0.b: advanced ADR-0.8.0-supersede-five-verb-surface-cap.md -> decision-ready (Q1-Q5 = A1/B1/amend/confirm/SDK-only; conformance rewrite enumerated not executed; three guarantees carried forward); authored 0.8.0-plan.md (mod-5 ladder) + STATUS-0.8.0.md (nine §12.5 sections + X1/X2/X3 column + witness + harness contract); created dev/DOC-INDEX.md (X3); reconciled mkdocs nav (added 0.6.1; 0.8.0 stub) + mkdocs build --strict green (X2).

Adversarial review PASS. codex (--sandbox read-only) unrunnable here (bubblewrap net-namespace init failure; relaxation flags denied by harness classifier); substituted an independent adversarial subagent on the identical four-check + Slice-15 rubric. Verdict+provenance: dev/plans/runs/0.8.0-slice-0-review-20260602T115112Z.md (raw codex failure log alongside, .log).

Witness: implementer subagent-type EXISTS + selectable (supersedes stale MEMORY orchestration-execution-traps note).

Terminal Slice-0 exit = HITL gate-package sign-off (substrate gates Slice 15; supersession finalized at Slice 25).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
….0 ADRs in index
Recovery denylist (HITL decision 2026-06-02, 'five everywhere'): correct prose to
{recover,restore,repair,fix,rebuild} across bindings.md, the supersede ADR (element-3
table, preserves-verdict, Q4, Slice-15 guarantee), interface-inventory (x2), and
0.8.0-v05-feature-triage (x2). doctor is SDK-absent via the positive verb allowlist,
NOT this recovery-name denylist. The five-name enforcement artifacts
(test_no_recovery_surface py/ts/rs + AC-035d) stay byte-unchanged; prose now matches.
ADR index: add a Phase 0.8.0 section (#33-36) registering all four 0.8.0 ADRs in
ADR-0.6.0-decision-index.md, which agents must consult before scanning the tree.
Both issues surfaced by codex review across multiple rounds; final codex pass clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
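The two mechanisms distinguished above (a recovery-name denylist versus a positive verb allowlist, which is what keeps doctor out of the SDK) can be sketched roughly as follows. This is an illustration only, not the contents of test_no_recovery_surface or AC-035d: `check_surface` is a hypothetical helper, and `ALLOWED_VERBS` is a placeholder because this log does not list the real SDK verb set.

```python
# The five denylisted recovery names, as stated in the HITL decision.
RECOVERY_DENYLIST = frozenset({"recover", "restore", "repair", "fix", "rebuild"})

# Placeholder allowlist (assumption): the real verb set is not given here.
ALLOWED_VERBS = frozenset({"open", "write", "search", "close"})


def check_surface(public_names):
    """Return (denylisted, not_allowlisted) for a set of public SDK names.

    A name is denylisted when any of its underscore-separated tokens is a
    recovery name; it is not_allowlisted when it is absent from the
    positive allowlist, whatever it is called.
    """
    names = set(public_names)
    denylisted = sorted(
        n for n in names if RECOVERY_DENYLIST & set(n.lower().split("_"))
    )
    not_allowlisted = sorted(names - ALLOWED_VERBS)
    return denylisted, not_allowlisted


denylisted, not_allowlisted = check_surface({"open", "search", "doctor", "repair_index"})
print(denylisted)       # names rejected by the recovery denylist
print(not_allowlisted)  # names rejected by the positive allowlist
```

In this sketch "doctor" never appears in the denylist result; it is excluded only because the allowlist does not contain it, which mirrors the distinction the commit message draws.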
…leteness audit

codex completeness audit of Slice 0 (HEAD a42f234) rated all six substantive deliverables PRESENT/MEETS-BAR and flagged only a literal reading of the 'no worktree outstanding' criterion: raw `git worktree list` shows two worktrees while the board ledger said 'empty'. Neither is a 0.8.0 slice-managed worktree (Slice 0 is design-adr and created none):
- .claude/worktrees/agent-ad59c9d7bcc049a3d: locked prior-agent harness orphan (0.6.x commit 0debd6b, live pid);
- .claude/worktrees/corpus-work: active owner-managed corpus-expansion branch.

Make the §6 ledger precise: it tracks 0.8.0 slice-managed worktrees (none outstanding); the two pre-existing non-slice worktrees are out of scope and not Slice 0's to clean up. No worktree touched; no code; docs-only. Slice 0 remains CLOSED: deliverables complete; this only corrects board accuracy.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rompt, record step-11 plan-adjustment + SearchHit/tokenizer compat-ledger ack

Orchestrator bookkeeping (no code). Slice 5 (G1 structured SearchHit + global FTS5 tokenizer upgrade) advanced NEW→WORKTREE_CREATED:
- Worktree /tmp/fdb-slice-5-20260602T215841Z @ branch slice-5-20260602T215841Z, baseline 944cbb4 (main HEAD; re-verified before `git worktree add`).
- Self-contained 5.a implementer prompt: dev/plans/prompts/0.8.0-slice-5.md.
- Plan-adjustment (§12.4): Slice 5's NEW tokenizer migration is step_id 11 and bumps SCHEMA_VERSION 10→11 (migrate requires contiguous step_id + open guard user_version<=SCHEMA_VERSION; witnessed max step=10). This re-numbers Slice 15 to step_id 12 / 11→12 (its "step 11 / 10→11" contract is now stale; reconciled at Slice 15 close).
- Compat-ledger ack: breaking SearchHit data-class change + tokenizer recall shift accepted as documented 0.8.0 events (AC-057a-clean; no HITL sign-off needed).

Board (STATUS-0.8.0.md): §1 current slice, §2 row ⏳, §6 worktree ledger, §7 decisions, §8 next-action resume loop updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
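The renumbering follows from the two migrate rules quoted in the plan-adjustment: step ids must be contiguous, and open is guarded by user_version<=SCHEMA_VERSION. A minimal sketch of those two rules, assuming step ids start at 1 and that user_version equals the highest applied step; `next_step_id` and `pending_steps` are hypothetical names, not the engine's API:

```python
SCHEMA_VERSION = 11  # value after Slice 5's tokenizer migration


def next_step_id(existing_step_ids):
    """Return the only step_id a new migration may take: max + 1.

    Raises if the existing ids are not exactly 1..N (contiguity rule).
    """
    ids = sorted(existing_step_ids)
    if ids != list(range(1, len(ids) + 1)):
        raise ValueError(f"step ids are not contiguous from 1: {ids}")
    return len(ids) + 1


def pending_steps(user_version, schema_version=SCHEMA_VERSION):
    """Open guard: refuse DBs newer than the code, else list steps to run."""
    if user_version > schema_version:
        raise RuntimeError(
            f"user_version {user_version} exceeds SCHEMA_VERSION {schema_version}"
        )
    return list(range(user_version + 1, schema_version + 1))


# With the witnessed max step of 10, the next migration has to be step 11,
# so a later slice that planned to use 11 must move to 12.
print(next_step_id(range(1, 11)))
print(pending_steps(10))
```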
…orktree; slice agent owns its worktree and merges to main

HITL correction: the orchestrator must not create worktrees. The slice agent does implementation work on a worktree IT owns and merges its green work onto local main itself; the orchestrator works on main AFTER the merge (review → 5.b → close + advance pointer). Supersedes the orchestration.md main-thread-owns-worktree / cherry-pick mechanic for the 0.8.0 campaign.
- Removed the worktree + branch I had erroneously created (/tmp/fdb-slice-5-20260602T215841Z, slice-5-20260602T215841Z); git worktree list clean.
- Rewrote dev/plans/prompts/0.8.0-slice-5.md: §0 the slice agent creates its own worktree from live main HEAD; §5 it merges to main (no push) when green + output.json written; §7 output.json carries merged_to_main_sha.
- Board: §1 state PROMPTED, §2 row, §6 ledger note (slice-agent-owned), §7 correction + PROMPTED entries, §8 next-action resume loop reworked to operate on main post-merge.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…TS5 tokenizer upgrade Design-first: SearchHit shape (id=write_cursor interim, per-branch score, branch tag), dedup-on-body + vector-first ordering preserved, NEW step-11 drop+recreate FTS5 tokenizer migration (SCHEMA_VERSION 10->11) with open-time re-tokenization, Py+TS parity, X1/X2/X3 plan, recall-floor guardrail. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…loor across migration pr_g1_search_hits.rs: SearchResult.results is Vec<SearchHit> (id==write_cursor, kind, body, finite score, branch); no Eq derive but PartialEq retained; dedup-on-body + vector-first ordering. pr_g1_tokenizer_recall.rs: recall floor >=0.90 on a DB migrated from SCHEMA_VERSION 10 (not just fresh), pinning the no-op-on-existing-DB failure mode RED. AC-G1-hit-shape, AC-G1-no-eq, AC-G1-dedup-order, AC-FTS-tokenizer-floor. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
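The hit shape and the dedup-on-body + vector-first ordering pinned by these tests can be illustrated with a small Python sketch. It is not the engine's Rust code: `merge_hits` is a hypothetical helper, and the branch tag values "vector" and "fts" are assumptions, since the log does not spell them out.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchHit:
    id: int       # write_cursor (interim id, per the design note)
    kind: str
    body: str
    score: float  # raw per-branch score; scales differ between branches
    branch: str   # assumed tag values: "vector" or "fts"


def merge_hits(vector_hits, fts_hits):
    """Vector hits first, then FTS hits; keep the first hit for each body."""
    seen_bodies = set()
    merged = []
    for hit in [*vector_hits, *fts_hits]:
        if not math.isfinite(hit.score):
            raise ValueError(f"non-finite score on hit {hit.id}")
        if hit.body in seen_bodies:
            continue  # dedup-on-body: the earlier (vector) hit wins
        seen_bodies.add(hit.body)
        merged.append(hit)
    return merged


vector = [SearchHit(3, "note", "alpha", 0.12, "vector")]
fts = [SearchHit(3, "note", "alpha", -1.5, "fts"),
       SearchHit(7, "note", "beta", -0.9, "fts")]
for hit in merge_hits(vector, fts):
    print(hit.id, hit.branch, hit.body)
```

The float score is also why total equality cannot be derived on the Rust side: a NaN compares unequal to itself, so only partial equality is well defined.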
… upgrade + Py/TS parity
Engine: add SearchHit{id,kind,body,score,branch}; SearchResult.results ->
Vec<SearchHit>; drop Eq (f64 score); widen ReaderResponse + read_search_in_tx;
vector branch carries write_cursor+kind+vec_distance_l2 score, FTS branch
carries body+kind+write_cursor+bm25(); dedup-on-body + vector-first preserved.
Schema: NEW migration step 11 (drop+recreate search_index with tokenizer
'porter unicode61 remove_diacritics 2', accretion-exemption marker);
SCHEMA_VERSION 10->11. Engine re-tokenizes search_index from canonical source
rows on open across the step-11 boundary (projection-only, no source-record
migration) — fixes the no-op-on-existing-DB failure mode.
Bindings: fathomdb-py PySearchHit + PySearchResult.results parity + .pyi +
types.py/engine.py/__init__.py; fathomdb-napi SearchHit + SearchResult.results
parity + binding.ts NativeSearchHit + index.ts SearchHit + mapper.
Consumers: all .results readers (recall harnesses eu7/eu8, perf_gates,
projection_runtime, cursors, excise_source, fts5_injection_safety,
cursor_read_after_write) read hit.body; migration-step assertions expect step 11.
Recall floor 0.90 holds before AND after the tokenizer upgrade across the
migration (pr_g1_tokenizer_recall: before=1.000 v10, after=1.000 v11, delta 0).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
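The drop+recreate and re-tokenize sequence described above can be tried out with Python's sqlite3 module, provided the underlying SQLite build includes FTS5 (3.27 or newer for remove_diacritics 2). This is a sketch under assumptions: the canonical_nodes columns, the single-column search_index, and both helper names are illustrative, and the real engine's transaction boundaries are not reproduced here.

```python
import sqlite3

TOKENIZER = "porter unicode61 remove_diacritics 2"  # tokenizer named in step 11


def recreate_search_index(conn):
    """Drop and recreate the FTS5 index, then repopulate it from source rows."""
    with conn:  # commit on success, roll back on error
        conn.execute("DROP TABLE IF EXISTS search_index")
        conn.execute(
            "CREATE VIRTUAL TABLE search_index "
            f"USING fts5(body, tokenize = '{TOKENIZER}')"
        )
        # Projection only: canonical rows are read, never modified.
        conn.execute(
            "INSERT INTO search_index(rowid, body) "
            "SELECT write_cursor, body FROM canonical_nodes"
        )


def search(conn, query):
    """Return matching rowids, best bm25() rank first."""
    rows = conn.execute(
        "SELECT rowid FROM search_index WHERE search_index MATCH ? "
        "ORDER BY bm25(search_index)",
        (query,),
    )
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE canonical_nodes(write_cursor INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO canonical_nodes VALUES (?, ?)",
    [(1, "running shoes"), (2, "café menu")],
)
recreate_search_index(conn)
print(search(conn, "run"))   # porter stemming lets "run" match "running"
print(search(conn, "cafe"))  # diacritic folding lets "cafe" match "café"
```

Because an FTS5 table's tokenizer is fixed at creation time, existing rows have to be re-inserted after the recreate; skipping that step is the no-op-on-existing-DB failure mode the commit message refers to.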
…with cross-binding equivalence src/python/tests/test_functional_search.py + src/ts/tests/functional-search.test.ts: open a real engine, write a small corpus, search(), and assert the structured SearchHit shape end-to-end across the FFI (id/kind/body/score/branch present and typed) in both languages. Both read the SAME functional_search_fixture.json (single source of truth) and assert identical body sets per query -> cross-binding equivalence. Seed of the write->search->retrieve->admin harness later slices extend. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t-plan/DOC-INDEX rows X3: docs/reference/python-api.md + typescript-api.md document SearchResult.results: list[SearchHit]/SearchHit[] and the new SearchHit shape (id/kind/body/score/branch); new docs/guides/structured-search-hits.md usage example (Py + TS); dev/architecture.md records the structured-hit carrier + step-11 tokenizer default; dev/test-plan.md adds the SDK functional-harness tier (X1) Suite Map row; dev/DOC-INDEX.md rows updated for every touched doc. X2: mkdocs build --strict green; guides promoted to a nav section. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…r upgrade (step 11, SCHEMA_VERSION 10→11)
…h in the step-11/reproject window Reconstructs the durable post-crash artifact (user_version=11 + empty search_index) on a real on-disk DB and asserts recall recovers on reopen. Fails on current code: the boundary-crossing guard (before<11 && after>=11) sees before==11 and skips the reproject, leaving the FTS shadow empty forever (recall 0.000). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The step-11 migration commits user_version=11 with an EMPTY search_index in its own transaction; the reproject that repopulates the FTS shadow ran in a SEPARATE later transaction gated on crossing the step-11 boundary (before<11 && after>=11). A crash after step 11 commits but before the reproject commits left a durable v11 + empty index on which the next open saw before==11, skipped the reproject, and stranded the index empty forever (recall collapses to ~0). Gate repair on the ABSENCE of a durable completion marker (_fathomdb_open_state['search_index_tokenizer_reproject_complete']) written INSIDE the same transaction as the reindex DELETE+INSERT, instead of on the boundary crossing. Atomic + idempotent: a crash before commit rolls both back (no marker -> next open re-runs); a crash after finds the marker and skips. Projection-only; SCHEMA_VERSION unchanged (still 11); no new migration step. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
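The marker-gated repair can be sketched with Python's sqlite3 module. The sketch makes several assumptions: a plain table stands in for the FTS5 search_index, _fathomdb_open_state is modelled as a key/value table, and `reproject_if_needed` with its `fail_before_commit` switch is a hypothetical helper used only to simulate a crash.

```python
import sqlite3

MARKER = "search_index_tokenizer_reproject_complete"


def make_db():
    """Build the post-crash artifact: source rows present, index empty, no marker."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
    conn.executescript("""
        CREATE TABLE canonical_nodes(write_cursor INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE search_index(id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE _fathomdb_open_state(key TEXT PRIMARY KEY, value TEXT);
        INSERT INTO canonical_nodes VALUES (1, 'alpha'), (2, 'beta');
    """)
    return conn


def reproject_if_needed(conn, fail_before_commit=False):
    """Reindex unless the completion marker exists; return True if work ran."""
    done = conn.execute(
        "SELECT 1 FROM _fathomdb_open_state WHERE key = ?", (MARKER,)
    ).fetchone()
    if done:
        return False
    conn.execute("BEGIN")
    try:
        conn.execute("DELETE FROM search_index")
        conn.execute(
            "INSERT INTO search_index(id, body) "
            "SELECT write_cursor, body FROM canonical_nodes"
        )
        # The marker is written in the SAME transaction as the reindex.
        conn.execute(
            "INSERT INTO _fathomdb_open_state(key, value) VALUES (?, '1')", (MARKER,)
        )
        if fail_before_commit:
            raise RuntimeError("simulated crash before commit")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # index rows and marker vanish together
        raise
    return True


conn = make_db()
try:
    reproject_if_needed(conn, fail_before_commit=True)
except RuntimeError:
    pass
print(reproject_if_needed(conn))  # first clean open after the crash reindexes
print(reproject_if_needed(conn))  # later opens find the marker and skip
```

Gating on the marker rather than on the version boundary means the decision depends on durable state that only exists once the reindex has committed, so any number of interrupted attempts still converge on a populated index.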
The marker-absence gate runs on every v11 open. Synthetic/legacy DBs whose user_version was stamped to 11 without running our migrations lack the _fathomdb_open_state table (created in step 1), so the marker read raised 'no such table' and masked the downstream embedder-identity / dimension mismatch errors (AC-048 / AC-048b). Treat a missing _fathomdb_open_state as 'reproject complete / nothing to do' so those DBs fall through to the embedder-identity probe unchanged. A genuinely migrated (crash-affected) DB always has the table, so crash-repair is unaffected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
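One way to express the "missing table means nothing to do" rule is to probe sqlite_master before reading the marker. Whether the engine probes the catalog or handles the 'no such table' error is not stated in this log, so the approach below is an assumption; `reproject_pending` and the key/value layout of _fathomdb_open_state are hypothetical.

```python
import sqlite3

MARKER = "search_index_tokenizer_reproject_complete"


def reproject_pending(conn):
    """Return True only when the state table exists and the marker is absent."""
    has_table = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type = 'table' AND name = '_fathomdb_open_state'"
    ).fetchone()
    if has_table is None:
        # Synthetic/legacy DB that never ran step 1: treat as complete so
        # later checks (for example embedder identity) still get to run.
        return False
    marker = conn.execute(
        "SELECT 1 FROM _fathomdb_open_state WHERE key = ?", (MARKER,)
    ).fetchone()
    return marker is None


legacy = sqlite3.connect(":memory:")
print(reproject_pending(legacy))    # no state table: nothing to do

migrated = sqlite3.connect(":memory:")
migrated.execute("CREATE TABLE _fathomdb_open_state(key TEXT PRIMARY KEY, value TEXT)")
print(reproject_pending(migrated))  # table present, marker missing: repair needed
```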
…le (codex P1; state/marker-gated, not boundary-gated)
… (PASS after codex BLOCK→fix-1); advance pointer to Slice 10

Slice 5 merged to main by the slice agent (initial c4ab615; fix-1 e76d68b final). codex §9 review (primary, runnable here) found one [P1] crash-safety bug: the tokenizer reindex was gated on the one-time 10→11 boundary crossing, so a crash after step 11 commits (user_version=11, empty search_index) but before reproject commits left the FTS index empty forever. BLOCK→fix-1 made the reindex crash-retryable + idempotent via an atomic completion marker; codex re-review of the fix-1 diff: PASS, no findings.

Close (docs-only):
- Slice 5 CLOSED blocks in 0.8.0-implementation.md + 0.8.0-plan.md; pointer → Slice 10.
- STATUS board: §1 current slice → 10, §2 table (5 ✅ + X1/X2/X3), §3 scoreboard (G1 ✅; recall floor held 1.000/1.000 across the migration), §6 worktree ledger (slice-5 worktree REMOVED), §7 decision record, §8 next action.
- Renumber notes added: Slice 5 consumed step_id 11 / SCHEMA_VERSION 11, so Slice 15 becomes step_id 12 / SCHEMA_VERSION 11→12 (impl-plan Slice 15 heading + canonical identity ADR AUTHORIZED-delta).
- Closure output.json + both codex review verdicts (slice-diff + fix-1) committed.

Carried to HITL (environment-only, not a code defect): agent-verify.sh STRICT=1 fails AC-037 netns-deny-egress (no rootless userns in this sandbox).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… fusion_mode knob from Slice 10; AC-037 → CI at Slice 40
HITL signed the retrieval ADR (2026-06-02), unblocking Slice 10:
- Q1 = Option 1A: G9 RRF + G10 filtered-KNN both table-stakes, ship in Slice 10
(G10 uses a CLOSED SearchFilter struct; filter-grammar DSL stays Slice 35).
- Q2 = Option 2A: substrate designed bi-temporal-aware, implement single-supersession only.
- Q3 = documented-only, NO knob: RRF is the unconditional new ranking; the fusion_mode /
legacy-union escape hatch is DROPPED ("do not carry the overhead"). The entire Slice 10
contract is reconciled — every fusion_mode mention removed or marked NOT-fusion_mode;
G12-recency now gated behind a dedicated recency flag; the compat event is documented-only.
- Q4 = edges too: canonical_edges carry logical_id+superseded_at (schema-only).
- Q5 = advisory: §8d capability ladder stays advisory input, not canonical.
ADR status → accepted; added a "## HITL decisions (2026-06-02)" block.
STATUS board §5: retrieval package SIGNED (Slice 10 gate cleared); substrate package
partially signed (Q2/Q4 done; op-store cascade + migration policy + write_cursor deviation
still open before Slice 15). §1/§8 mark the Slice 10 gate cleared.
AC-037 (no-egress) disposition: can't run on windchill3 (Ubuntu 24.04 AppArmor
apparmor_restrict_unprivileged_userns=1) and runs in NO CI workflow. Accept-by-reasoning for
Slice 5 (no networking added); wire scripts/agent-security.sh into CI on a userns-permissive
runner as a NEW Slice 40 gate (n).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
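For readers unfamiliar with the RRF ranking chosen under Q1/Q3, reciprocal rank fusion combines ranked lists using only rank positions, which sidesteps the incompatible score scales of the vector and FTS branches. The sketch below is the generic technique, not the Slice 10 implementation (which this log does not show); `rrf_fuse` is a hypothetical name and k=60 is the commonly used constant, assumed here.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked id lists: score(d) = sum over lists of 1 / (k + rank_d).

    Ranks are 1-based; ids missing from a list contribute nothing for it.
    Ties are broken by id so the output order is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_ranking = [1, 2, 3]  # ids ordered by vector distance
fts_ranking = [2, 3, 4]     # ids ordered by bm25
print(rrf_fuse([vector_ranking, fts_ranking]))  # → [2, 3, 1, 4]
```

Ids 2 and 3 appear in both lists and therefore outrank id 1, even though id 1 is the top vector hit; this is the behavioural difference from the earlier vector-first union that the compat event documents.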
…t-Slice-5/fix-1); keep Slice 40 CI gate (n) HITL ran the gate on windchill3 with the AppArmor userns lockdown temporarily relaxed (kernel.apparmor_restrict_unprivileged_userns=0, restored to 1 after): AC-037 OK: all connect() syscalls were loopback / AF_UNIX / AF_NETLINK. This machine-confirms the merged Slice 5 + fix-1 code makes no network egress — upgrading AC-037 from accept-by-reasoning to confirmed. The one-time pass is point-in-time, so the continuous Slice 40 CI gate (n) is KEPT (wire agent-security.sh on a userns-permissive runner). Updated the Slice 5 close blocks (impl-plan + plan + board §1), board §5 disposition, and gate (n) note to reflect the confirmation. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…l campaign authoritative, preserve corpus artifacts Local main (0.8.0 Slices 0+5 campaign) is authoritative for all overlapping planning docs / ADRs / STATUS / code (resolved -X ours). Brings in origin's unique corpus-expansion artifacts (tests/corpus/*, dev/corpus-creation/*, dev/notes/0.8.x-corpus-*, corpus QA prompts, test_corpus_eval_qa.py). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-of-band, owner-managed) Closes the X3/Slice-40-gate-m doc-map gap for the corpus/eval files brought in by the origin/main 83f5156 integration. Marked owner-managed; owner curates. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
0.8.0 campaign: Slices 0 + 5, signed HITL decisions, corpus-line integration

Brings the local 0.8.0 campaign onto main, and integrates the out-of-band corpus-work line.

Slice 5: G1 structured SearchHit + global FTS5 tokenizer upgrade (PASS after codex BLOCK→fix-1)
- SearchResult.results: Vec<SearchHit{id=write_cursor, kind, body, score:f64, branch}>; Eq dropped (compiles). Both branches emit structured hits (vector=vec_distance_l2, FTS=bm25()); dedup-on-body + vector-first preserved.
- step_id 11 drop+recreate FTS5 tokenizer migration (accretion-exemption marker), SCHEMA_VERSION 10→11; re-tokenization wired at open.
- fix-1: tokenizer reindex made crash-retryable and idempotent via a completion marker in _fathomdb_open_state; codex re-review: PASS.

Slice 0: plan + STATUS board + DOC-INDEX + ADRs (design-adr, CLOSED)

HITL decisions recorded (2026-06-02)
Retrieval ADR: Q1=1A (G9 RRF + G10 filtered-KNN table-stakes), Q2=2A (substrate bi-temporal-aware, implement minimal), Q3=documented-only / NO fusion_mode knob, Q4=edges-too, Q5=advisory. Slice 10 contract reconciled to drop the knob. Slice 15 renumbered to step_id 12 / SCHEMA_VERSION 11→12. AC-037 → CI at Slice 40 (gate n).

Corpus-work integration
Merged origin/main 83f5156 with local authoritative for all campaign docs/ADRs/STATUS/code; preserved origin's unique corpus/eval artifacts (tests/corpus/*, dev/corpus-creation/*, dev/notes/0.8.x-corpus-*, corpus QA prompts, test_corpus_eval_qa.py) + added DOC-INDEX rows.

Verification
Engine + schema tests green (incl. pr_g1_* + crash-recovery); clippy clean; mkdocs build --strict green; Py/TS parity confirmed. Slice 5 closed; pointer → Slice 10 (gate-clear).

🤖 Generated with Claude Code