
fix(desktop): stop out-of-band ancestor merges from poisoning the scrollback cursor #1483

Open
tlongwell-block wants to merge 3 commits into fix/timeline-cap-eviction from fix/scrollback-history-frontier


@tlongwell-block

Collaborator

Problem

Tyler's June 14 → June 9 scrollback skip in #buzz-bugs (reported in #buzz-gui-performance; screenshot shows June 12 mostly missing). Root-caused deterministically against the real 2,980-event dataset: ancestor-island cursor poisoning, on main long before #1473.

  1. useLoadMissingAncestors (by-id) and useThreadReplies (subtree) merge events days older than the contiguously loaded window into the channel cache — isolated "islands".
  2. runPageOlderPass anchored its until cursor on baseline[0].created_at, assuming contiguity. With a June 9 island in cache, the scroll-up fetch pages backward from June 9 — June 13/12/11/10 are never requested.
  3. Every later pass anchors on the even-older result, so the hole never heals; only cache eviction (leave-trim/gc/restart) resets it — hence the intermittent Cmd+R recovery.
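
The three steps above can be reproduced in a few lines. This is a minimal sketch, not the repository's code: the `CachedEvent` shape and `naiveUntilCursor` are hypothetical stand-ins for the cache entries and the old `baseline[0].created_at` anchor, and day numbers stand in for unix timestamps.

```typescript
// Hypothetical, simplified cache entry; the real RelayEvent has more fields.
type CachedEvent = { id: string; created_at: number };

// Cache sorted oldest-first: a contiguous window (June 14-15) plus a June 9
// thread root that was merged by id, outside the paged window.
const baseline: CachedEvent[] = [
  { id: "root-june-9", created_at: 9 },
  { id: "a", created_at: 14 },
  { id: "b", created_at: 15 },
];

// Old anchor as described above: the oldest cached event, assumed contiguous.
function naiveUntilCursor(events: CachedEvent[]): number | undefined {
  return events[0]?.created_at;
}

const until = naiveUntilCursor(baseline); // 9

// The next page asks for events older than `until`, so any day between the
// island and the real frontier (June 14) is never requested.
const skippedDays = [10, 11, 12, 13].filter(
  (day) => until !== undefined && day > until,
);

console.log(until, skippedDays); // 9 [ 10, 11, 12, 13 ]
```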

Fix

Out-of-band state rides on the events themselves:

  • RelayEvent.nonContiguous (local-only) — set by the new mergeNonContiguousTimelineMessages, used by both out-of-band writers. Already-cached copies are never downgraded.
  • The pager anchors on oldestContiguousHistoryTimestamp — the oldest unmarked content event.
  • The dense-second keyset seed also skips marked events (an island at the boundary second is a point, not a fetched prefix).
  • Self-healing: when contiguous paging reaches an island, the history merge's last-copy-wins dedupe swaps in the unflagged copy. No sibling state to keep atomic with the cache — the invariant holds by construction.
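
A rough sketch of how these pieces could fit together, under stated assumptions. Only the `nonContiguous` flag name comes from this PR; the `Ev` shape, the `isContent` field (standing in for the content-kind check), and the three `...Sketch` functions are hypothetical and simplified, not the actual `mergeNonContiguousTimelineMessages` or `oldestContiguousHistoryTimestamp`.

```typescript
// Hypothetical event shape; `isContent` stands in for the real content-kind predicate.
type Ev = { id: string; created_at: number; isContent: boolean; nonContiguous?: boolean };

const byTime = (a: Ev, b: Ev) => a.created_at - b.created_at;

// Out-of-band merge (sketch): mark only ids that are not already cached,
// so an existing contiguous copy is never downgraded to an island.
function mergeOutOfBandSketch(cache: Ev[], incoming: Ev[]): Ev[] {
  const known = new Set(cache.map((e) => e.id));
  const fresh = incoming
    .filter((e) => !known.has(e.id))
    .map((e) => ({ ...e, nonContiguous: true }));
  return [...cache, ...fresh].sort(byTime);
}

// Contiguous history merge (sketch): last copy wins, so a paged-in unflagged
// copy replaces a flagged island with the same id.
function mergeHistoryPageSketch(cache: Ev[], page: Ev[]): Ev[] {
  const byId = new Map<string, Ev>();
  for (const e of [...cache, ...page]) byId.set(e.id, e);
  return Array.from(byId.values()).sort(byTime);
}

// Frontier (sketch): oldest unmarked content event in an oldest-first cache.
// Returns undefined when every content event is an island; the PR's fallback
// for that case is not reproduced here.
function oldestContiguousSketch(cache: Ev[]): number | undefined {
  return cache.find((e) => e.isContent && !e.nonContiguous)?.created_at;
}

let cache: Ev[] = [
  { id: "a", created_at: 14, isContent: true },
  { id: "b", created_at: 15, isContent: true },
];
const root: Ev = { id: "root", created_at: 9, isContent: true };

// An ancestor fetch splices in a June 9 root: the frontier stays at June 14.
cache = mergeOutOfBandSketch(cache, [root]);
const frontierWithIsland = oldestContiguousSketch(cache); // 14

// Contiguous paging later reaches the island and swaps in the unflagged copy.
cache = mergeHistoryPageSketch(cache, [root, { id: "c", created_at: 10, isContent: true }]);
const frontierAfterHeal = oldestContiguousSketch(cache); // 9

// A late out-of-band response for the same id is a no-op.
cache = mergeOutOfBandSketch(cache, [root]);
const frontierAfterLateResponse = oldestContiguousSketch(cache); // 9
```

Because the flag travels with the event, the sketch needs no separate frontier variable to keep in sync with the cache, which is the property the last bullet describes.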

Validation

  • Regression tests drive the real runPageOlderPass against a relay double: red on the old anchor, green on the frontier anchor; plus island-healing and all-islands-fallback cases.
  • End-to-end under node with the full #buzz-bugs dataset (repo's own merge/format/entries/day-group modules): before — June 11 missing entirely, June 12/13 only injected fragments; after — every day from the frontier renders, Tyler's five June 12 roots present, zero relay events in the fetched span absent from cache.
  • desktop: 1498 passed / 0 failed; pnpm check clean; tsc --noEmit clean.

Stacked on #1473 (fix/timeline-cap-eviction); composes with #1478 (StrictMode-safe trim). Fixes the fetch-cursor bug; #1473/#1478 fix the eviction bugs.

npub1qyvc0c5kl4gqv2fd97fsk46tu378sqgy35vc83rvgfwne90sel7s0ed67d and others added 3 commits July 2, 2026 19:51
…ollback cursor

The older-history pager anchored its until cursor on baseline[0] — the
oldest event in the channel cache — assuming the cache is contiguous.
It is not: useLoadMissingAncestors injects thread roots fetched by id,
and useThreadReplies injects whole reply subtrees, both of which can be
days older than the contiguously loaded window. With such an island in
cache, the next scroll-up fetch pages backward from the island and the
history between the island and the real frontier is never requested.
Every later pass anchors on the even-older result, so the hole never
heals — the June 14 → June 9 day-skip, with June 12 sometimes partially
present (the injected roots themselves).

Fix: out-of-band merges mark what they splice in (RelayEvent.nonContiguous,
local-only), and the pager anchors on the oldest UNmarked content event
(oldestContiguousHistoryTimestamp). The state rides on the event, so it
is atomic with the cache write, survives trims, and heals itself: when
contiguous paging reaches an island, the history merge's last-copy-wins
dedupe replaces the flagged copy with the fresh unflagged one. The
dense-second keyset seed skips flagged events for the same reason — an
island at the boundary second is a point, not a fetched prefix.

Regression tests drive the real pager (runPageOlderPass) against a relay
double: red on the previous anchor (island skips the gap days), green on
the frontier anchor, plus island healing and the all-islands fallback.
Also verified end-to-end under node against the full 2,980-event
#buzz-bugs dataset: previously June 11 was missing entirely and June
12/13 reduced to injected fragments; with the fix every day from the
frontier renders and zero relay events in the fetched span are absent
from the cache.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

An ancestor/thread fetch can start while an event is missing, a
contiguous history page can fetch that same id unmarked, and the late
out-of-band response then merges for an id that is now contiguous.
mergeNonContiguousTimelineMessages must treat that as a no-op (it
filters incoming ids already in cache before marking), or the frontier
that contiguous paging just advanced gets re-poisoned. Pin that
invariant with a test.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

maxEventIdAtSecond seeded the bridge composite cursor from every cached
event at the boundary second, but the bridge keyset pages timeline-
content kinds only. An unmarked reaction/edit/deletion at that second
with an id later than the held content prefix would seed before_id past
unseen content rows — a same-second hole. Apply the same
isTimelineWindowContentEvent predicate the frontier helper uses, and pin
it with a regression (aux and island ids at the boundary second are both
ignored; all-aux yields no fabricated cursor).

Found in review by Wren.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
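
The boundary-second seed described in the last commit can be illustrated as follows. This is a sketch under assumptions, not the real `maxEventIdAtSecond`: the `SeedEvent` shape is hypothetical, `isContent` stands in for `isTimelineWindowContentEvent`, and ids are assumed to order as plain strings.

```typescript
// Hypothetical event shape for illustration only.
type SeedEvent = { id: string; created_at: number; isContent: boolean; nonContiguous?: boolean };

// Greatest id at `second` among unmarked content events. Returns undefined
// when nothing qualifies, so no cursor is fabricated from auxiliary events.
function maxContentIdAtSecondSketch(cache: SeedEvent[], second: number): string | undefined {
  let max: string | undefined;
  for (const e of cache) {
    if (e.created_at !== second || !e.isContent || e.nonContiguous) continue;
    if (max === undefined || e.id > max) max = e.id;
  }
  return max;
}

const atBoundary: SeedEvent[] = [
  { id: "03", created_at: 100, isContent: true },
  { id: "07", created_at: 100, isContent: true },
  { id: "0f", created_at: 100, isContent: false }, // reaction/edit/deletion
  { id: "ff", created_at: 100, isContent: true, nonContiguous: true }, // island
];

const seed = maxContentIdAtSecondSketch(atBoundary, 100); // "07"
const allAux = maxContentIdAtSecondSketch(
  [{ id: "0f", created_at: 100, isContent: false }],
  100,
); // undefined
```

Without the two exclusions the seed here would be "ff" (or "0f" with only the island excluded), which is the same-second hole the commit message describes.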

@tlongwell-block tlongwell-block left a comment

Collaborator Author

Review verdict at eb94f83: CLEAN. Independently reproduced the partial-June-12 ancestor-island failure against the 2,980-event dataset, audited all channel cache writers and cursor transitions, and verified the event-carried frontier invariants. The review found four gaps, all of which this head closes: content-only frontier selection, no downgrade on late ancestor/thread responses, marked-island exclusion from dense-second keyset seeds, and auxiliary-event exclusion from those seeds. Focused pager regressions pass; the combined #1473 + #1483 + #1478 desktop suite passes 1502/1502; combined typecheck and focused Biome checks pass. Minimality 9/10, elegance 9/10, correctness 9/10. Per the freeze, user confirmation across reload and channel switching is still required before merge. (GitHub will not accept an approval because this agent resolves to the PR owner account.)
