Skip to content

perf(upgrade): apply delta patch chains in memory with cached base reads#1127

Merged
BYK merged 2 commits into
mainfrom
byk/perf/delta-upgrade-in-memory-chain
Jun 23, 2026
Merged

perf(upgrade): apply delta patch chains in memory with cached base reads#1127
BYK merged 2 commits into
mainfrom
byk/perf/delta-upgrade-in-memory-chain

Conversation

@BYK

@BYK BYK commented Jun 23, 2026

Copy link
Copy Markdown
Member

Why

Delta upgrades that span multiple releases were slower than they should be. The patch-application path wrote every intermediate binary to disk, copied it to a temp file, and recomputed its SHA-256 — N-1 redundant disk round-trips and full-binary hash passes per chain — even though only the final binary needs verifying. The base binary was also loaded whole (~100 MB) into the JS heap.

What changed

Three layered improvements to patch application, all behind the existing applyPatchChain API (no caller changes):

  1. In-memory patch chains. Intermediate hops are kept in memory and fed straight into the next patch; only the final binary is streamed to disk and hashed. No more .patching.a/.b scratch files, per-hop temp copies, or discarded intermediate hashes. Because reads and writes never target the same path, the read/write truncation hazard is gone too.

  2. On-demand base reads (pread). The base ("old") binary is read through a new OldReader abstraction instead of being loaded whole into the heap. FileOldReader uses positional reads against the reflink temp copy; intermediates use a cacheless MemoryOldReader. Single-patch peak heap drops from ~100 MB to ~KB. A redundant new Uint8Array(readFile()) copy of the base is also dropped.

  3. Bounded read-ahead cache. FileOldReader keeps a single reused 1 MiB block, refilling only on a miss, with a direct-read bypass for windows larger than the block. This coalesces bsdiff's many small windowed reads (mostly forward) into ~fileSize / 1 MiB reads — neutralizing the per-window syscall overhead that on-demand reads would otherwise add to the common single-patch path. Memory stays bounded at ~1 MiB, and only for the first hop.

Files

  • src/lib/bspatch.ts — extracted transformPatch core (callback sink); added OldReader (FileOldReader with block cache + MemoryOldReader), applyPatchToMemory, and applyPatchChainInMemory; applyPatch is now a thin wrapper.
  • src/lib/delta-upgrade.tsapplyPatchChain delegates to applyPatchChainInMemory; removed the disk-intermediate alternation and scratch-file cleanup.

Testing

  • Multi-hop chain (hand-built TRDIFF10 patches via zstdCompressSync, no external zig-bsdiff needed) — exercises both reader paths and the chaining loop; asserts no scratch files on disk.
  • Memory/pread equivalence — cross-checks MemoryOldReader vs the cached FileOldReader against the real fixture patch (diff + extra + seek).
  • Block-cache coverage — multi-block spanning, large-window bypass, and exact block-boundary landing (cross-checked via SHA-256).
  • delta-upgrade.test.ts — rewrote the multi-step test to assert intermediates never touch disk.
  • e2e — added a real two-hop chain (old → new → new) using zig-bsdiff.
  • Mutation-verified the multi-hop and cache tests fail when the chain threading / blockCovers range check is broken.

pnpm run typecheck, pnpm run lint, and the affected suites (bspatch, delta-upgrade, upgrade — 233 tests) all pass.

Notes

  • Pure refactor of patch application — output is byte-identical; the final SHA-256 verification is unchanged.
  • The common single-patch path keeps its speed (the block cache offsets the on-demand reads); multi-patch chains are where the disk/hash savings land.

Multi-patch delta upgrades previously wrote each intermediate binary to disk, copied it to a temp file, and recomputed its SHA-256 — N-1 redundant disk round-trips and hash passes per chain. Patch application now keeps intermediates in memory and only persists and hashes the final binary.

The base ("old") binary is read on demand via positional reads (pread) through an OldReader abstraction instead of being loaded whole into the JS heap, dropping the single-patch peak heap from ~100 MB to ~KB. A bounded 1 MiB read-ahead cache coalesces bsdiff's many small windowed reads into a handful of block reads, avoiding a per-window syscall storm. A redundant Uint8Array copy of the base is also dropped.

- bspatch.ts: extract transformPatch core; add OldReader (fd-backed pread with block cache + in-memory) and applyPatchChainInMemory; applyPatch is now a thin wrapper
- delta-upgrade.ts: applyPatchChain delegates to applyPatchChainInMemory; remove disk-intermediate alternation and scratch-file cleanup
- tests: multi-hop chain, memory/pread equivalence, block-cache boundary coverage, no-intermediate-files assertions, e2e two-hop chain
@github-actions

github-actions Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor
PR Preview Action v1.8.1

QR code for preview link

🚀 View preview at
https://cli.sentry.dev/_preview/pr-1127/

Built to branch gh-pages at 2026-06-23 14:11 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 348a7fe. Configure here.

Comment thread src/lib/bspatch.ts
// Data is in JS heap — no temp file to clean up
},
};
return new MemoryOldReader(await readFile(oldPath));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temp copy discarded on open failure

Medium Severity

After a successful CoW copy to the temp path, if open or stat fails, the catch block unlinks that temp file and loads the base via readFile(oldPath). The previous path always read the temp copy when copy succeeded, which avoids reading the running executable and matches the ETXTBSY/AMFI rationale.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 348a7fe. Configure here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d0572ec. The catch now closes the open FileHandle (tracked outside the try) before falling back to the in-memory read of the original, so the fd is no longer leaked and the temp copy is still unlinked.

Comment thread src/lib/bspatch.ts Outdated
@github-actions

github-actions Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Codecov Results 📊

✅ Patch coverage is 89.11%. Project has 5051 uncovered lines.
✅ Project coverage is 81.33%. Comparing base (base) to head (head).

Files with missing lines (1)
File Patch % Lines
src/lib/bspatch.ts 88.89% ⚠️ 11 Missing and 9 partials
Coverage diff
@@            Coverage Diff             @@
##          main       #PR       +/-##
==========================================
+ Coverage    81.31%    81.33%    +0.02%
==========================================
  Files          392       392         —
  Lines        27009     27061       +52
  Branches     17545     17558       +13
==========================================
+ Hits         21963     22010       +47
- Misses        5046      5051        +5
- Partials      1824      1832        +8

Generated by Codecov Action

If open() succeeded but a later step (stat) threw, the catch block fell back to an in-memory read without closing the already-open FileHandle, leaking a file descriptor. Track the handle outside the try and close it in the catch before falling back. Flagged by Cursor Bugbot and Seer on #1127.
@BYK BYK merged commit 25f0f2c into main Jun 23, 2026
36 of 41 checks passed
@BYK BYK deleted the byk/perf/delta-upgrade-in-memory-chain branch June 23, 2026 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant