Speed up initial in-memory Soroban state population #5252

Open
drebelsky wants to merge 3 commits into stellar:master from drebelsky:faster-in-memory-pop

drebelsky commented May 5, 2026

Related to #4902. State churn has continued since then, so population now takes ~70s on a dev watcher. This PR changes the live-state calculation from walking the buckets one by one with a hash map to a k-way merge across all buckets. The merge uses a loser tree, which needs about half as many comparisons as a heap. On a dev watcher, population time drops from ~70s to ~30s.
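As a rough illustration of the technique named above (not the code in this PR), here is a minimal loser-tree k-way merge over sorted buckets. `Entry`, `Bucket`, `LoserTree`, and `currentLiveEntries` are hypothetical names, keys are plain integers, and the sketch assumes bucket 0 is the newest, so that on equal keys the lower bucket index wins and shadows older versions. Each replay after a pop does one comparison per tree level, whereas a binary heap sift-down can need two per level.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bucket entry (not stellar-core's BucketEntry).
struct Entry {
    int key;
    std::string value;
    bool live; // false models a dead (deleted) entry
};

// Assumption: each bucket is sorted by key with unique keys; index 0 is newest.
using Bucket = std::vector<Entry>;

class LoserTree {
  public:
    // Keeps a reference to `buckets`; the caller must keep them alive.
    explicit LoserTree(std::vector<Bucket> const& buckets)
        : mBuckets(buckets), mPos(buckets.size(), 0), mTree(buckets.size(), 0) {
        std::size_t k = mBuckets.size();
        if (k == 0) return;
        // Nodes k..2k-1 are leaves (source = node - k); nodes 1..k-1 are internal.
        std::vector<std::size_t> winners(2 * k);
        for (std::size_t n = 2 * k - 1; n >= 1; --n) {
            if (n >= k) { winners[n] = n - k; continue; }
            std::size_t a = winners[2 * n], b = winners[2 * n + 1];
            bool aWins = beats(a, b);
            winners[n] = aWins ? a : b;
            mTree[n] = aWins ? b : a; // internal nodes remember the loser
        }
        mTree[0] = winners[1]; // slot 0 holds the overall winner
    }

    bool done() const { return mBuckets.empty() || exhausted(mTree[0]); }

    Entry const& top() const {
        assert(!done());
        return mBuckets[mTree[0]][mPos[mTree[0]]];
    }

    // Advance the winning source, then replay only its leaf-to-root path.
    void pop() {
        std::size_t k = mBuckets.size();
        std::size_t cur = mTree[0];
        ++mPos[cur];
        for (std::size_t n = (cur + k) / 2; n >= 1; n /= 2)
            if (beats(mTree[n], cur)) std::swap(mTree[n], cur);
        mTree[0] = cur;
    }

  private:
    bool exhausted(std::size_t s) const { return mPos[s] >= mBuckets[s].size(); }

    // True if source a's head should be emitted before source b's head.
    bool beats(std::size_t a, std::size_t b) const {
        if (exhausted(a)) return false;
        if (exhausted(b)) return true;
        int ka = mBuckets[a][mPos[a]].key, kb = mBuckets[b][mPos[b]].key;
        return ka != kb ? ka < kb : a < b; // tie: newer bucket first
    }

    std::vector<Bucket> const& mBuckets;
    std::vector<std::size_t> mPos;  // read position per bucket
    std::vector<std::size_t> mTree; // [0] winner, [1..k-1] losers
};

// Emit only the newest version of each key, and only if that version is live.
std::vector<Entry> currentLiveEntries(std::vector<Bucket> const& buckets) {
    std::vector<Entry> out;
    std::optional<int> lastKey;
    for (LoserTree tree(buckets); !tree.done(); tree.pop()) {
        Entry const& e = tree.top();
        if (lastKey && *lastKey == e.key) continue; // shadowed by newer bucket
        lastKey = e.key;
        if (e.live) out.push_back(e);
    }
    return out;
}
```

In this sketch the only working state besides the output is one read position and one tree slot per bucket, plus the last key seen.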

Timings for 3 runs each, patch (first block) vs. upstream (second block)
26.0.2-3192.9b5bee752.noble~do~not~use~in~prd~perftests 300

2026-05-05T18:09:57.683 GB4ZO [Perf INFO] Populated in-memory Soroban state in 31.077 sec
2026-05-05T18:09:57.683 GB4ZO [Perf INFO] Startup state load took 32.777 sec (full=true)

2026-05-05T18:10:33.150 GA6IH [Perf INFO] Populated in-memory Soroban state in 30.273 sec
2026-05-05T18:10:33.151 GA6IH [Perf INFO] Startup state load took 31.619 sec (full=true)

2026-05-05T18:11:08.694 GDECF [Perf INFO] Populated in-memory Soroban state in 30.091 sec
2026-05-05T18:11:08.698 GDECF [Perf INFO] Startup state load took 31.317 sec (full=true)

26.0.2-3190.8a71e20af.noble~perftests 300

2026-05-05T18:05:38.067 GDDEI [Perf INFO] Populated in-memory Soroban state in 68.910 sec
2026-05-05T18:05:38.067 GDDEI [Perf INFO] Startup state load took 70.809 sec (full=true)

2026-05-05T18:07:14.023 GDYRH [Perf INFO] Populated in-memory Soroban state in 70.991 sec
2026-05-05T18:07:14.023 GDYRH [Perf INFO] Startup state load took 73.228 sec (full=true)

2026-05-05T18:08:35.801 GA4XC [Perf INFO] Populated in-memory Soroban state in 74.710 sec
2026-05-05T18:08:35.801 GA4XC [Perf INFO] Startup state load took 76.958 sec (full=true)

The k-way merge also scales better in memory than the current approach: memory use grows with the live state plus the number of buckets, rather than with churn.
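For contrast, a bucket-by-bucket scan with a hash set might look like the sketch below. This is an assumed simplification of the existing approach, not the actual stellar-core code; `Entry`, `Bucket`, `ScanResult`, and `scanBucketByBucket` are hypothetical names. The point is that the `seen` set must remember every key encountered, including keys whose newest version is dead, so its size tracks everything that has ever been touched rather than only what is live.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in types, as in a toy model of the bucket list.
struct Entry {
    int key;
    std::string value;
    bool live; // false models a dead (deleted) entry
};
using Bucket = std::vector<Entry>; // assumption: index 0 is the newest bucket

struct ScanResult {
    std::vector<Entry> live;  // newest live version of each key
    std::size_t seenKeys = 0; // size of the dedup set at the end of the scan
};

// Walk buckets newest to oldest; the first sighting of a key is its newest version.
ScanResult scanBucketByBucket(std::vector<Bucket> const& buckets) {
    std::unordered_set<int> seen;
    ScanResult r;
    for (Bucket const& bucket : buckets) {
        for (Entry const& e : bucket) {
            if (!seen.insert(e.key).second) continue; // older version, skip
            if (e.live) r.live.push_back(e);          // dead keys stay in `seen`
        }
    }
    r.seenKeys = seen.size();
    assert(r.live.size() <= r.seenKeys); // dedup set is never smaller than output
    return r;
}
```

With one deleted key among four distinct keys, this sketch returns three live entries while holding four keys in the set; the gap widens as more keys are created and deleted.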

Additionally, the PR disables bucket merges until after the in-memory state is populated.
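The resulting startup order can be pictured with the toy sketch below. The method names `assumeState`, `restartMerges`, and `initializeStateFromSnapshot` appear in the review summary of this PR, but the types, signatures, and the `fullStartup` function here are hypothetical stand-ins that only record call order; they are not stellar-core's real interfaces.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in; records calls instead of touching real buckets.
struct FakeBucketManager {
    std::vector<std::string> calls;
    void assumeState() { calls.push_back("assumeState"); }     // no merge restart here
    void restartMerges() { calls.push_back("restartMerges"); } // now a separate step
};

// Hypothetical stand-in for the in-memory Soroban state.
struct FakeSorobanState {
    void initializeStateFromSnapshot(std::vector<std::string>& calls) {
        calls.push_back("initializeStateFromSnapshot");
    }
};

// Assumed ordering in full startup mode: load, populate, then restart merges.
std::vector<std::string> fullStartup() {
    FakeBucketManager buckets;
    FakeSorobanState soroban;
    buckets.assumeState();
    soroban.initializeStateFromSnapshot(buckets.calls);
    buckets.restartMerges();
    assert(buckets.calls.back() == "restartMerges"); // merges restart last
    return buckets.calls;
}
```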

Copilot AI review requested due to automatic review settings May 5, 2026 18:17
Copilot AI left a comment

Pull request overview

This PR speeds up startup-time reconstruction of the in-memory Soroban state by changing live-state discovery from per-bucket deduping to a merged scan across all buckets, and by deferring bucket-merge restart until after full state population. It fits into the ledger/bucket startup path that rebuilds Soroban state from the BucketList on node startup.

Changes:

  • Replace initializeStateFromSnapshot’s per-type bucket scans with a new “current live entries” scan that returns only the latest live version of each key.
  • Add bucket-snapshot support for k-way merged live-entry scanning, including a new ledger-key comparator used by the loser-tree merge.
  • Split bucket merge restart out of assumeState and invoke it later in full startup mode.
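To illustrate the kind of 3-way comparator mentioned in the list above, here is a small sketch over a simplified key. `KeyType`, `Key`, and `compareKeys` are hypothetical; the real `LedgerKey` is an XDR union with more variants and fields, and the PR's ordering may differ. A 3-way result lets a merge tell "less", "equal", and "greater" apart with one call, which is convenient when equal keys from different buckets need to be detected and deduplicated.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified key: a type tag plus an opaque identifier.
enum class KeyType { ContractData = 0, ContractCode = 1, Ttl = 2 };

struct Key {
    KeyType type;
    std::string id;
};

// Returns a negative value if a < b, 0 if equal, a positive value if a > b.
// Assumed ordering: by type tag first, then by identifier bytes.
int compareKeys(Key const& a, Key const& b) {
    if (a.type != b.type) return a.type < b.type ? -1 : 1;
    int c = a.id.compare(b.id);
    return (c > 0) - (c < 0); // normalize to -1, 0, or 1
}

// Usage example: type dominates, then the identifier breaks ties.
void exampleUsage() {
    Key data{KeyType::ContractData, "zz"};
    Key code{KeyType::ContractCode, "aa"};
    assert(compareKeys(data, code) < 0); // ContractData sorts before ContractCode
    assert(compareKeys(code, data) > 0); // antisymmetric
    assert(compareKeys(data, data) == 0);
}
```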

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/ledger/LedgerManagerImpl.cpp | Defers restarting bucket merges until after full Soroban state setup. |
| src/ledger/InMemorySorobanState.cpp | Switches snapshot initialization to current-live scans for Soroban entry types. |
| src/ledger/ImmutableLedgerView.h | Exposes a new current-live scan API on immutable/apply ledger views. |
| src/ledger/ImmutableLedgerView.cpp | Wires the new ledger-view scan API to the live bucket snapshot. |
| src/bucket/LedgerCmp.h | Declares a 3-way comparator for LedgerKey ordering. |
| src/bucket/LedgerCmp.cpp | Implements LedgerKey comparison logic used by merged scanning. |
| src/bucket/BucketManager.h | Adds an explicit restartMerges API. |
| src/bucket/BucketManager.cpp | Refactors merge restart out of assumeState into a separate method. |
| src/bucket/BucketListSnapshot.h | Adds snapshot API for scanning only current live entries of a type. |
| src/bucket/BucketListSnapshot.cpp | Implements the loser-tree/k-way merge scan over bucket entry streams. |

Comment thread src/bucket/BucketListSnapshot.cpp Outdated