
Core: Coalesce consecutive position deletes into range inserts (Iceberg V2) #16052

Draft
Baunsgaard wants to merge 1 commit into apache:main from Baunsgaard:coalesce-deletes-range-consumer

@Baunsgaard Baunsgaard commented Apr 19, 2026

Add PositionDeleteRangeConsumer, a small stateless utility that walks a sequence of positions and dispatches consecutive runs as a single PositionDeleteIndex.delete(start, end) call instead of one delete(pos) per position. Sorted or partially sorted input yields maximal coalescing; unsorted input simply flushes more often with negligible overhead (one comparison per position).

This optimisation primarily benefits Iceberg V2 tables, where positional delete files are read row-by-row through Deletes.toPositionIndex(); V3 deletion vectors are already loaded as a serialised RoaringBitmap and bypass this path entirely.
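The coalescing described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RangeCoalescingSketch` and its `Sink` interface are hypothetical stand-ins for `PositionDeleteRangeConsumer` and `PositionDeleteIndex`, and the range end is assumed to be exclusive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from the PR.
final class RangeCoalescingSketch {

  /** Stand-in for the two delete entry points of a position delete index. */
  interface Sink {
    void delete(long pos);

    // Assumption: half-open range [start, end).
    void delete(long start, long end);
  }

  /** Walks positions in iteration order and emits each consecutive run as one call. */
  static void forEach(Iterable<Long> positions, Sink sink) {
    boolean open = false; // true once a run has been started
    long start = 0;
    long last = 0;
    for (long pos : positions) { // each Long is unboxed here
      if (open && pos - last == 1) {
        last = pos; // extends the current run: one subtract-and-compare
        continue;
      }
      if (open) {
        flush(start, last, sink); // run broken (gap or unsorted input)
      }
      start = pos;
      last = pos;
      open = true;
    }
    if (open) {
      flush(start, last, sink);
    }
  }

  private static void flush(long start, long last, Sink sink) {
    if (start == last) {
      sink.delete(start); // singleton run
    } else {
      sink.delete(start, last + 1); // inclusive last -> exclusive end
    }
  }

  /** Usage helper: records the calls a given input would produce. */
  static List<String> trace(Long... positions) {
    List<String> calls = new ArrayList<>();
    forEach(Arrays.asList(positions), new Sink() {
      @Override
      public void delete(long pos) {
        calls.add("delete(" + pos + ")");
      }

      @Override
      public void delete(long start, long end) {
        calls.add("delete(" + start + ", " + end + ")");
      }
    });
    return calls;
  }
}
```

With this sketch, `trace(1L, 2L, 3L, 7L, 9L, 10L)` records three calls instead of six, and a strictly descending input degrades to one call per position.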

forEach(Iterable<Long>, ...): the path wired into Deletes.toPositionIndex()

| Scenario | Baseline (ms) | Consumer (ms) | Speedup |
|---|---|---|---|
| FULL (contiguous) | 69.92 | 14.18 | 4.93x |
| MEDIUM (~64 + gap) | 70.40 | 16.96 | 4.15x |
| SHORT (~4 + gap) | 71.73 | 57.32 | 1.25x |
| 50% sparse random | 78.16 | 81.16 | 0.96x |
| NONE (step=2) | 71.56 | 74.89 | 0.96x |

forEach(long[], ...): for callers that already hold a primitive array

| Scenario | Baseline (ms) | Consumer (ms) | Speedup |
|---|---|---|---|
| FULL (contiguous) | 54.94 | 3.38 | 16.25x |
| MEDIUM (~64 + gap) | 54.95 | 5.75 | 9.55x |
| SHORT (~4 + gap) | 56.45 | 44.56 | 1.27x |
| 50% sparse random | 63.00 | 68.21 | 0.92x |
| NONE (step=2) | 56.35 | 58.68 | 0.96x |

Methodology: 4 back-to-back passes of (30 warmup + 51 measured iterations,
trimmed mean dropping the top/bottom 10) in the same JVM. The first pass
serves as a global JIT warm-up; the reported numbers are the average of passes
2-4. Trimmed standard deviation across the measurement window was <= 1 ms in
every cell. Higher speedup is better.
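For readers who want to reproduce the measurement style, one pass could look roughly like the sketch below. It is not the benchmark harness used for this PR: `TrimmedMeanSketch` is a hypothetical name, and "dropping the top/bottom 10" is read here as discarding the 10 slowest and 10 fastest samples, which is an assumption.

```java
import java.util.Arrays;

// Hypothetical sketch of one measurement pass; not the harness behind the tables.
final class TrimmedMeanSketch {

  /** Mean of the samples after discarding the `trim` smallest and `trim` largest. */
  static double trimmedMean(double[] samples, int trim) {
    if (trim < 0 || samples.length <= 2 * trim) {
      throw new IllegalArgumentException("need more than 2 * trim samples");
    }
    double[] sorted = samples.clone(); // leave the caller's array untouched
    Arrays.sort(sorted);
    double sum = 0;
    for (int i = trim; i < sorted.length - trim; i++) {
      sum += sorted[i];
    }
    return sum / (sorted.length - 2 * trim);
  }

  /** Runs warmup iterations, then times `measured` iterations in milliseconds. */
  static double measurePass(Runnable task, int warmup, int measured, int trim) {
    for (int i = 0; i < warmup; i++) {
      task.run(); // results discarded; lets the JIT compile the hot path
    }
    double[] millis = new double[measured];
    for (int i = 0; i < measured; i++) {
      long t0 = System.nanoTime();
      task.run();
      millis[i] = (System.nanoTime() - t0) / 1e6;
    }
    return trimmedMean(millis, trim);
  }
}
```

Under that reading, a pass as described would be `measurePass(task, 30, 51, 10)`, repeated four times with the first result discarded.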

Position-delete files are sorted by (file_path, pos), so real workloads
land on the FULL/MEDIUM/SHORT rows where the change yields 1.25x–4.93x
on the boxed Iterable<Long> path that Deletes.toPositionIndex()
actually uses, and 1.27x–16.25x on the primitive long[] overload
that future call sites can opt into.

Worst-case adversarial inputs (NONE, 50% sparse, no runs to coalesce at
all) are within ~4-8% of baseline on both overloads. The overhead is
one extra subtract-and-compare per position, more than recovered by the
wins on realistic sorted input.

The long[] overload's absolute numbers are lower than those of the
Iterable<Long> overload because the iterable path also pays for boxing a
Long per element (15–20 ms of the ~70 ms baseline). The
consumer cannot eliminate that cost, but it does cut the per-position
bitmap work proportionally.
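To illustrate why the primitive path avoids the boxing cost, a `long[]` walk can find each run with plain index arithmetic and never allocate a `Long`. The sketch below is an assumption-laden simplification, not this PR's overload: `PrimitiveRangeSketch` and `RangeSink` are hypothetical names, the range end is assumed exclusive, and singletons are dispatched as length-1 ranges for brevity.

```java
// Hypothetical sketch; the PR's long[] overload may be structured differently.
final class PrimitiveRangeSketch {

  @FunctionalInterface
  interface RangeSink {
    // Assumption: half-open range [start, end).
    void delete(long start, long end);
  }

  /** Emits one call per consecutive run and returns the number of calls made. */
  static int forEach(long[] positions, RangeSink sink) {
    int calls = 0;
    int i = 0;
    while (i < positions.length) {
      int j = i + 1;
      // Advance j while the next element continues the run; no boxing involved.
      while (j < positions.length && positions[j] - positions[j - 1] == 1) {
        j++;
      }
      sink.delete(positions[i], positions[j - 1] + 1);
      calls++;
      i = j; // next run starts where this one ended
    }
    return calls;
  }
}
```

For the input `{0, 1, 2, 3, 10, 11, 20}` this makes three calls covering seven positions in total.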

@github-actions github-actions Bot added the core label Apr 19, 2026
@Baunsgaard Baunsgaard force-pushed the coalesce-deletes-range-consumer branch from 06105f9 to 8130781 Compare April 19, 2026 18:08
@Baunsgaard Baunsgaard marked this pull request as draft April 21, 2026 08:02
@Baunsgaard Baunsgaard force-pushed the coalesce-deletes-range-consumer branch 2 times, most recently from e16d218 to fa10273 Compare April 22, 2026 09:54
Add PositionDeleteRangeConsumer that coalesces runs of consecutive
positions into a single delete(start, end) call, and use it from
Deletes.toPositionIndex() so sorted position delete files are inserted
into the bitmap as ranges instead of one position at a time.
@Baunsgaard Baunsgaard force-pushed the coalesce-deletes-range-consumer branch from fa10273 to 24545db Compare April 22, 2026 15:18