fix: prevent arithmetic overflow in U64Segment encoding selection for sparse/extreme row id ranges #6516
Are you actually encountering this in practice? u64::MAX is huge. You would need to add 1 billion rows per second for over 500 years to get here. Or are you triggering this in some other way besides data volume? Is there some way to manually specify the row ids? Are we using u64::MAX style sentinels in some place?
Hey @westonpace, thank you for taking a look! Good points; let me explain how I encountered this problem, and sorry for not adding more context in the first place. I hit it with trivial operations on just a few rows, during my initial debugging for #6465.
Fix: inherit the flag from the existing manifest via
let orig_frag_id = row_id >> 32;
let row_offset = (row_id & 0xFFFFFFFF) as usize;
Stable row IDs are sequential integers from
let total_slots = self.max - self.min + 1; // wraps to 0 when min=0, max=u64::MAX
let range_with_holes = 24 + 4 * n_holes as usize; // overflows when n_holes ≈ 2^63
Sequential stable row IDs starting from 0 never reach these extremes in practice, so this wasn't triggered by the minimal test. The fix (promoting intermediate arithmetic to
U64Segment::from_stats_and_sequence crashes when row IDs span a large range or include values near u64::MAX. Fixes #6515
There are two independent overflow classes:
Cost estimation: n_holes() and sorted_sequence_sizes() compute range spans in u64/usize that wrap for large ranges, making infeasible encodings (RangeWithHoles, RangeWithBitmap) appear cheapest. The code then attempts to materialize billions of holes or allocate multi-exabyte bitmaps.
Exclusive-end: All range-backed encodings construct Range<u64> with stats.max + 1 as the exclusive end. When max == u64::MAX, this overflows even for small, memory-feasible sets (e.g., [u64::MAX - 3, u64::MAX - 1, u64::MAX]).
Both classes cause process aborts in debug and OOM in release. Across JNI this kills the JVM with no recoverable exception.
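The two overflow classes can be reproduced in isolation. The sketch below is not the crate's code: the free functions are hypothetical stand-ins, and only the `max - min + 1` span and the `24 + 4 * n_holes` cost formula come from the snippets quoted in the discussion. The `wrapping_*` calls mimic what plain `u64` arithmetic does in a release build, where overflow checks are off by default.

```rust
/// Slot count the way a release build computes `max - min + 1` in u64:
/// the arithmetic wraps silently (a debug build would panic instead).
fn total_slots_wrapping(min: u64, max: u64) -> u64 {
    max.wrapping_sub(min).wrapping_add(1)
}

/// Same span widened to u128, which can hold the full 2^64 slot count.
fn total_slots_wide(min: u64, max: u64) -> u128 {
    u128::from(max) - u128::from(min) + 1
}

/// Cost formula quoted in the discussion (24 + 4 * n_holes), computed with
/// checked arithmetic so that an overflow shows up as `None`.
fn range_with_holes_cost(n_holes: u64) -> Option<u64> {
    4u64.checked_mul(n_holes)?.checked_add(24)
}

/// Exclusive end of a half-open `Range<u64>` that must include `max`.
/// Not representable when `max == u64::MAX`.
fn exclusive_end(max: u64) -> Option<u64> {
    max.checked_add(1)
}
```

With `min = 0` and `max = u64::MAX`, the wrapping span is 0 while the widened span is 2^64, which is why an infeasible encoding can look free. The small set `[u64::MAX - 3, u64::MAX - 1, u64::MAX]` spans only 4 slots, yet `exclusive_end(u64::MAX)` is `None`.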
Fix
n_holes() → u128 return type: The total slot count max - min + 1 can be up to 2^64, which exceeds u64::MAX. Widening to u128 gives the correct value instead of wrapping.
sorted_sequence_sizes() → u128 arithmetic: All cost estimates computed in u128 with saturating arithmetic, then converted via usize::try_from(...).unwrap_or(usize::MAX). Infeasible encodings saturate and always lose the min() comparison.
from_stats_and_sequence() → checked_add(1) gate: exclusive_end = stats.max.checked_add(1) computed once and used as a gate for all range-backed branches. When None (i.e., max == u64::MAX), falls through to SortedArray. The bare expression stats.max + 1 no longer appears in the function.
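The three changes can be sketched together as a simplified selection routine. Everything here is an illustrative assumption rather than the PR's implementation: the `Encoding` enum, the `Stats` struct, the `clamp` and `choose` helpers, and the 8-bytes-per-value cost for the sorted array are made up for the example, and RangeWithBitmap is left out for brevity. Only the `24 + 4 * n_holes` cost, the `usize::try_from(...).unwrap_or(usize::MAX)` conversion, and the `checked_add(1)` gate are taken from the description above.

```rust
/// Hypothetical, simplified stand-ins; not the crate's actual types.
#[derive(Debug, PartialEq)]
enum Encoding {
    Range,
    RangeWithHoles,
    SortedArray,
}

/// Assumes a non-empty, sorted, de-duplicated sequence with min <= max.
struct Stats {
    min: u64,
    max: u64,
    count: u64,
}

/// Holes in [min, max], in u128 so a full-width span (2^64 slots) is exact.
fn n_holes(s: &Stats) -> u128 {
    let slots = u128::from(s.max) - u128::from(s.min) + 1;
    slots.saturating_sub(u128::from(s.count))
}

/// Clamp a u128 estimate to usize; infeasible sizes become usize::MAX.
fn clamp(cost: u128) -> usize {
    usize::try_from(cost).unwrap_or(usize::MAX)
}

fn choose(s: &Stats) -> Encoding {
    // Gate: range-backed encodings need `max + 1` as an exclusive end.
    if s.max.checked_add(1).is_none() {
        return Encoding::SortedArray;
    }
    let holes = n_holes(s);
    if holes == 0 {
        return Encoding::Range;
    }
    // 24 + 4 * n_holes is the formula quoted in the discussion; the
    // 8-bytes-per-value array cost is an assumption for illustration.
    let with_holes = clamp(4u128.saturating_mul(holes).saturating_add(24));
    let sorted_array = clamp(8u128.saturating_mul(u128::from(s.count)).saturating_add(24));
    if with_holes < sorted_array {
        Encoding::RangeWithHoles
    } else {
        Encoding::SortedArray
    }
}
```

Because the saturated estimate is `usize::MAX`, a sparse input such as three IDs spread over nearly the whole `u64` space can never win the comparison, and the routine picks the array without trying to materialize the holes.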