@rluvaton rluvaton commented Dec 31, 2025

Which issue does this PR close?

N/A

Rationale for this change

Make the row length calculation faster, which results in faster row conversion.

What changes are included in this PR?

  1. Instead of iterating over the rows and taking each length from its byte slice, the row lengths are computed directly from the offsets
  2. Added 3 new APIs for Rows (explained below)

Are these changes tested?

Yes

Are there any user-facing changes?

Yes, added 3 functions to Rows:

  • row_len - get the row length at index
  • row_len_unchecked - get the row length at index without bound checks
  • lengths - get iterator over the lengths of the rows
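
As a rough illustration of the offsets-based approach, here is a minimal sketch. `RowsSketch`, its fields, `row`, `num_rows`, and `sample_rows` are hypothetical stand-ins, not the arrow crate's actual types or signatures; only the three method names are taken from the list above. It assumes row `i` occupies `buffer[offsets[i]..offsets[i + 1]]`, so a length is a single subtraction and no byte slice has to be built.

```rust
/// Hypothetical stand-in for a row container (NOT the arrow `Rows` type):
/// one contiguous byte buffer plus `offsets`, where row `i` occupies
/// `buffer[offsets[i]..offsets[i + 1]]`.
struct RowsSketch {
    buffer: Vec<u8>,
    /// Assumed invariant: non-decreasing, with (number of rows + 1) entries.
    offsets: Vec<usize>,
}

impl RowsSketch {
    fn num_rows(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }

    /// Slice-based access: materialize the byte slice of a row.
    fn row(&self, index: usize) -> &[u8] {
        &self.buffer[self.offsets[index]..self.offsets[index + 1]]
    }

    /// Row length at `index`, read from the offsets only. Panics if out of bounds.
    fn row_len(&self, index: usize) -> usize {
        assert!(index < self.num_rows(), "row index out of bounds");
        self.offsets[index + 1] - self.offsets[index]
    }

    /// Row length at `index` without bounds checks.
    ///
    /// # Safety
    /// The caller must guarantee `index < self.num_rows()`.
    unsafe fn row_len_unchecked(&self, index: usize) -> usize {
        unsafe {
            *self.offsets.get_unchecked(index + 1) - *self.offsets.get_unchecked(index)
        }
    }

    /// Iterator over all row lengths: differences of adjacent offsets.
    fn lengths(&self) -> impl Iterator<Item = usize> + '_ {
        self.offsets.windows(2).map(|pair| pair[1] - pair[0])
    }
}

/// Hypothetical sample data: three rows of 3, 0 and 2 bytes.
fn sample_rows() -> RowsSketch {
    RowsSketch {
        buffer: vec![1, 2, 3, 4, 5],
        offsets: vec![0, 3, 3, 5],
    }
}
```

With this layout, `sample_rows().lengths()` yields the same values as calling `row(i).len()` for each row, but it only touches the offsets array.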

Related to:

@github-actions github-actions bot added the arrow Changes to the arrow crate label Dec 31, 2025
@rluvaton
Member Author

run benchmark row_format

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing improve-performance-of-rows-encoding-for-calculating-rows-length (314bccb) to 843bee2 diff
BENCH_NAME=row_format
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench row_format
BENCH_FILTER=
BENCH_BRANCH_NAME=improve-performance-of-rows-encoding-for-calculating-rows-length
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                                                         improve-performance-of-rows-encoding-for-calculating-rows-length    main
-----                                                                                                                         ----------------------------------------------------------------    ----
append_rows 10 large_list(0) of u64(0)                                                                                        1.06    679.0±9.64ns        ? ?/sec                                 1.00    641.6±3.59ns        ? ?/sec
append_rows 10 list(0) of u64(0)                                                                                              1.03    724.9±8.12ns        ? ?/sec                                 1.00    701.2±6.21ns        ? ?/sec
append_rows 4096 4096 string_dictionary(20, 0.5), string_dictionary(30, 0), string_dictionary(100, 0), i64(0)                 1.00    370.0±2.66µs        ? ?/sec                                 1.00    368.2±2.37µs        ? ?/sec
append_rows 4096 bool(0, 0.5)                                                                                                 1.00      8.6±0.15µs        ? ?/sec                                 1.00      8.6±0.04µs        ? ?/sec
append_rows 4096 bool(0.3, 0.5)                                                                                               1.00     17.0±0.16µs        ? ?/sec                                 1.00     17.0±0.16µs        ? ?/sec
append_rows 4096 i64(0)                                                                                                       1.00      7.7±0.23µs        ? ?/sec                                 1.01      7.8±0.20µs        ? ?/sec
append_rows 4096 i64(0.3)                                                                                                     1.00     15.3±0.16µs        ? ?/sec                                 1.01     15.4±0.64µs        ? ?/sec
append_rows 4096 large_list(0) of u64(0)                                                                                      1.04    169.2±1.98µs        ? ?/sec                                 1.00    163.0±1.18µs        ? ?/sec
append_rows 4096 large_list(0) sliced to 10 of u64(0)                                                                         1.05   960.2±14.65ns        ? ?/sec                                 1.00    916.8±9.70ns        ? ?/sec
append_rows 4096 list(0) of u64(0)                                                                                            1.00    165.0±0.67µs        ? ?/sec                                 1.01    166.2±1.54µs        ? ?/sec
append_rows 4096 list(0) sliced to 10 of u64(0)                                                                               1.02   1032.2±4.55ns        ? ?/sec                                 1.00   1014.1±6.97ns        ? ?/sec
append_rows 4096 string view(1..100, 0)                                                                                       1.00    114.4±1.49µs        ? ?/sec                                 1.00    114.9±1.59µs        ? ?/sec
append_rows 4096 string view(1..100, 0.5)                                                                                     1.01    103.3±4.52µs        ? ?/sec                                 1.00    102.6±0.69µs        ? ?/sec
append_rows 4096 string view(10, 0)                                                                                           1.00     52.0±0.36µs        ? ?/sec                                 1.00     52.1±1.26µs        ? ?/sec
append_rows 4096 string view(100, 0)                                                                                          1.00     75.9±1.26µs        ? ?/sec                                 1.01     76.5±0.81µs        ? ?/sec
append_rows 4096 string view(100, 0.5)                                                                                        1.00     85.3±1.02µs        ? ?/sec                                 1.00     85.5±0.61µs        ? ?/sec
append_rows 4096 string view(30, 0)                                                                                           1.00     54.1±0.24µs        ? ?/sec                                 1.01     54.4±1.76µs        ? ?/sec
append_rows 4096 string(10, 0)                                                                                                1.00     48.2±0.64µs        ? ?/sec                                 1.01     48.6±0.79µs        ? ?/sec
append_rows 4096 string(100, 0)                                                                                               1.00     71.5±0.79µs        ? ?/sec                                 1.01     72.0±0.78µs        ? ?/sec
append_rows 4096 string(100, 0.5)                                                                                             1.00     81.6±0.55µs        ? ?/sec                                 1.00     81.5±0.42µs        ? ?/sec
append_rows 4096 string(20, 0.5), string(30, 0), string(100, 0), i64(0)                                                       1.00    219.9±1.87µs        ? ?/sec                                 1.00    220.2±1.38µs        ? ?/sec
append_rows 4096 string(30, 0)                                                                                                1.00     49.3±0.15µs        ? ?/sec                                 1.00     49.4±0.38µs        ? ?/sec
append_rows 4096 string_dictionary(10, 0)                                                                                     1.01     75.3±0.62µs        ? ?/sec                                 1.00     74.7±0.52µs        ? ?/sec
append_rows 4096 string_dictionary(100, 0)                                                                                    1.01    145.5±3.81µs        ? ?/sec                                 1.00    144.1±1.76µs        ? ?/sec
append_rows 4096 string_dictionary(100, 0.5)                                                                                  1.01    108.8±1.26µs        ? ?/sec                                 1.00    108.1±1.73µs        ? ?/sec
append_rows 4096 string_dictionary(30, 0)                                                                                     1.01     78.5±0.95µs        ? ?/sec                                 1.00     78.1±0.17µs        ? ?/sec
append_rows 4096 string_dictionary_low_cardinality(10, 0)                                                                     1.01     27.2±0.52µs        ? ?/sec                                 1.00     26.9±0.51µs        ? ?/sec
append_rows 4096 string_dictionary_low_cardinality(100, 0)                                                                    1.00     46.5±0.29µs        ? ?/sec                                 1.02     47.5±1.05µs        ? ?/sec
append_rows 4096 string_dictionary_low_cardinality(30, 0)                                                                     1.00     27.2±0.24µs        ? ?/sec                                 1.01     27.3±0.17µs        ? ?/sec
append_rows 4096 u64(0)                                                                                                       1.00      7.6±0.11µs        ? ?/sec                                 1.01      7.7±0.11µs        ? ?/sec
append_rows 4096 u64(0.3)                                                                                                     1.00     14.6±0.11µs        ? ?/sec                                 1.00     14.7±0.16µs        ? ?/sec
convert_columns 10 large_list(0) of u64(0)                                                                                    1.02   964.5±34.57ns        ? ?/sec                                 1.00    945.3±4.36ns        ? ?/sec
convert_columns 10 list(0) of u64(0)                                                                                          1.00  1002.4±12.15ns        ? ?/sec                                 1.00  1002.3±16.59ns        ? ?/sec
convert_columns 4096 4096 string_dictionary(20, 0.5), string_dictionary(30, 0), string_dictionary(100, 0), i64(0)             1.01    373.4±2.62µs        ? ?/sec                                 1.00    370.5±3.57µs        ? ?/sec
convert_columns 4096 bool(0, 0.5)                                                                                             1.00      8.9±0.11µs        ? ?/sec                                 1.01      8.9±0.05µs        ? ?/sec
convert_columns 4096 bool(0.3, 0.5)                                                                                           1.00     17.3±0.14µs        ? ?/sec                                 1.00     17.2±0.11µs        ? ?/sec
convert_columns 4096 i64(0)                                                                                                   1.02      8.1±0.24µs        ? ?/sec                                 1.00      8.0±0.15µs        ? ?/sec
convert_columns 4096 i64(0.3)                                                                                                 1.00     15.5±0.30µs        ? ?/sec                                 1.00     15.5±0.23µs        ? ?/sec
convert_columns 4096 large_list(0) of u64(0)                                                                                  1.03    169.1±0.53µs        ? ?/sec                                 1.00    163.4±4.11µs        ? ?/sec
convert_columns 4096 large_list(0) sliced to 10 of u64(0)                                                                     1.05  1261.6±37.35ns        ? ?/sec                                 1.00   1206.5±6.41ns        ? ?/sec
convert_columns 4096 list(0) of u64(0)                                                                                        1.00    166.4±2.33µs        ? ?/sec                                 1.00    166.5±1.80µs        ? ?/sec
convert_columns 4096 list(0) sliced to 10 of u64(0)                                                                           1.01  1336.2±17.33ns        ? ?/sec                                 1.00   1325.0±6.66ns        ? ?/sec
convert_columns 4096 string view(1..100, 0)                                                                                   1.00    114.7±0.51µs        ? ?/sec                                 1.00    114.9±0.50µs        ? ?/sec
convert_columns 4096 string view(1..100, 0.5)                                                                                 1.00    103.0±0.83µs        ? ?/sec                                 1.00    103.2±1.40µs        ? ?/sec
convert_columns 4096 string view(10, 0)                                                                                       1.01     53.0±0.23µs        ? ?/sec                                 1.00     52.5±0.24µs        ? ?/sec
convert_columns 4096 string view(100, 0)                                                                                      1.00     77.1±0.88µs        ? ?/sec                                 1.01     77.9±1.87µs        ? ?/sec
convert_columns 4096 string view(100, 0.5)                                                                                    1.00     86.2±1.09µs        ? ?/sec                                 1.00     86.1±0.41µs        ? ?/sec
convert_columns 4096 string view(30, 0)                                                                                       1.01     55.3±2.25µs        ? ?/sec                                 1.00     54.7±1.14µs        ? ?/sec
convert_columns 4096 string(10, 0)                                                                                            1.00     48.1±0.38µs        ? ?/sec                                 1.01     48.8±0.74µs        ? ?/sec
convert_columns 4096 string(100, 0)                                                                                           1.00     72.5±1.56µs        ? ?/sec                                 1.00     72.2±0.68µs        ? ?/sec
convert_columns 4096 string(100, 0.5)                                                                                         1.00     82.2±0.41µs        ? ?/sec                                 1.00     82.0±0.38µs        ? ?/sec
convert_columns 4096 string(20, 0.5), string(30, 0), string(100, 0), i64(0)                                                   1.00    220.9±1.82µs        ? ?/sec                                 1.01    223.3±3.11µs        ? ?/sec
convert_columns 4096 string(30, 0)                                                                                            1.00     49.7±0.20µs        ? ?/sec                                 1.00     49.8±0.51µs        ? ?/sec
convert_columns 4096 string_dictionary(10, 0)                                                                                 1.00     76.8±1.09µs        ? ?/sec                                 1.00     76.7±0.82µs        ? ?/sec
convert_columns 4096 string_dictionary(100, 0)                                                                                1.01    147.5±1.48µs        ? ?/sec                                 1.00    145.8±1.12µs        ? ?/sec
convert_columns 4096 string_dictionary(100, 0.5)                                                                              1.00    109.8±0.74µs        ? ?/sec                                 1.00    109.3±1.28µs        ? ?/sec
convert_columns 4096 string_dictionary(30, 0)                                                                                 1.00     79.4±0.41µs        ? ?/sec                                 1.00     79.5±0.38µs        ? ?/sec
convert_columns 4096 string_dictionary_low_cardinality(10, 0)                                                                 1.01     28.3±0.33µs        ? ?/sec                                 1.00     27.9±0.28µs        ? ?/sec
convert_columns 4096 string_dictionary_low_cardinality(100, 0)                                                                1.00     47.5±0.15µs        ? ?/sec                                 1.00     47.5±1.34µs        ? ?/sec
convert_columns 4096 string_dictionary_low_cardinality(30, 0)                                                                 1.01     28.3±0.45µs        ? ?/sec                                 1.00     28.1±0.67µs        ? ?/sec
convert_columns 4096 u64(0)                                                                                                   1.00      7.8±0.13µs        ? ?/sec                                 1.00      7.9±0.12µs        ? ?/sec
convert_columns 4096 u64(0.3)                                                                                                 1.00     14.9±0.08µs        ? ?/sec                                 1.00     14.8±0.13µs        ? ?/sec
convert_columns_prepared 10 large_list(0) of u64(0)                                                                           1.08    759.0±8.64ns        ? ?/sec                                 1.00   700.4±11.06ns        ? ?/sec
convert_columns_prepared 10 list(0) of u64(0)                                                                                 1.06   802.5±21.76ns        ? ?/sec                                 1.00    756.5±7.68ns        ? ?/sec
convert_columns_prepared 4096 4096 string_dictionary(20, 0.5), string_dictionary(30, 0), string_dictionary(100, 0), i64(0)    1.00    370.8±3.83µs        ? ?/sec                                 1.00   370.1±16.68µs        ? ?/sec
convert_columns_prepared 4096 bool(0, 0.5)                                                                                    1.00      8.8±0.05µs        ? ?/sec                                 1.00      8.7±0.08µs        ? ?/sec
convert_columns_prepared 4096 bool(0.3, 0.5)                                                                                  1.00     17.2±0.22µs        ? ?/sec                                 1.00     17.2±0.12µs        ? ?/sec
convert_columns_prepared 4096 i64(0)                                                                                          1.00      7.8±0.14µs        ? ?/sec                                 1.00      7.8±0.13µs        ? ?/sec
convert_columns_prepared 4096 i64(0.3)                                                                                        1.01     15.5±0.15µs        ? ?/sec                                 1.00     15.4±0.23µs        ? ?/sec
convert_columns_prepared 4096 large_list(0) of u64(0)                                                                         1.04    169.5±0.54µs        ? ?/sec                                 1.00    162.7±0.48µs        ? ?/sec
convert_columns_prepared 4096 large_list(0) sliced to 10 of u64(0)                                                            1.05  1049.8±22.05ns        ? ?/sec                                 1.00  1001.7±38.20ns        ? ?/sec
convert_columns_prepared 4096 list(0) of u64(0)                                                                               1.00    165.4±0.67µs        ? ?/sec                                 1.00    166.2±0.84µs        ? ?/sec
convert_columns_prepared 4096 list(0) sliced to 10 of u64(0)                                                                  1.04  1153.0±12.41ns        ? ?/sec                                 1.00  1106.6±29.55ns        ? ?/sec
convert_columns_prepared 4096 string view(1..100, 0)                                                                          1.00    114.8±1.38µs        ? ?/sec                                 1.00    114.6±0.41µs        ? ?/sec
convert_columns_prepared 4096 string view(1..100, 0.5)                                                                        1.00    103.0±0.85µs        ? ?/sec                                 1.00    102.9±0.90µs        ? ?/sec
convert_columns_prepared 4096 string view(10, 0)                                                                              1.01     52.6±0.42µs        ? ?/sec                                 1.00     52.2±0.96µs        ? ?/sec
convert_columns_prepared 4096 string view(100, 0)                                                                             1.00     76.1±0.73µs        ? ?/sec                                 1.00     76.1±1.04µs        ? ?/sec
convert_columns_prepared 4096 string view(100, 0.5)                                                                           1.00     85.9±0.40µs        ? ?/sec                                 1.00     85.6±0.41µs        ? ?/sec
convert_columns_prepared 4096 string view(30, 0)                                                                              1.00     54.0±1.03µs        ? ?/sec                                 1.01     54.4±2.40µs        ? ?/sec
convert_columns_prepared 4096 string(10, 0)                                                                                   1.00     47.8±0.19µs        ? ?/sec                                 1.01     48.5±0.22µs        ? ?/sec
convert_columns_prepared 4096 string(100, 0)                                                                                  1.00     71.9±0.71µs        ? ?/sec                                 1.00     71.7±0.87µs        ? ?/sec
convert_columns_prepared 4096 string(100, 0.5)                                                                                1.01     82.4±3.36µs        ? ?/sec                                 1.00     81.9±0.30µs        ? ?/sec
convert_columns_prepared 4096 string(20, 0.5), string(30, 0), string(100, 0), i64(0)                                          1.00    220.9±3.14µs        ? ?/sec                                 1.02   224.2±13.62µs        ? ?/sec
convert_columns_prepared 4096 string(30, 0)                                                                                   1.00     49.7±0.74µs        ? ?/sec                                 1.00     49.6±0.50µs        ? ?/sec
convert_columns_prepared 4096 string_dictionary(10, 0)                                                                        1.00     75.3±0.73µs        ? ?/sec                                 1.00     75.2±1.23µs        ? ?/sec
convert_columns_prepared 4096 string_dictionary(100, 0)                                                                       1.00    144.4±1.75µs        ? ?/sec                                 1.00    144.5±2.05µs        ? ?/sec
convert_columns_prepared 4096 string_dictionary(100, 0.5)                                                                     1.02    109.3±0.57µs        ? ?/sec                                 1.00    107.6±1.28µs        ? ?/sec
convert_columns_prepared 4096 string_dictionary(30, 0)                                                                        1.00     78.8±0.26µs        ? ?/sec                                 1.00     78.4±0.66µs        ? ?/sec
convert_columns_prepared 4096 string_dictionary_low_cardinality(10, 0)                                                        1.02     27.4±0.68µs        ? ?/sec                                 1.00     27.0±0.28µs        ? ?/sec
convert_columns_prepared 4096 string_dictionary_low_cardinality(100, 0)                                                       1.00     46.8±0.29µs        ? ?/sec                                 1.00     46.6±0.21µs        ? ?/sec
convert_columns_prepared 4096 string_dictionary_low_cardinality(30, 0)                                                        1.00     27.3±0.46µs        ? ?/sec                                 1.01     27.5±0.80µs        ? ?/sec
convert_columns_prepared 4096 u64(0)                                                                                          1.00      7.8±0.14µs        ? ?/sec                                 1.01      7.8±0.17µs        ? ?/sec
convert_columns_prepared 4096 u64(0.3)                                                                                        1.00     14.7±0.13µs        ? ?/sec                                 1.00     14.8±0.26µs        ? ?/sec
convert_rows 10 large_list(0) of u64(0)                                                                                       1.03  1562.8±27.98ns        ? ?/sec                                 1.00   1520.5±8.11ns        ? ?/sec
convert_rows 10 list(0) of u64(0)                                                                                             1.02  1734.5±30.12ns        ? ?/sec                                 1.00  1706.9±37.56ns        ? ?/sec
convert_rows 4096 4096 string_dictionary(20, 0.5), string_dictionary(30, 0), string_dictionary(100, 0), i64(0)                1.00    298.8±5.58µs        ? ?/sec                                 1.03   306.3±14.82µs        ? ?/sec
convert_rows 4096 bool(0, 0.5)                                                                                                1.00     16.5±0.46µs        ? ?/sec                                 1.01     16.6±1.30µs        ? ?/sec
convert_rows 4096 bool(0.3, 0.5)                                                                                              1.00     16.5±0.34µs        ? ?/sec                                 1.00     16.5±0.29µs        ? ?/sec
convert_rows 4096 i64(0)                                                                                                      1.00     34.8±0.34µs        ? ?/sec                                 1.00     34.7±0.13µs        ? ?/sec
convert_rows 4096 i64(0.3)                                                                                                    1.01     34.9±1.65µs        ? ?/sec                                 1.00     34.7±0.38µs        ? ?/sec
convert_rows 4096 large_list(0) of u64(0)                                                                                     1.00    268.0±9.28µs        ? ?/sec                                 1.01    269.7±1.45µs        ? ?/sec
convert_rows 4096 large_list(0) sliced to 10 of u64(0)                                                                        1.06      2.1±0.03µs        ? ?/sec                                 1.00  1960.1±24.29ns        ? ?/sec
convert_rows 4096 list(0) of u64(0)                                                                                           1.00    269.7±1.08µs        ? ?/sec                                 1.00    269.6±3.09µs        ? ?/sec
convert_rows 4096 list(0) sliced to 10 of u64(0)                                                                              1.03      2.2±0.04µs        ? ?/sec                                 1.00      2.2±0.02µs        ? ?/sec
convert_rows 4096 string view(1..100, 0)                                                                                      1.01    176.5±4.01µs        ? ?/sec                                 1.00    175.4±0.73µs        ? ?/sec
convert_rows 4096 string view(1..100, 0.5)                                                                                    1.00    140.6±1.06µs        ? ?/sec                                 1.00    141.0±0.80µs        ? ?/sec
convert_rows 4096 string view(10, 0)                                                                                          1.00     83.3±0.60µs        ? ?/sec                                 1.01     84.3±3.60µs        ? ?/sec
convert_rows 4096 string view(100, 0)                                                                                         1.01    130.1±7.97µs        ? ?/sec                                 1.00    128.6±1.45µs        ? ?/sec
convert_rows 4096 string view(100, 0.5)                                                                                       1.01    119.1±0.80µs        ? ?/sec                                 1.00    117.9±1.11µs        ? ?/sec
convert_rows 4096 string view(30, 0)                                                                                          1.01     95.1±5.43µs        ? ?/sec                                 1.00     94.0±0.44µs        ? ?/sec
convert_rows 4096 string(10, 0)                                                                                               1.00     60.3±0.30µs        ? ?/sec                                 1.00     60.3±0.98µs        ? ?/sec
convert_rows 4096 string(100, 0)                                                                                              1.00    110.3±2.16µs        ? ?/sec                                 1.01    111.1±3.71µs        ? ?/sec
convert_rows 4096 string(100, 0.5)                                                                                            1.00    103.3±0.84µs        ? ?/sec                                 1.00    103.8±3.52µs        ? ?/sec
convert_rows 4096 string(20, 0.5), string(30, 0), string(100, 0), i64(0)                                                      1.00    302.0±3.21µs        ? ?/sec                                 1.02    307.4±7.47µs        ? ?/sec
convert_rows 4096 string(30, 0)                                                                                               1.00     72.6±2.06µs        ? ?/sec                                 1.01     73.4±3.75µs        ? ?/sec
convert_rows 4096 string_dictionary(10, 0)                                                                                    1.00     60.4±0.76µs        ? ?/sec                                 1.00     60.7±0.65µs        ? ?/sec
convert_rows 4096 string_dictionary(100, 0)                                                                                   1.00    110.4±2.89µs        ? ?/sec                                 1.01    111.4±1.99µs        ? ?/sec
convert_rows 4096 string_dictionary(100, 0.5)                                                                                 1.00    103.6±1.39µs        ? ?/sec                                 1.00    104.1±2.15µs        ? ?/sec
convert_rows 4096 string_dictionary(30, 0)                                                                                    1.00     72.9±1.13µs        ? ?/sec                                 1.00     72.9±1.67µs        ? ?/sec
convert_rows 4096 string_dictionary_low_cardinality(10, 0)                                                                    1.00     60.3±0.73µs        ? ?/sec                                 1.00     60.3±0.19µs        ? ?/sec
convert_rows 4096 string_dictionary_low_cardinality(100, 0)                                                                   1.00    110.0±1.84µs        ? ?/sec                                 1.00    110.5±2.12µs        ? ?/sec
convert_rows 4096 string_dictionary_low_cardinality(30, 0)                                                                    1.01     73.2±4.36µs        ? ?/sec                                 1.00     72.5±0.43µs        ? ?/sec
convert_rows 4096 u64(0)                                                                                                      1.00     32.0±0.23µs        ? ?/sec                                 1.00     32.0±0.26µs        ? ?/sec
convert_rows 4096 u64(0.3)                                                                                                    1.00     32.0±0.13µs        ? ?/sec                                 1.00     32.0±0.33µs        ? ?/sec
iterate rows                                                                                                                  1.00      2.6±0.07µs        ? ?/sec                                 1.00      2.6±0.01µs        ? ?/sec
