Backport PR #4062: perf: `numba` based aggregations for sparse data

39a29de
Merged

Backport PR #4062 on branch 1.12.x (perf: numba based aggregations for sparse data) #4064

scverse-benchmark / benchmark succeeded Apr 16, 2026 in 2h 10m 31s

Benchmark

Benchmark run successful

Details

All benchmarks:

Change Before [f2268d9] <1.12.1> After [39a29de] Ratio Benchmark (Parameter)
5.48G 5.14G 0.94 preprocessing_counts.Agg.peakmem_agg('count_nonzero')
- 4.91G 4.05G 0.82 preprocessing_counts.Agg.peakmem_agg('mean')
4.41G 4.41G 1.00 preprocessing_counts.Agg.peakmem_agg('median')
- 4.91G 4.05G 0.82 preprocessing_counts.Agg.peakmem_agg('sum')
- 5.83G 4.06G 0.70 preprocessing_counts.Agg.peakmem_agg('var')
- 884±2ms 640±2ms 0.72 preprocessing_counts.Agg.time_agg('count_nonzero')
- 550±2ms 90.2±0.5ms 0.16 preprocessing_counts.Agg.time_agg('mean')
3.30±0.02s 3.36±0.03s 1.02 preprocessing_counts.Agg.time_agg('median')
- 547±0.7ms 89.0±1ms 0.16 preprocessing_counts.Agg.time_agg('sum')
- 1.37±0s 132±2ms 0.10 preprocessing_counts.Agg.time_agg('var')
311M 312M 1.00 preprocessing_counts.FastSuite.peakmem_calculate_qc_metrics('bmmc', 'counts')
311M 312M 1.00 preprocessing_counts.FastSuite.peakmem_calculate_qc_metrics('bmmc', 'counts-off-axis')
4.11G 4.11G 1.00 preprocessing_counts.FastSuite.peakmem_calculate_qc_metrics('lung93k', 'counts')
4.11G 4.11G 1.00 preprocessing_counts.FastSuite.peakmem_calculate_qc_metrics('lung93k', 'counts-off-axis')
378M 378M 1.00 preprocessing_counts.FastSuite.peakmem_calculate_qc_metrics('pbmc3k', 'counts')
378M 377M 1.00 preprocessing_counts.FastSuite.peakmem_calculate_qc_metrics('pbmc3k', 'counts-off-axis')
289M 288M 1.00 preprocessing_counts.FastSuite.peakmem_calculate_qc_metrics('pbmc68k_reduced', 'counts')
288M 289M 1.00 preprocessing_counts.FastSuite.peakmem_calculate_qc_metrics('pbmc68k_reduced', 'counts-off-axis')
314M 314M 1.00 preprocessing_counts.FastSuite.peakmem_log1p('bmmc', 'counts')
314M 314M 1.00 preprocessing_counts.FastSuite.peakmem_log1p('bmmc', 'counts-off-axis')
4.45G 4.45G 1.00 preprocessing_counts.FastSuite.peakmem_log1p('lung93k', 'counts')
4.45G 4.45G 1.00 preprocessing_counts.FastSuite.peakmem_log1p('lung93k', 'counts-off-axis')
385M 385M 1.00 preprocessing_counts.FastSuite.peakmem_log1p('pbmc3k', 'counts')
385M 384M 1.00 preprocessing_counts.FastSuite.peakmem_log1p('pbmc3k', 'counts-off-axis')
288M 289M 1.00 preprocessing_counts.FastSuite.peakmem_log1p('pbmc68k_reduced', 'counts')
288M 288M 1.00 preprocessing_counts.FastSuite.peakmem_log1p('pbmc68k_reduced', 'counts-off-axis')
408M 408M 1.00 preprocessing_counts.FastSuite.peakmem_normalize_total('bmmc', 'counts')
406M 403M 0.99 preprocessing_counts.FastSuite.peakmem_normalize_total('bmmc', 'counts-off-axis')
4.97G 4.96G 1.00 preprocessing_counts.FastSuite.peakmem_normalize_total('lung93k', 'counts')
4.96G 4.97G 1.00 preprocessing_counts.FastSuite.peakmem_normalize_total('lung93k', 'counts-off-axis')
474M 474M 1.00 preprocessing_counts.FastSuite.peakmem_normalize_total('pbmc3k', 'counts')
474M 474M 1.00 preprocessing_counts.FastSuite.peakmem_normalize_total('pbmc3k', 'counts-off-axis')
289M 288M 1.00 preprocessing_counts.FastSuite.peakmem_normalize_total('pbmc68k_reduced', 'counts')
289M 288M 1.00 preprocessing_counts.FastSuite.peakmem_normalize_total('pbmc68k_reduced', 'counts-off-axis')
12.6±0.2ms 12.8±0.3ms 1.02 preprocessing_counts.FastSuite.time_calculate_qc_metrics('bmmc', 'counts')
12.3±0.2ms 12.5±0.2ms 1.01 preprocessing_counts.FastSuite.time_calculate_qc_metrics('bmmc', 'counts-off-axis')
2.06±0.01s 2.06±0s 1.00 preprocessing_counts.FastSuite.time_calculate_qc_metrics('lung93k', 'counts')
1.64±0s 1.61±0s 0.98 preprocessing_counts.FastSuite.time_calculate_qc_metrics('lung93k', 'counts-off-axis')
38.0±0.7ms 38.0±0.7ms 1.00 preprocessing_counts.FastSuite.time_calculate_qc_metrics('pbmc3k', 'counts')
27.9±0.9ms 28.3±1ms 1.01 preprocessing_counts.FastSuite.time_calculate_qc_metrics('pbmc3k', 'counts-off-axis')
4.72±0.05ms 4.69±0.05ms 0.99 preprocessing_counts.FastSuite.time_calculate_qc_metrics('pbmc68k_reduced', 'counts')
4.65±0.07ms 5.13±0.5ms ~1.10 preprocessing_counts.FastSuite.time_calculate_qc_metrics('pbmc68k_reduced', 'counts-off-axis')
1.53±0.01ms 1.51±0.02ms 0.99 preprocessing_counts.FastSuite.time_log1p('bmmc', 'counts')
1.64±0.01ms 1.55±0.02ms 0.95 preprocessing_counts.FastSuite.time_log1p('bmmc', 'counts-off-axis')
634±1ms 644±1ms 1.02 preprocessing_counts.FastSuite.time_log1p('lung93k', 'counts')
668±30ms 638±3ms 0.96 preprocessing_counts.FastSuite.time_log1p('lung93k', 'counts-off-axis')
7.03±0.1ms 7.15±0.07ms 1.02 preprocessing_counts.FastSuite.time_log1p('pbmc3k', 'counts')
8.45±0.1ms 7.19±0.1ms ~0.85 preprocessing_counts.FastSuite.time_log1p('pbmc3k', 'counts-off-axis')
390±4μs 386±1μs 0.99 preprocessing_counts.FastSuite.time_log1p('pbmc68k_reduced', 'counts')
- 453±3μs 387±4μs 0.85 preprocessing_counts.FastSuite.time_log1p('pbmc68k_reduced', 'counts-off-axis')
2.69±0.3ms 2.66±0.2ms 0.99 preprocessing_counts.FastSuite.time_normalize_total('bmmc', 'counts')
8.28±1ms 6.75±0.1ms ~0.82 preprocessing_counts.FastSuite.time_normalize_total('bmmc', 'counts-off-axis')
552±1ms 551±9ms 1.00 preprocessing_counts.FastSuite.time_normalize_total('lung93k', 'counts')
2.64±0.04s 2.79±0s 1.06 preprocessing_counts.FastSuite.time_normalize_total('lung93k', 'counts-off-axis')
8.66±0.6ms 8.70±0.5ms 1.01 preprocessing_counts.FastSuite.time_normalize_total('pbmc3k', 'counts')
33.7±0.6ms 33.9±0.9ms 1.01 preprocessing_counts.FastSuite.time_normalize_total('pbmc3k', 'counts-off-axis')
551±0.6μs 550±1μs 1.00 preprocessing_counts.FastSuite.time_normalize_total('pbmc68k_reduced', 'counts')
549±0.8μs 545±0.7μs 0.99 preprocessing_counts.FastSuite.time_normalize_total('pbmc68k_reduced', 'counts-off-axis')
431M 431M 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_filter_cells('pbmc3k', 'counts')
431M 432M 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_filter_cells('pbmc3k', 'counts-off-axis')
299M 300M 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_filter_cells('pbmc68k_reduced', 'counts')
300M 300M 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_filter_cells('pbmc68k_reduced', 'counts-off-axis')
431M 431M 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_filter_genes('pbmc3k', 'counts')
431M 431M 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_filter_genes('pbmc3k', 'counts-off-axis')
299M 300M 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_filter_genes('pbmc68k_reduced', 'counts')
299M 300M 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_filter_genes('pbmc68k_reduced', 'counts-off-axis')
1.11G 1.11G 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_scrublet('pbmc3k', 'counts')
1.11G 1.11G 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_scrublet('pbmc3k', 'counts-off-axis')
524M 526M 1.00 preprocessing_counts.PreprocessingCountsSuite.peakmem_scrublet('pbmc68k_reduced', 'counts')
526M 523M 0.99 preprocessing_counts.PreprocessingCountsSuite.peakmem_scrublet('pbmc68k_reduced', 'counts-off-axis')
55.8±0.6ms 56.3±0.8ms 1.01 preprocessing_counts.PreprocessingCountsSuite.time_filter_cells('pbmc3k', 'counts')
59.1±0.8ms 58.6±0.6ms 0.99 preprocessing_counts.PreprocessingCountsSuite.time_filter_cells('pbmc3k', 'counts-off-axis')
9.69±0.8ms 9.71±0.6ms 1.00 preprocessing_counts.PreprocessingCountsSuite.time_filter_cells('pbmc68k_reduced', 'counts')
9.98±0.8ms 10.1±0.8ms 1.02 preprocessing_counts.PreprocessingCountsSuite.time_filter_cells('pbmc68k_reduced', 'counts-off-axis')
51.8±0.4ms 52.5±0.5ms 1.01 preprocessing_counts.PreprocessingCountsSuite.time_filter_genes('pbmc3k', 'counts')
50.7±0.4ms 50.5±0.3ms 1.00 preprocessing_counts.PreprocessingCountsSuite.time_filter_genes('pbmc3k', 'counts-off-axis')
10.4±0.9ms 10.8±0.8ms 1.04 preprocessing_counts.PreprocessingCountsSuite.time_filter_genes('pbmc68k_reduced', 'counts')
10.5±0.9ms 10.4±1ms 1.00 preprocessing_counts.PreprocessingCountsSuite.time_filter_genes('pbmc68k_reduced', 'counts-off-axis')
2.45±0.1s 2.77±0.1s ~1.13 preprocessing_counts.PreprocessingCountsSuite.time_scrublet('pbmc3k', 'counts')
2.52±0.2s 2.32±0.05s 0.92 preprocessing_counts.PreprocessingCountsSuite.time_scrublet('pbmc3k', 'counts-off-axis')
561±10ms 563±4ms 1.00 preprocessing_counts.PreprocessingCountsSuite.time_scrublet('pbmc68k_reduced', 'counts')
547±20ms 561±10ms 1.03 preprocessing_counts.PreprocessingCountsSuite.time_scrublet('pbmc68k_reduced', 'counts-off-axis')
441M 442M 1.00 preprocessing_log.PreprocessingSuite.peakmem_highly_variable_genes('pbmc3k', 'off-axis')
491M 491M 1.00 preprocessing_log.PreprocessingSuite.peakmem_highly_variable_genes('pbmc3k', None)
296M 296M 1.00 preprocessing_log.PreprocessingSuite.peakmem_highly_variable_genes('pbmc68k_reduced', 'off-axis')
298M 298M 1.00 preprocessing_log.PreprocessingSuite.peakmem_highly_variable_genes('pbmc68k_reduced', None)
569M 569M 1.00 preprocessing_log.PreprocessingSuite.peakmem_pca('pbmc3k', 'off-axis')
588M 588M 1.00 preprocessing_log.PreprocessingSuite.peakmem_pca('pbmc3k', None)
492M 492M 1.00 preprocessing_log.PreprocessingSuite.peakmem_pca('pbmc68k_reduced', 'off-axis')
499M 494M 0.99 preprocessing_log.PreprocessingSuite.peakmem_pca('pbmc68k_reduced', None)
n/a n/a n/a preprocessing_log.PreprocessingSuite.peakmem_regress_out('pbmc3k', 'off-axis')
n/a n/a n/a preprocessing_log.PreprocessingSuite.peakmem_regress_out('pbmc3k', None)
349M 349M 1.00 preprocessing_log.PreprocessingSuite.peakmem_regress_out('pbmc68k_reduced', 'off-axis')
353M 353M 1.00 preprocessing_log.PreprocessingSuite.peakmem_regress_out('pbmc68k_reduced', None)
1.3G 1.3G 1.00 preprocessing_log.PreprocessingSuite.peakmem_scale('pbmc3k', 'off-axis')
1.5G 1.5G 1.00 preprocessing_log.PreprocessingSuite.peakmem_scale('pbmc3k', None)
337M 337M 1.00 preprocessing_log.PreprocessingSuite.peakmem_scale('pbmc68k_reduced', 'off-axis')
344M 339M 0.99 preprocessing_log.PreprocessingSuite.peakmem_scale('pbmc68k_reduced', None)
36.5±0.9ms 35.8±1ms 0.98 preprocessing_log.PreprocessingSuite.time_highly_variable_genes('pbmc3k', 'off-axis')
40.4±4ms 49.5±6ms ~1.22 preprocessing_log.PreprocessingSuite.time_highly_variable_genes('pbmc3k', None)
16.8±0.1ms 16.8±0.3ms 1.00 preprocessing_log.PreprocessingSuite.time_highly_variable_genes('pbmc68k_reduced', 'off-axis')
16.8±0.2ms 16.6±0.2ms 0.99 preprocessing_log.PreprocessingSuite.time_highly_variable_genes('pbmc68k_reduced', None)
1.96±0.01s 1.92±0.02s 0.98 preprocessing_log.PreprocessingSuite.time_pca('pbmc3k', 'off-axis')
2.15±0.01s 2.16±0.01s 1.00 preprocessing_log.PreprocessingSuite.time_pca('pbmc3k', None)
115±50ms 158±20ms ~1.38 preprocessing_log.PreprocessingSuite.time_pca('pbmc68k_reduced', 'off-axis')
164±20ms 70.1±60ms ~0.43 preprocessing_log.PreprocessingSuite.time_pca('pbmc68k_reduced', None)
n/a n/a n/a preprocessing_log.PreprocessingSuite.time_regress_out('pbmc3k', 'off-axis')
n/a n/a n/a preprocessing_log.PreprocessingSuite.time_regress_out('pbmc3k', None)
17.3±0.8ms 17.7±0.8ms 1.02 preprocessing_log.PreprocessingSuite.time_regress_out('pbmc68k_reduced', 'off-axis')
16.8±0.3ms 17.3±0.7ms 1.03 preprocessing_log.PreprocessingSuite.time_regress_out('pbmc68k_reduced', None)
504±3ms 505±3ms 1.00 preprocessing_log.PreprocessingSuite.time_scale('pbmc3k', 'off-axis')
551±2ms 546±1ms 0.99 preprocessing_log.PreprocessingSuite.time_scale('pbmc3k', None)
4.89±0.1ms 4.81±0.2ms 0.98 preprocessing_log.PreprocessingSuite.time_scale('pbmc68k_reduced', 'off-axis')
4.73±0.1ms 4.72±0.07ms 1.00 preprocessing_log.PreprocessingSuite.time_scale('pbmc68k_reduced', None)
287M 288M 1.00 tools.ToolsSuite.peakmem_diffmap
295M 294M 1.00 tools.ToolsSuite.peakmem_leiden
376M 373M 0.99 tools.ToolsSuite.peakmem_rank_genes_groups
452M 453M 1.00 tools.ToolsSuite.peakmem_umap
16.9±0.4ms 15.9±0.4ms 0.94 tools.ToolsSuite.time_diffmap
18.1±0.03ms 18.2±0.04ms 1.01 tools.ToolsSuite.time_leiden
58.8±6ms 60.2±6ms 1.02 tools.ToolsSuite.time_rank_genes_groups
1.02±0s 1.02±0s 1.00 tools.ToolsSuite.time_umap
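The Ratio column can be checked directly from the Before and After columns. The sketch below assumes the table follows the usual asv comparison layout: Ratio is After divided by Before, an optional leading `-` or `+` marks a change flagged as significant, and a `~` in front of the ratio marks a change that was not flagged. The helper names (`parse_value`, `parse_row`, `recompute_ratio`) are hypothetical and not part of asv or scanpy.

```python
import re

# Multipliers for the unit suffixes that appear in the table.
# Assumption: G/M/k are byte counts and s/ms/μs/ns are wall-clock times.
_UNITS = {
    "G": 1e9, "M": 1e6, "k": 1e3,
    "s": 1.0, "ms": 1e-3, "\u03bcs": 1e-6, "ns": 1e-9,
}
# Longest suffixes first so "ms" is tried before "s".
_UNIT_PATTERN = "|".join(sorted(_UNITS, key=len, reverse=True))
_VALUE = re.compile(r"^([0-9.]+)(?:±[0-9.]+)?(" + _UNIT_PATTERN + r")$")


def parse_value(text):
    """Convert e.g. '90.2±0.5ms' or '4.05G' to a float in base units."""
    text = text.replace("\u00b5", "\u03bc")  # micro sign -> Greek mu
    match = _VALUE.match(text)
    if match is None:
        return None  # e.g. 'n/a'
    number, unit = match.groups()
    return float(number) * _UNITS[unit]


def parse_row(line):
    """Split one table row into (marker, before, after, ratio, name)."""
    fields = line.split()
    # The change marker is optional; most rows start with the Before value.
    marker = fields.pop(0) if fields[0] in {"-", "+"} else ""
    before, after, ratio = fields[0], fields[1], fields[2]
    name = " ".join(fields[3:])  # parameter lists contain spaces
    return marker, parse_value(before), parse_value(after), ratio, name


def recompute_ratio(line):
    """Return After / Before for a row, or None when a value is missing."""
    _, before, after, _, _ = parse_row(line)
    if before is None or after is None:
        return None
    return after / before


row = "- 550±2ms 90.2±0.5ms 0.16 preprocessing_counts.Agg.time_agg('mean')"
print(f"{recompute_ratio(row):.2f}")
```

For the `time_agg('mean')` row this gives about 0.16, matching the reported ratio, which corresponds to roughly a 6x reduction in run time. Rows reported as `n/a` yield `None` rather than a number.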