perf: cap PyTorch threads, bump fix_illumination forks, skip intermediate pyramids by FIrgolitsch · Pull Request #113 · linum-uqam/linumpy

FIrgolitsch · 2026-04-30T02:49:15Z

Stacked PR 14/22 — review order: #115 → #97 → #98 → #99 → #100 → #101 → #108 → #106 → #107 → #87 → #116 → #110 → #111 → #40 → #112 → #113 → #117 → #118 → #120 → #121 → #122 → #123 → #124 → #125

Base: pr-m-gpu-kvikio. Retargets to main as upstream PRs merge.

PR — Pipeline performance tuning

Three measurement-driven perf improvements to the reconst_3d Nextflow pipeline:

- fix_illumination thread cap. linum_fix_illumination_3d.py now calls configure_all_libraries() to cap PyTorch threads, preventing CPU oversubscription when multiple forks run concurrently.
- fix_illumination maxForks 1 → 4. Measured: BaSiCPy/PyTorch uses ~374 MiB per fork; 4 forks ≈ 1.5 GB, well under per-GPU capacity.
- Skip pyramids on intermediate ome-zarr outputs. Passing --n_levels 0 on intermediate steps avoids generating multiscale pyramids, which are only needed on final outputs.
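The first two items can be illustrated with a minimal sketch. It is not the body of configure_all_libraries(), which this PR does not show: `thread_budget`, `cap_library_threads`, and `total_footprint_mib` are hypothetical names, and the assumption is that each fork should get an equal share of the CPU cores. Only `torch.set_num_threads` and the standard threading environment variables are real interfaces.

```python
import os


def thread_budget(total_cores: int, n_forks: int) -> int:
    """Threads each fork may use so that all forks together fit in total_cores."""
    if total_cores < 1 or n_forks < 1:
        raise ValueError("total_cores and n_forks must be >= 1")
    # Integer share per fork, never below one thread.
    return max(1, total_cores // n_forks)


def cap_library_threads(n_threads: int) -> None:
    """Hypothetical stand-in for a thread cap (assumption, not linumpy's code)."""
    # These variables are read when the numerical runtimes initialize,
    # so they should be set before the heavy libraries are imported.
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS",
                "OPENBLAS_NUM_THREADS", "NUMEXPR_NUM_THREADS"):
        os.environ[var] = str(n_threads)
    try:
        import torch  # optional: only capped if PyTorch is installed
    except ImportError:
        return
    # Caps intra-op CPU parallelism for PyTorch at runtime.
    torch.set_num_threads(n_threads)


def total_footprint_mib(per_fork_mib: int, n_forks: int) -> int:
    """GPU memory used when n_forks run at once, in MiB."""
    return per_fork_mib * n_forks


if __name__ == "__main__":
    forks = 4
    cap_library_threads(thread_budget(os.cpu_count() or 1, forks))
    # 374 MiB per fork is the figure measured in this PR.
    print(total_footprint_mib(374, forks), "MiB for", forks, "forks")
```

With the measured 374 MiB per fork, four forks come to 1496 MiB, which matches the "≈ 1.5 GB" estimate above.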

Co-authored-by: Copilot <copilot@github.com>

Squash of resample/GPU/workflow work on dev:
- Round-robin GPU assignment for resample_mosaic_grid (linumpy.gpu)
- Per-GPU active-slot counters via Helpers.gpuPinBlock for fair maxForks balance
- CUDA device pin in prefetch worker thread + workflow CUDA_VISIBLE_DEVICES
- Keeps simple per-tile CPU-read -> GPU-rescale -> CPU-write path
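The GPU assignment ideas in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in linumpy.gpu or Helpers.gpuPinBlock: `round_robin_gpu`, `GpuSlots`, and `pinned_env` are hypothetical names, and the slot counter here is not thread-safe. The only real interface used is the CUDA_VISIBLE_DEVICES environment variable.

```python
import os


def round_robin_gpu(task_index: int, n_gpus: int) -> int:
    """Map the i-th task to a GPU index by cycling through 0..n_gpus-1."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be >= 1")
    return task_index % n_gpus


class GpuSlots:
    """Hypothetical per-GPU active-slot counter (single-threaded sketch)."""

    def __init__(self, n_gpus: int) -> None:
        if n_gpus < 1:
            raise ValueError("n_gpus must be >= 1")
        self.active = [0] * n_gpus

    def acquire(self) -> int:
        """Pick the GPU with the fewest active tasks; ties go to the lowest index."""
        gpu = min(range(len(self.active)), key=self.active.__getitem__)
        self.active[gpu] += 1
        return gpu

    def release(self, gpu: int) -> None:
        """Free one slot on the given GPU."""
        if self.active[gpu] == 0:
            raise RuntimeError(f"GPU {gpu} has no active slot to release")
        self.active[gpu] -= 1


def pinned_env(gpu: int, base_env=None) -> dict:
    """Copy an environment and restrict the child process to one GPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu)
    return env


if __name__ == "__main__":
    print([round_robin_gpu(i, 2) for i in range(5)])
    slots = GpuSlots(2)
    print(slots.acquire(), slots.acquire(), slots.active)
```

Plain round-robin ignores how long each task runs, which is presumably why the commit adds active-slot counters: choosing the least-loaded device keeps the forks balanced when tasks finish out of order.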

Heavy downsampling of small axes (e.g. Z=5 by factor 0.1) used to round to zero, producing zero-sized output and ZeroDivisionError downstream. Clamp output_shape to a minimum of 1 in both rescale() and out_tile_shape.
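A minimal sketch of the clamp, assuming the output shape is computed by rounding each axis length times a single scale factor. `scaled_shape` is a hypothetical helper written for illustration, not the actual rescale() or out_tile_shape code.

```python
def scaled_shape(shape, factor, clamp=True):
    """Scale each axis length by factor, optionally clamping to at least 1."""
    scaled = tuple(round(size * factor) for size in shape)
    if clamp:
        # A zero-length axis yields an empty array and breaks later divisions.
        scaled = tuple(max(1, size) for size in scaled)
    return scaled


if __name__ == "__main__":
    shape = (5, 100, 200)  # small Z axis, as in the example above
    print(scaled_shape(shape, 0.1, clamp=False))  # Z collapses to 0
    print(scaled_shape(shape, 0.1))               # Z is clamped to 1
```

In Python, `round(0.5)` evaluates to 0, so the Z=5 axis at factor 0.1 collapses without the clamp.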

FIrgolitsch force-pushed the pr-m-gpu-kvikio branch from 7544bf5 to 95f24bc on April 30, 2026 03:21

FIrgolitsch force-pushed the pr-n-perf branch from 61cd50e to 85cfd15 on April 30, 2026 03:21

FIrgolitsch force-pushed the pr-m-gpu-kvikio branch from 95f24bc to 14bfd65 on April 30, 2026 03:26

FIrgolitsch force-pushed the pr-n-perf branch from 85cfd15 to ba38393 on April 30, 2026 03:26

FIrgolitsch force-pushed the pr-m-gpu-kvikio branch from 14bfd65 to ab73c15 on April 30, 2026 03:51

FIrgolitsch force-pushed the pr-n-perf branch from ba38393 to 5131c71 on April 30, 2026 03:51

FIrgolitsch mentioned this pull request May 1, 2026

fix(galvo): per-tile detect + threaded column strips + --skip_tiles for manual overrides #117

Open

FIrgolitsch force-pushed the pr-m-gpu-kvikio branch from ab73c15 to a6f274b on May 1, 2026 17:20

FIrgolitsch force-pushed the pr-n-perf branch from 366f176 to 0fbd70e on May 1, 2026 17:20

FIrgolitsch mentioned this pull request May 1, 2026

chore: bump python floor to 3.14, modernize CI/build, ruff FURB sweep #118

Open

FIrgolitsch and others added 4 commits May 20, 2026 12:19

perf(fix_illum): cap PyTorch threads via configure_all_libraries

92cdd6d

perf(pipeline): bump fix_illumination maxForks 1->4 based on measurement

af46952

perf(pipeline): skip pyramids on intermediate ome-zarr outputs

3fa4157

Docs update

e1a56f9

Co-authored-by: Copilot <copilot@github.com>

FIrgolitsch added 2 commits May 20, 2026 12:19

FIrgolitsch force-pushed the pr-m-gpu-kvikio branch from a6f274b to f6c23e5 on May 20, 2026 16:23

FIrgolitsch force-pushed the pr-n-perf branch from 0fbd70e to ab82884 on May 20, 2026 16:23

FIrgolitsch wants to merge 6 commits into pr-m-gpu-kvikio from pr-n-perf
