Column-wise `dask` `map_blocks`

### What kind of feature would you like to request?

Additional function parameters / changed functionality / changed defaults?

### Please describe your wishes

Almost all of our `dask` functionality relies, implicitly or explicitly, on row-wise chunking, but many of the algorithms can likely be adapted to column-wise chunking.
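To make the idea concrete, here is a minimal sketch of a per-feature computation over a column-chunked `dask` array. The helper `mean_var_block` and the toy data are hypothetical and only for illustration; this is not how scanpy currently implements anything, just one way such a `map_blocks` call could look when every block holds all rows for a subset of features.

```python
import dask.array as da
import numpy as np


def mean_var_block(block: np.ndarray) -> np.ndarray:
    """Per-feature mean and variance for one column chunk (hypothetical helper).

    Each block holds all rows for a subset of columns, so the statistics
    are complete for those features and need no cross-block reduction.
    """
    # Result has shape (2, n_columns_in_block): row 0 is mean, row 1 is variance.
    return np.stack([block.mean(axis=0), block.var(axis=0)])


rng = np.random.default_rng(0)
x_np = rng.poisson(1.0, size=(1000, 60)).astype(np.float64)

# Column-wise chunking: one chunk along rows (-1), 20 features per chunk.
x = da.from_array(x_np, chunks=(-1, 20))

stats = da.map_blocks(
    mean_var_block,
    x,
    chunks=((2,), x.chunks[1]),  # output keeps the column chunking, 2 rows
    dtype=np.float64,
)
mean, var = stats.compute()
```

Because no block ever needs data from another block, the output is just the per-chunk results laid side by side along the feature axis.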

For example (not a comprehensive list, but it should highlight the general procedure):

- [x] https://github.com/scverse/scanpy/pull/3700 - **`sc.get.aggregate`** can use `csc` matrices and break up the computation across features (since the computations are independent over features), and then concatenate.
- [ ] **PCA** can likely be done as a multi-pass algorithm over CSC matrices - not efficient but doable.
- [ ] **HVG** in general operates as a feature-space algorithm and, aside from seurat v3, really only relies on a mean-var calculation, which is already CSC compatible (seurat v3 relies on it too, but additionally has a `loess` step). The main target here would be seurat v3/batched HVG selection, where row-wise chunking is actually _bad_ for the computation since it (likely) requires random subsets. In this case, proceeding in a chunked manner, i.e., chunk-of-genes by chunk-of-genes, probably would not be too bad.
- [ ] **`top_segment_proportions`** (i.e., from the `percent_top` argument in `calculate_qc_metrics`) could be done in feature-wise chunks as well, and then concatenated at the end
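
The general procedure shared by the items above (split over features, compute each chunk independently, concatenate) could look roughly like the following for a grouped sum over a CSC matrix. This is a sketch under assumptions: `aggregate_sum_by_feature_chunks` is a hypothetical name, the loop stands in for whatever `dask` would schedule, and it is not the implementation from the linked PR.

```python
import numpy as np
from scipy import sparse


def aggregate_sum_by_feature_chunks(x_csc, groups, n_groups, chunk_size):
    """Sum rows per group, processing `chunk_size` features at a time (hypothetical helper)."""
    n_obs = x_csc.shape[0]
    # One-hot group indicator of shape (n_groups, n_obs).
    indicator = sparse.csr_matrix(
        (np.ones(n_obs), (groups, np.arange(n_obs))), shape=(n_groups, n_obs)
    )
    parts = []
    for start in range(0, x_csc.shape[1], chunk_size):
        # Column slicing is cheap for CSC, and each feature chunk is independent.
        chunk = x_csc[:, start : start + chunk_size]
        parts.append((indicator @ chunk).toarray())
    # Features do not interact, so the final result is a plain concatenation.
    return np.concatenate(parts, axis=1)


rng = np.random.default_rng(0)
dense_counts = rng.poisson(0.3, size=(200, 50)).astype(np.float64)
x_csc = sparse.csc_matrix(dense_counts)
groups = rng.integers(0, 3, size=200)

result = aggregate_sum_by_feature_chunks(x_csc, groups, n_groups=3, chunk_size=16)
```

The same shape of computation applies to a per-feature mean-var pass; only the function applied to each chunk changes.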

Column-wise `dask` `map_blocks` #3723
