spatial_neigbours extensibility features and clarification#1147
spatial_neigbours extensibility features and clarification#1147selmanozleyen wants to merge 64 commits intoscverse:mainfrom
Conversation
for more information, see https://pre-commit.ci
…leyen/squidpy into feat/spatial_neighbours
for more information, see https://pre-commit.ci
…leyen/squidpy into feat/spatial_neighbours
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1147 +/- ##
==========================================
+ Coverage 73.56% 73.82% +0.26%
==========================================
Files 44 45 +1
Lines 6929 7098 +169
Branches 1174 1178 +4
==========================================
+ Hits 5097 5240 +143
- Misses 1347 1369 +22
- Partials 485 489 +4
🚀 New features to boost your workflow:
|
…leyen/squidpy into feat/spatial_neighbours
for more information, see https://pre-commit.ci
grst
left a comment
There was a problem hiding this comment.
Thanks @selmanozleyen, that goes in the right direction. I put up some design questions up for discussion!
|
|
||
| Notes | ||
| ----- | ||
| ``spatial_neighbors`` has 4 graph-construction modes: |
There was a problem hiding this comment.
Do we have a good explanation of these somewhere?
| "`spatial_neighbors_delaunay`, `spatial_neighbors_grid`, or " | ||
| "`spatial_neighbors_from_builder` instead.", | ||
| FutureWarning, | ||
| stacklevel=2, |
There was a problem hiding this comment.
I think your other future warning was stacklevel=3, how do you decide this?
There was a problem hiding this comment.
stacklevel 2: user code -> spatial_neighbors() -> warnings.warn(...)
vs
stacklevel 3: user code -> spatial_neighbors() -> _resolve_graph_builder() -> warnings.warn(...)
There was a problem hiding this comment.
Do we have this written down somewhere? I haven't really thought about this yet so I wonder if we should add that to some "how to Squidpy" doc somewhere?
| } | ||
| @d.dedent | ||
| def spatial_neighbors_from_builder( | ||
| adata: AnnData | SpatialData, |
|
|
||
|
|
||
| def _prepare_spatial_neighbors_input( | ||
| adata: AnnData | SpatialData, |
| def _prepare_spatial_neighbors_input( | ||
| adata: AnnData | SpatialData, | ||
| *, | ||
| spatial_key: str, |
There was a problem hiding this comment.
Align annotations across functions
| Return D^{-1/2} * A * D^{-1/2}, where D = diag(degrees(A)) and A = adj. | ||
| @d.dedent | ||
| def spatial_neighbors_grid( | ||
| adata: AnnData | SpatialData, |
|
|
||
| def __init__( | ||
| self, | ||
| radius: float | tuple[float, float] | None = None, |
There was a problem hiding this comment.
yes by _filter_by_radius_interval. If I did this from scratch maybe I'd make it more complicated by having this filtering as a separate transform but this feels more appropiate for old implementation
| @@ -0,0 +1,440 @@ | |||
| """Graph construction strategies for spatial neighbor graphs. | |||
There was a problem hiding this comment.
Quite some duplicates code, can you use a factory or prototcol instead?
There was a problem hiding this comment.
Not sure if I agree. What exactly do you think needs deduplication here?
There was a problem hiding this comment.
-
The two typevars here aren't modified anywhere. It could just be an ABC.
-
apply_percentileis the only function using thecoord_type. Since at the point of_resolve_graph_builderwe decide on the downstream path, we don't need that guard. With that guard removed, the coord_type becomes superflous and would further simplify the class. -
Both
apply_filtersare identical, could be deduplicated by moving it intobuildof the ABC. -
KNNBuilder.build_graphandRadiusBuilder.build_graphshare quite some overlap in building their nn-graphs. Just the delauny stuff would need to be routed differently.
but I guess the biggest win would be removing the non-CSR build option? Is this sth we'll realistically ever encounter?
There was a problem hiding this comment.
The factory/protocol comment doesn't make much sense in hind-sight after having gone through it in detail - still, quite some opportunity to remove LOCs
There was a problem hiding this comment.
but I guess the biggest win would be removing the non-CSR build option?
I'd like to also give possibility to support cupy sparse or whatever sparse types jax might have for example
There was a problem hiding this comment.
But I noticed I use scipy operations also in the builder path so let me just have a second look.
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
* [pre-commit.ci] pre-commit autoupdate updates: - [github.com/biomejs/pre-commit: v2.4.9 → v2.4.10](biomejs/pre-commit@v2.4.9...v2.4.10) - [github.com/tox-dev/pyproject-fmt: v2.20.0 → v2.21.0](tox-dev/pyproject-fmt@v2.20.0...v2.21.0) - [github.com/astral-sh/ruff-pre-commit: v0.15.8 → v0.15.9](astral-sh/ruff-pre-commit@v0.15.8...v0.15.9) * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
shashkat
left a comment
There was a problem hiding this comment.
It looks amazing to me now! Just a few more minor suggestions. I really liked the idea of using specialized Postprocessor classes!
| n_neighs | ||
| Number of neighboring tiles. Defaults to ``6``. | ||
| Number of neighboring tiles used to form the base grid connectivity. | ||
| Defaults to ``6``. On a Visium-like hexagonal grid, ``6`` corresponds to | ||
| the immediate surrounding spots, while smaller values such as ``3`` make | ||
| the first-ring graph deliberately sparser. |
There was a problem hiding this comment.
Maybe it would be useful to mention that there is no guarantee of what subset of neighbors would be chosen from a given ring, if n_neighs is less than the actual number of neighbors of a cell in the grid.
Just like this snippet from the latest-commit documentation of spatial_neighbors function:
- values such as ``n_neighs=4`` and ``n_neighs=6`` are the
intended square-grid and hex-grid choices, respectively;
- other values are accepted for backward compatibility, but
their geometric interpretation is not guaranteed to match a
continuous ring on the grid;
- no clockwise or other within-ring ordering is part of the
public API.
| def postprocessors(self) -> Sequence[GraphPostprocessor[GraphMatrixT]]: | ||
| """Return post-build processing steps for ``(adj, dst)``.""" | ||
| return self._postprocessors |
There was a problem hiding this comment.
I am not sure if this method is offering any functionality.
We can possibly call the attribute _postprocessors as postprocessors and replace line 77 from for postprocessor in self.postprocessors(): to for postprocessor in self.postprocessors:.
What I understand is that this method can allow one to reorder or filter out the entities in _postprocessors attribute. But that can anyways be done when the postprocessors entities are created in the init method of a subclass of GraphBuilder.
| def __init__( | ||
| self, | ||
| transform: str | Transform | None = None, | ||
| set_diag: bool = False, | ||
| percentile: float | None = None, | ||
| postprocessors: Sequence[GraphPostprocessor[GraphMatrixT]] = (), | ||
| ) -> None: | ||
| self.transform = Transform.NONE if transform is None else Transform(transform) | ||
| self.set_diag = set_diag | ||
| self.percentile = percentile | ||
| self._postprocessors: list[GraphPostprocessor[GraphMatrixT]] = list(postprocessors) |
There was a problem hiding this comment.
Is the transform argument used here? I think the transformation requirements would be encoded in postprocessors anyways, so we can remove it.
| if isinstance(radius, tuple): | ||
| postprocessors.append(DistanceIntervalPostprocessor(tuple(sorted(radius)))) |
There was a problem hiding this comment.
If someone passes a single float or int to radius, DelaunayBuilder doesn't do anything with it I believe. Only if radius is a tuple[float, float], does it add the filter. I think we can fix this by doing something like the following.
change this:
if isinstance(radius, tuple):
postprocessors.append(DistanceIntervalPostprocessor(tuple(sorted(radius))))to this:
if isinstance(radius, (int, float)):
radius = (0, radius)
postprocessors.append(DistanceIntervalPostprocessor(tuple(sorted(radius))))Then, we can also change the type suggestion for radius in spatial_neighbors_delaunay to float | tuple[float, float].
| def __init__( | ||
| self, | ||
| radius: float | tuple[float, float], | ||
| transform: str | Transform | None = None, |
There was a problem hiding this comment.
Adding these arguments here adds again some complexity/branching to the RadiusBuilder class, which makes it harder to reason about it and implement it on other backends.
An alternative would be to support these arguments only in the high-level function and instead support chaining of Builders and Postprocessors. Preprocessors would inherit the API from Builders and could be instantiated with a Builder or another already initialized preprocessor. E.g.
transform_interval_builder: Builder = TransformPreprocessor(
DistanceIntervalPostprocessor(RadiusBuilder(...))
)or
transform_interval_builder: Builder = RadiusBuilder(...).chain(
DistanceIntervalPostprocessor
).chain(
TransformPreprocessor
)There was a problem hiding this comment.
I don't have a super strong opinion on this, just as an additional idea.
fixes: #1102 and #1047
It's backward compatible and I am curious what the community might bring to this!