[GH-2407] Auto-detect raster columns in SedonaUtils.display_image#2633
Merged
Pull request overview
Adds auto-detection of raster columns in SedonaUtils.display_image() to avoid hangs when rendering raw raster DataFrames in notebooks, and updates docs to recommend the new workflow.
Changes:
- Auto-detects raster UDT columns (`typeName() == "rastertype"`) and applies `RS_AsImage()` before rendering.
- Preserves non-raster columns while converting raster columns to HTML image output.
- Updates raster documentation to show direct Jupyter rendering via `SedonaUtils.display_image()`.
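The detection and projection steps listed above can be sketched roughly as follows. This is not the code from the PR: `FakeType`, `FakeField`, `raster_columns`, and `select_exprs` are hypothetical names, and the two small classes stand in for Spark's schema field objects so the example runs without a Spark session. The only details taken from the PR are the `typeName() == "rastertype"` check and the use of `RS_AsImage()`.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical stand-ins for the field objects found in `df.schema.fields`;
# they exist only so the sketch runs without Spark.
@dataclass
class FakeType:
    name: str

    def typeName(self) -> str:
        return self.name


@dataclass
class FakeField:
    name: str
    dataType: FakeType


def raster_columns(fields) -> List[str]:
    """Return the names of columns whose data type reports "rastertype"."""
    return [f.name for f in fields if f.dataType.typeName() == "rastertype"]


def select_exprs(fields) -> List[str]:
    """Build SQL expressions that wrap raster columns in RS_AsImage()
    and pass every other column through unchanged."""
    rasters = set(raster_columns(fields))
    return [
        f"RS_AsImage(`{f.name}`) AS `{f.name}`" if f.name in rasters else f"`{f.name}`"
        for f in fields
    ]


fields = [
    FakeField("id", FakeType("integer")),
    FakeField("rast", FakeType("rastertype")),
]
print(select_exprs(fields))
```

With a real DataFrame, the resulting expressions could presumably be passed to `df.selectExpr(*exprs)` before rendering, which keeps column order and names intact.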
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| python/sedona/spark/raster_utils/SedonaUtils.py | Adds raster-column detection and auto-RS_AsImage() projection inside display_image() |
| docs/tutorial/raster.md | Adds tips recommending SedonaUtils.display_image() for quick Jupyter visualization |
| docs/api/sql/Raster-visualizer.md | Expands Jupyter visualization section and documents direct raster display workflow |
| docs/api/sql/Raster-operators.md | Adds tip pointing to SedonaUtils.display_image() for quick visualization |
| docs/api/sql/Raster-map-algebra.md | Adds tip for inspecting map algebra results via display_image() |
| docs/api/sql/Raster-loader.md | Adds tip for visualizing loaded rasters via display_image() |
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
`SedonaUtils.display_image()` was slow for raster DataFrames because it routed through `SedonaMapUtils.__convert_to_gdf_or_pdf__()`, which performs Arrow conversion, geopandas import attempts, and DataFrame-to-HTML-table wrapping, all unnecessary when the input is already HTML `<img>` strings from `RS_AsImage()`.
- Add fast path: collect rows directly and render HTML `<img>` strings without intermediate Arrow/Pandas/`to_html()` conversion
- Keep fallback to original path for non-image DataFrames
- Add docstring to `display_image()`
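A minimal sketch of what such a fast path might look like, assuming the rows have already been collected into plain Python tuples. The function names `is_image_cell` and `render_rows` are hypothetical and not taken from the PR; returning `None` is just one way to signal that the caller should fall back to the original conversion path.

```python
from html import escape
from typing import Optional, Sequence


def is_image_cell(value) -> bool:
    """True when a cell already holds an HTML <img> string."""
    return isinstance(value, str) and value.lstrip().lower().startswith("<img")


def render_rows(columns: Sequence[str], rows: Sequence[Sequence]) -> Optional[str]:
    """Render collected rows as an HTML table, passing <img> cells through
    verbatim. Returns None when no cell is an image, so the caller can
    fall back to the slower generic path."""
    if not any(is_image_cell(cell) for row in rows for cell in row):
        return None

    header = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body = []
    for row in rows:
        cells = "".join(
            # Image cells are emitted as-is; everything else is escaped.
            f"<td>{cell if is_image_cell(cell) else escape(str(cell))}</td>"
            for cell in row
        )
        body.append(f"<tr>{cells}</tr>")
    return f"<table><tr>{header}</tr>{''.join(body)}</table>"


html_out = render_rows(
    ["id", "image"],
    [(1, '<img src="data:image/png;base64,AAAA"/>')],
)
print(html_out)
```

In a notebook, the returned string could be shown with `IPython.display.HTML`, skipping the Arrow/Pandas round trip entirely.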
Did you read the Contributor Guide?
Is this PR related to a ticket?
`[GH-XXX] my subject`. Closes #2407 ([Python] Rendering image in jupyter notebook using SedonaUtils, display_image takes long time).
What changes were proposed in this PR?
`SedonaUtils.display_image()` hangs when passed a raw raster DataFrame (e.g., 1400x800) because `__convert_to_gdf_or_pdf__` attempts to Arrow-serialize the full GridCoverage2D object.
This PR adds raster UDT detection to `display_image()`:
- Detects raster columns via `typeName() == "rastertype"`
- Applies `RS_AsImage()` to each raster column before rendering
- Falls through to the `__convert_to_gdf_or_pdf__` + `to_html()` path after conversion
Users can now pass a raw raster DataFrame directly to `display_image()`.
How was this patch tested?
- `RS_AsImage()` DataFrame (regression check)
- `test_sedonautils.py::test_display_image` continues to pass
Did this PR include necessary documentation updates?
- Updated `docs/api/sql/Raster-visualizer.md` to show both the new direct raster display (recommended) and the explicit `RS_AsImage` workflow.