Fill sparse COG tiles with nodata instead of crashing #1501
Open
brendancol wants to merge 1 commit into main
GDAL's SPARSE_OK=TRUE writes blocks consisting entirely of nodata with TileByteCounts (or StripByteCounts) == 0 and a matching offset of 0. The local mmap reader tried to decompress the empty range and raised "incomplete or truncated stream"; the COG HTTP reader skipped the fetch but left the result allocated as np.empty, returning whatever happened to be in memory; the GPU reader crashed somewhere in the decode pipeline.

Detect sparse entries up front, allocate the result with np.full(fill) where fill is the file's GDAL_NODATA value (or 0 when unset), and skip the decode for those entries. The GPU dispatcher routes sparse files through the CPU reader and copies the result to device memory, since the on-GPU pipeline does not handle empty tile ranges.

Tests cover tiled and stripped layouts with and without a nodata tag, the raw read_to_array path (sparse becomes the sentinel value), the public open_geotiff path (sparse becomes NaN via the existing nodata promotion), and the GPU path.
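The detect-and-fill step described above can be illustrated with a small standalone sketch. This is not the code from this PR: `read_tiles` and its parameters are hypothetical names, the image is assumed to be single-band with deflate-compressed tiles, and the tile offsets and byte counts are assumed to have been parsed from the TIFF tags already.

```python
import zlib

import numpy as np


def read_tiles(buf, offsets, byte_counts, tile_shape, grid_shape, dtype, nodata=None):
    """Hypothetical helper: assemble a tiled image, filling sparse tiles with nodata."""
    th, tw = tile_shape
    rows, cols = grid_shape
    # Fall back to 0 when the file carries no nodata tag.
    fill = 0 if nodata is None else nodata
    # Pre-fill the whole result so skipped tiles already hold the fill value
    # (np.empty here would leave uninitialised memory in sparse regions).
    out = np.full((rows * th, cols * tw), fill, dtype=dtype)
    for idx, (off, n) in enumerate(zip(offsets, byte_counts)):
        if n == 0:
            # Sparse tile: nothing was written, so there is nothing to decode.
            continue
        r, c = divmod(idx, cols)
        raw = zlib.decompress(buf[off:off + n])
        tile = np.frombuffer(raw, dtype=dtype).reshape(th, tw)
        out[r * th:(r + 1) * th, c * tw:(c + 1) * tw] = tile
    return out


# One real tile followed by one sparse tile (byte count 0, offset 0).
tile = np.arange(16, dtype=np.float32).reshape(4, 4)
payload = zlib.compress(tile.tobytes())
buf = b"\x00" * 8 + payload  # the real tile starts at offset 8
result = read_tiles(
    buf,
    offsets=[8, 0],
    byte_counts=[len(payload), 0],
    tile_shape=(4, 4),
    grid_shape=(1, 2),
    dtype=np.float32,
    nodata=-9999.0,
)
```

In this sketch the left half of `result` holds the decoded tile and the right half holds the `-9999.0` sentinel. Because the output is pre-filled, a sparse entry needs no special write path at all; the loop only has to skip it.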
Summary
GDAL's `SPARSE_OK=TRUE` writes blocks consisting entirely of nodata with `TileByteCounts` (or `StripByteCounts`) `== 0` and a matching offset of 0. The reader handled this badly:

- The local mmap reader tried to decompress the empty range and raised `error -5 while decompressing data: incomplete or truncated stream`.
- The COG HTTP reader skipped the fetch but left the result allocated as `np.empty`, so sparse regions returned uninitialised memory.
- The GPU reader crashed in the decode pipeline.

This branch detects sparse blocks before decode, pre-fills the result with the file's `GDAL_NODATA` value (zero when no nodata tag is present, matching GDAL's convention), and skips decode for those entries. The GPU dispatcher routes sparse files through the CPU reader and copies the result onto the device, since the GPU pipeline does not handle empty ranges.

Repro on `main`

After the fix the call returns a `(128, 128)` float64 DataArray with the filled tile and NaN over the sparse region.

Test plan

- `pytest xrspatial/geotiff/tests/test_sparse_cog.py` (5 new tests covering tiled/stripped, with/without nodata, raw + accessor reads, and GPU)
- `pytest xrspatial/geotiff/tests/` full suite (651 passed, 7 deselected; 3 pre-existing matplotlib palette `RecursionError` failures unrelated to this PR, also fail on `origin/main`)
- `SPARSE_OK=TRUE` files for tiled and stripped layouts and CuPy GPU read
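
The tests distinguish the raw read (sparse regions hold the nodata sentinel) from the accessor read (sparse regions become NaN). A minimal sketch of that promotion step follows, assuming it amounts to a cast to float64 followed by a sentinel-to-NaN replacement. `promote_nodata` is a hypothetical name, not the library's actual function.

```python
import numpy as np


def promote_nodata(arr, nodata):
    """Hypothetical helper: return a float64 copy with the nodata sentinel as NaN."""
    out = arr.astype(np.float64)  # astype copies, so the raw array is untouched
    if nodata is not None:
        out[out == nodata] = np.nan
    return out


# Raw read: left half is real data, right half is the sentinel from a sparse tile.
raw = np.full((2, 4), -9999.0, dtype=np.float32)
raw[:, :2] = 1.5
promoted = promote_nodata(raw, -9999.0)
```

Here `promoted` is float64 with NaN over the sparse half, while `raw` still carries the sentinel. That matches the split between the raw and accessor reads described in the test plan.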