Refresh geotiff mmap cache when the file at a path is replaced#1506
Open
brendancol wants to merge 1 commit intomainfrom
Open
Refresh geotiff mmap cache when the file at a path is replaced#1506brendancol wants to merge 1 commit intomainfrom
brendancol wants to merge 1 commit intomainfrom
Conversation
The shared `_MmapCache` keyed entries by realpath and never re-stat'd after the first open. `to_geotiff` writes via `tempfile.mkstemp` + `os.replace`, which keeps the path stable but swaps the inode. A process that read a TIFF, overwrote it via `to_geotiff`, then read again would get bytes from the unlinked old file -- the cache served the prior mmap because the realpath still matched. Same hazard hits in-place truncate-and-rewrite from any external tool. Cache entries now carry `(st_ino, st_size, st_mtime_ns)`; `acquire` stat's the path and drops the cached entry when the ident has changed. Idle entries are closed; in-use mmaps stay valid for their current holders (they still own a reference) but are unlinked from the cache so subsequent `acquire` calls re-mmap the new file. Tests cover atomic-rename replacement and in-place truncate overwrite.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
_MmapCachekeys entries by realpath and never re-stat'd after the first open. After an atomic-rename overwrite (which is whatto_geotiffdoes viatempfile.mkstemp+os.replace) a re-read served bytes from the unlinked old file because the realpath still matched the cache key. Same hazard hits in-place truncate-and-rewrite from external tools.(st_ino, st_size, st_mtime_ns);acquirestats the path and drops the cached entry when the ident has changed. Idle entries get closed; in-use mmaps stay valid for their current holders (they still own a reference) but are unlinked from the cache so subsequentacquirecalls re-mmap the new file.Test plan
pytest xrspatial/geotiff/tests/test_polish_1488.py::TestP3MmapLRUpasses (3 prior + 2 new).pytest xrspatial/geotiff/tests/passes (the 3 pre-existingTestPalettefailures are unrelated to the cache and fail on origin/main too).read_geotiff_gpuround-trip.