fix: cache scipy flush_only class to fix pickle identity (GH#11323) #11410
Open
C1-BA-B1-F3 wants to merge 2 commits into
…(GH#11323)
The `flush_only_netcdf_file` class was defined inside `_open_scipy_netcdf()`,
so each call created a new class object. After opening two scipy-backed
datasets from file-like objects, the first dataset's class reference became
unreachable by qualname, causing pickle's class-identity check to fail with:
PicklingError: Can't pickle
<class 'xarray.backends.scipy_._PickleWorkaround.flush_only_netcdf_file'>:
it's not the same object as
xarray.backends.scipy_._PickleWorkaround.flush_only_netcdf_file
Fix: create the class once in `_get_flush_only_class()`, set its
`__qualname__` to a module-level name, and register it as a module
attribute so pickle can always resolve it.
Regression test included.
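The failure mode described above can be reproduced without xarray or scipy. The following is a minimal sketch, not the xarray code: `demo_backend`, `open_file`, and `flush_only_file` are hypothetical stand-ins, and a synthetic module is registered in `sys.modules` only so that pickle has something to import.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the backend module, registered so pickle can import it.
backend = types.ModuleType("demo_backend")
sys.modules[backend.__name__] = backend


class _PickleWorkaround:
    """Holder whose attribute is meant to make the local class resolvable."""


backend._PickleWorkaround = _PickleWorkaround


def open_file():
    # Defined inside the function, so every call builds a new class object.
    class flush_only_file:
        pass

    flush_only_file.__module__ = backend.__name__
    flush_only_file.__qualname__ = "_PickleWorkaround.flush_only_file"
    # Overwrites whatever class an earlier call registered under this name.
    _PickleWorkaround.flush_only_file = flush_only_file
    return flush_only_file()


first = open_file()
second = open_file()

try:
    pickle.dumps(first)
    first_failed = False
except pickle.PicklingError:
    # The qualname now resolves to the class created by the second call.
    first_failed = True

# The most recently created class is still the registered one, so this works.
restored_second = pickle.loads(pickle.dumps(second))
print(first_failed, type(restored_second) is type(second))
```

In this sketch only the most recently opened object stays picklable, which matches the symptom reported for the first dataset.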
…gnment
`sys.modules[__name__]` is typed as `ModuleType`, which doesn't expose module-level annotations. The dynamic assignment is intentional for pickle resolution by qualname.
Problem

Opening two scipy-backed datasets from file-like objects (e.g. `BytesIO`) and then pickling the first one fails with a `PicklingError`: the first dataset's `flush_only_netcdf_file` class is "not the same object as" `xarray.backends.scipy_._PickleWorkaround.flush_only_netcdf_file`.

Root Cause

`flush_only_netcdf_file` was defined as a local class inside `_open_scipy_netcdf()`, so each call created a new class object. The `_PickleWorkaround` class tried to make it picklable by registering it as a class attribute with a rewritten `__qualname__`, but a second open call overwrote that attribute with a different class object. Pickle's qualname lookup then returned a different object than the class of the first dataset's instance, causing the identity-mismatch error.

Fix

- Added a `_get_flush_only_class()` function that caches the class (created once, reused on subsequent calls)
- Set the class's `__qualname__` to a module-level name (`"flush_only_netcdf_file"`)
- Registered the class as the module attribute `xarray.backends.scipy_.flush_only_netcdf_file`

Regression Test

Added `test_pickle_after_multiple_opens_from_bytes` to `TestScipyInMemoryData`: it opens two datasets from `BytesIO`, then pickles the first one.

Closes #11323
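The three steps listed under Fix can be illustrated with a standalone sketch. It is not the code from this PR: the module name `demo_backend_fixed`, the helper `get_flush_only_class`, and the use of `functools.lru_cache` as the caching mechanism are all assumptions made for illustration.

```python
import functools
import pickle
import sys
import types

# Hypothetical stand-in for the backend module, registered so pickle can import it.
backend = types.ModuleType("demo_backend_fixed")
sys.modules[backend.__name__] = backend


@functools.lru_cache(maxsize=None)
def get_flush_only_class():
    """Build the class once; later calls return the same cached object."""

    class flush_only_file:
        pass

    # Give the class a module-level name and register it under that name,
    # so pickle's lookup by qualname always finds this exact object.
    flush_only_file.__module__ = backend.__name__
    flush_only_file.__qualname__ = "flush_only_file"
    setattr(backend, "flush_only_file", flush_only_file)
    return flush_only_file


def open_file():
    # Every "open" now instantiates the one shared class.
    return get_flush_only_class()()


first = open_file()
second = open_file()

# Pickling the first object still works after a second open.
restored_first = pickle.loads(pickle.dumps(first))
print(type(first) is type(second), type(restored_first) is type(first))
```

Because the class is created exactly once, the module attribute is never replaced by a different object, so the identity check that pickle performs on the class holds for every instance.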