fix: cache scipy flush_only class to fix pickle identity (GH#11323) #11410

Open
C1-BA-B1-F3 wants to merge 2 commits into pydata:main from C1-BA-B1-F3:fix/scipy-pickle-11323

@C1-BA-B1-F3

Problem

Opening two scipy-backed datasets from file-like objects (e.g. BytesIO) and then pickling the first one fails with:

PicklingError: Can't pickle
<class 'xarray.backends.scipy_._PickleWorkaround.flush_only_netcdf_file'>:
it's not the same object as
xarray.backends.scipy_._PickleWorkaround.flush_only_netcdf_file

Root Cause

flush_only_netcdf_file was defined as a local class inside _open_scipy_netcdf(). Each call created a new class object. The _PickleWorkaround class tried to make it picklable by registering it as a class attribute with a rewritten __qualname__, but after a second open call, the attribute was overwritten with a different class object. Pickle's qualname lookup then returned a different object than the one stored in the first dataset's instance, causing the identity-mismatch error.
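
The failure mode can be reproduced without xarray or scipy. The following is a minimal sketch using only the standard library, not the actual xarray code: the module name `pickle_identity_demo`, the `Holder` namespace (a stand-in for `_PickleWorkaround`), and `open_file()` (a stand-in for `_open_scipy_netcdf()`) are all hypothetical names chosen for illustration.

```python
import pickle
import sys
import types

# Hypothetical throwaway module so pickle has an importable place to look
# classes up by qualified name.
demo = types.ModuleType("pickle_identity_demo")
sys.modules[demo.__name__] = demo

# Stand-in for the _PickleWorkaround namespace class.
demo.Holder = type("Holder", (), {})


def open_file():
    # A brand-new class object is created on every call.
    class flush_only:
        pass

    # Make pickle resolve the class as pickle_identity_demo.Holder.flush_only,
    # then publish it there. The next call overwrites this attribute.
    flush_only.__module__ = demo.__name__
    flush_only.__qualname__ = "Holder.flush_only"
    demo.Holder.flush_only = flush_only
    return flush_only()


first = open_file()
second = open_file()  # replaces Holder.flush_only with a different class

try:
    pickle.dumps(first)
    first_error = None
except pickle.PicklingError as exc:
    # The qualname lookup returns the second class, not type(first).
    first_error = exc

# The most recently created class still matches the lookup, so it pickles.
second_restored = pickle.loads(pickle.dumps(second))
```

Only the instance created by the latest call pickles successfully, which is why the bug appears only after a second open.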

Fix

  • Move the class creation to a dedicated _get_flush_only_class() function that caches the class (created once, reused on subsequent calls)
  • Set __qualname__ to a module-level name ("flush_only_netcdf_file")
  • Register the class as a module attribute so pickle can always resolve it via xarray.backends.scipy_.flush_only_netcdf_file
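
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: it uses `functools.lru_cache` as one possible caching mechanism (the PR's own mechanism is not shown here), and it registers the class on a hypothetical throwaway module named `pickle_cache_demo` rather than on `xarray.backends.scipy_`. The names `get_flush_only_class` and `flush_only` are likewise illustrative.

```python
import functools
import pickle
import sys
import types

# Hypothetical throwaway module standing in for xarray.backends.scipy_.
demo = types.ModuleType("pickle_cache_demo")
sys.modules[demo.__name__] = demo


@functools.lru_cache(maxsize=None)
def get_flush_only_class():
    # The body runs once; later calls return the cached class object.
    class flush_only:
        pass

    # Give the class a module-level qualname and register it under that
    # name so pickle's lookup always finds this exact object.
    flush_only.__module__ = demo.__name__
    flush_only.__qualname__ = "flush_only"
    setattr(demo, "flush_only", flush_only)
    return flush_only


a = get_flush_only_class()()
b = get_flush_only_class()()  # same class as type(a)

# Both instances share one class, so pickling the first still works.
restored = pickle.loads(pickle.dumps(a))
```

Because the class is created once, the object pickle finds by name is always the same object that every instance refers to, regardless of how many times the open path runs.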

Regression Test

Added test_pickle_after_multiple_opens_from_bytes to TestScipyInMemoryData — opens two datasets from BytesIO, then pickles the first one.

Closes #11323

…(GH#11323)

The `flush_only_netcdf_file` class was defined inside `_open_scipy_netcdf()`,
so each call created a new class object. After opening two scipy-backed
datasets from file-like objects, the first dataset's class reference became
unreachable by qualname, causing pickle's class-identity check to fail with:

    PicklingError: Can't pickle
    <class 'xarray.backends.scipy_._PickleWorkaround.flush_only_netcdf_file'>:
    it's not the same object as
    xarray.backends.scipy_._PickleWorkaround.flush_only_netcdf_file

Fix: create the class once in `_get_flush_only_class()`, set its
`__qualname__` to a module-level name, and register it as a module
attribute so pickle can always resolve it.

Regression test included.
…gnment

sys.modules[__name__] is typed as ModuleType which doesn't expose
module-level annotations. The dynamic assignment is intentional for
pickle resolution by qualname.
Development

Successfully merging this pull request may close these issues.

2026.4.0 breaks pickling with backends.scipy_
