xarray.load_dataarray fails when loading a DataArray with coordinates via zarr-fsspec #10950

@csubich

Description

What happened?

When a zarr-backed DataArray that has coordinates is loaded via an fsspec URL, xarray appears to treat the load as a request for a Dataset rather than a DataArray. It then tries to read the coordinate as a distinct variable within the store, where it is not present (the traceback below fails in `_maybe_create_default_indexes` while reading the coordinate chunk `foo/c/0`).

This issue does not occur when loading a DirectoryStore-backed zarr; the coordinate dimension is loaded successfully.

What did you expect to happen?

Behaviour should be identical between DirectoryStore-backed DataArrays and fsspec-backed DataArrays, and both should support arrays with coordinates.

Minimal Complete Verifiable Example

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "xarray[complete]@git+https://github.com/pydata/xarray.git@main",
# ]
# ///
#
# This script automatically imports the development branch of xarray to check for issues.
# Please delete this header if you have _not_ tested this script with `uv run`!

import os

import numpy as np
import xarray as xr

xr.show_versions()

os.system('rm -rf good.zipstore.zip')
os.system('rm -rf bad.zipstore.zip')
os.system('rm -rf good.dirstore.zarr')
os.system('rm -rf bad.dirstore.zarr')

## Working example, no coordinates
foo = xr.DataArray(data=np.zeros(1),
             dims='foo')
             # coords={'foo' : [0]})

foo.to_zarr('good.dirstore.zarr',consolidated=False,zarr_format=3).close()
# Make zipstore
os.system('cd good.dirstore.zarr; zip -0rv ../good.zipstore.zip ./')

foo_dirstore = xr.load_dataarray('good.dirstore.zarr',engine='zarr',zarr_format=3,consolidated=False)
print(foo_dirstore)
# <xarray.DataArray (foo: 1)> Size: 8B
# array([0.])
# Dimensions without coordinates: foo

foo_zipstore = xr.load_dataarray('zip:///::good.zipstore.zip',engine='zarr',zarr_format=3,consolidated=False)
print(foo_zipstore)
# <xarray.DataArray (foo: 1)> Size: 8B
# array([0.])
# Dimensions without coordinates: foo

## Broken example, with coordinates
foo = xr.DataArray(data=np.zeros(1),
             dims='foo',
             coords={'foo' : [0]})

foo.to_zarr('bad.dirstore.zarr',consolidated=False,zarr_format=3).close()

# Make zipstore
os.system('cd bad.dirstore.zarr; zip -0rv ../bad.zipstore.zip ./')

foo_dirstore = xr.load_dataarray('bad.dirstore.zarr',engine='zarr',zarr_format=3,consolidated=False)
print(foo_dirstore)
# <xarray.DataArray (foo: 1)> Size: 8B
# array([0.])
# Coordinates:
#   * foo      (foo) int64 8B 0

foo_zipstore = xr.load_dataarray('zip:///::bad.zipstore.zip',engine='zarr',zarr_format=3,consolidated=False)
print(foo_zipstore)
# KeyError                                  Traceback (most recent call last)
# /tmp/ipython-input-2393741926.py in <cell line: 0>()
# ----> 1 foo_zipstore = xr.load_dataarray('zip:///::bad.zipstore.zip',engine='zarr',zarr_format=3,consolidated=False)
#       2 print(foo_zipstore)

# ...
# /usr/lib/python3.12/zipfile/__init__.py in getinfo(self, name)
#    1547         info = self.NameToInfo.get(name)
#    1548         if info is None:
# -> 1549             raise KeyError(
#    1550                 'There is no item named %r in the archive' % name)
#    1551 

# KeyError: "There is no item named 'foo/c/0' in the archive"

Steps to reproduce

See the MCVE: write a DataArray containing coordinates to a DirectoryStore, zip the store directory into a ZipStore, and open the archive via an fsspec zip:// URL. The MCVE was tested on Google Colab.
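
The zipping step can also be done without shelling out to `zip`. The following is a minimal sketch, assuming the archive should store every file uncompressed under its path relative to the store root (roughly what `zip -0r` does when run from inside the directory). `zip_store_directory` is a hypothetical helper name, not part of xarray or zarr, and whether using it changes the failure reported here is untested.

```python
import zipfile
from pathlib import Path


def zip_store_directory(store_dir, zip_path):
    """Pack a directory into an uncompressed zip and return the member names.

    Hypothetical helper: member names are POSIX-style paths relative to
    ``store_dir``, with no leading "./".
    """
    store_dir = Path(store_dir)
    names = []
    # ZIP_STORED writes members without compression, like `zip -0`.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for path in sorted(store_dir.rglob("*")):
            if path.is_file():
                arcname = path.relative_to(store_dir).as_posix()
                zf.write(path, arcname=arcname)
                names.append(arcname)
    return names
```

In the MCVE this would stand in for the `os.system('cd bad.dirstore.zarr; zip -0rv ../bad.zipstore.zip ./')` line, for example `zip_store_directory('bad.dirstore.zarr', 'bad.zipstore.zip')`. Writing the archive outside the store directory avoids zipping the archive into itself.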

MCVE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

<xarray.DataArray (foo: 1)> Size: 8B
array([0.])
Dimensions without coordinates: foo
<xarray.DataArray (foo: 1)> Size: 8B
array([0.])
Dimensions without coordinates: foo
<xarray.DataArray (foo: 1)> Size: 8B
array([0.])
Coordinates:
  * foo      (foo) int64 8B 0
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/tmp/ipython-input-3205390801.py in <cell line: 0>()
     45 #   * foo      (foo) int64 8B 0
     46 
---> 47 foo_zipstore = xr.load_dataarray('zip:///::bad.zipstore.zip',engine='zarr',zarr_format=3,consolidated=False)
     48 print(foo_zipstore)
     49 # KeyError                                  Traceback (most recent call last)

35 frames
/usr/local/lib/python3.12/dist-packages/xarray/backends/api.py in load_dataarray(filename_or_obj, **kwargs)
    189         raise TypeError("cache has no effect in this context")
    190 
--> 191     with open_dataarray(filename_or_obj, **kwargs) as da:
    192         return da.load()
    193 

/usr/local/lib/python3.12/dist-packages/xarray/backends/api.py in open_dataarray(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, create_default_indexes, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
    811     """
    812 
--> 813     dataset = open_dataset(
    814         filename_or_obj,
    815         decode_cf=decode_cf,

/usr/local/lib/python3.12/dist-packages/xarray/backends/api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, create_default_indexes, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
    610         **kwargs,
    611     )
--> 612     ds = _dataset_from_backend_dataset(
    613         backend_ds,
    614         filename_or_obj,

/usr/local/lib/python3.12/dist-packages/xarray/backends/api.py in _dataset_from_backend_dataset(backend_ds, filename_or_obj, engine, chunks, cache, overwrite_encoded_chunks, inline_array, chunked_array_type, from_array_kwargs, create_default_indexes, **extra_tokens)
    300 
    301     if create_default_indexes:
--> 302         ds = _maybe_create_default_indexes(backend_ds)
    303     else:
    304         ds = backend_ds

/usr/local/lib/python3.12/dist-packages/xarray/backends/api.py in _maybe_create_default_indexes(ds)
    276         if coord.dims == (name,) and name not in ds.xindexes
    277     }
--> 278     return ds.assign_coords(Coordinates(to_index))
    279 
    280 

/usr/local/lib/python3.12/dist-packages/xarray/core/coordinates.py in __init__(self, coords, indexes)
    313                 var = as_variable(data, name=name, auto_convert=False)
    314                 if var.dims == (name,) and indexes is None:
--> 315                     index, index_vars = create_default_index_implicit(var, list(coords))
    316                     default_indexes.update(dict.fromkeys(index_vars, index))
    317                     variables.update(index_vars)

/usr/local/lib/python3.12/dist-packages/xarray/core/indexes.py in create_default_index_implicit(dim_variable, all_variables)
   1636     else:
   1637         dim_var = {name: dim_variable}
-> 1638         index = PandasIndex.from_variables(dim_var, options={})
   1639         index_vars = index.create_variables(dim_var)
   1640 

/usr/local/lib/python3.12/dist-packages/xarray/core/indexes.py in from_variables(cls, variables, options)
    718         # preserve wrapped pd.Index (if any)
    719         # accessing `.data` can load data from disk, so we only access if needed
--> 720         data = var._data if isinstance(var._data, PandasIndexingAdapter) else var.data  # type: ignore[redundant-expr]
    721         # multi-index level variable: get level index
    722         if isinstance(var._data, PandasMultiIndexingAdapter):

/usr/local/lib/python3.12/dist-packages/xarray/core/variable.py in data(self)
    454             duck_array = self._data.array
    455         elif isinstance(self._data, indexing.ExplicitlyIndexed):
--> 456             duck_array = self._data.get_duck_array()
    457         elif is_duck_array(self._data):
    458             duck_array = self._data

/usr/local/lib/python3.12/dist-packages/xarray/core/indexing.py in get_duck_array(self)
    941 
    942     def get_duck_array(self):
--> 943         duck_array = self.array.get_duck_array()
    944         # ensure the array object is cached in-memory
    945         self.array = as_indexable(duck_array)

/usr/local/lib/python3.12/dist-packages/xarray/core/indexing.py in get_duck_array(self)
    895 
    896     def get_duck_array(self):
--> 897         return self.array.get_duck_array()
    898 
    899     async def async_get_duck_array(self):

/usr/local/lib/python3.12/dist-packages/xarray/core/indexing.py in get_duck_array(self)
    735 
    736         if isinstance(self.array, BackendArray):
--> 737             array = self.array[self.key]
    738         else:
    739             array = apply_indexer(self.array, self.key)

/usr/local/lib/python3.12/dist-packages/xarray/backends/zarr.py in __getitem__(self, key)
    260         elif isinstance(key, indexing.OuterIndexer):
    261             method = self._oindex
--> 262         return indexing.explicit_indexing_adapter(
    263             key, array.shape, indexing.IndexingSupport.VECTORIZED, method
    264         )

/usr/local/lib/python3.12/dist-packages/xarray/core/indexing.py in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method)
   1127     """
   1128     raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
-> 1129     result = raw_indexing_method(raw_key.tuple)
   1130     if numpy_indices.tuple:
   1131         # index the loaded duck array

/usr/local/lib/python3.12/dist-packages/xarray/backends/zarr.py in _getitem(self, key)
    223 
    224     def _getitem(self, key):
--> 225         return self._array[key]
    226 
    227     async def _async_getitem(self, key):

/usr/local/lib/python3.12/dist-packages/zarr/core/array.py in __getitem__(self, selection)
   2866             return self.vindex[cast("CoordinateSelection | MaskSelection", selection)]
   2867         elif is_pure_orthogonal_indexing(pure_selection, self.ndim):
-> 2868             return self.get_orthogonal_selection(pure_selection, fields=fields)
   2869         else:
   2870             return self.get_basic_selection(cast("BasicSelection", pure_selection), fields=fields)

/usr/local/lib/python3.12/dist-packages/zarr/core/array.py in get_orthogonal_selection(self, selection, out, fields, prototype)
   3337             prototype = default_buffer_prototype()
   3338         indexer = OrthogonalIndexer(selection, self.shape, self.metadata.chunk_grid)
-> 3339         return sync(
   3340             self.async_array._get_selection(
   3341                 indexer=indexer, out=out, fields=fields, prototype=prototype

/usr/local/lib/python3.12/dist-packages/zarr/core/sync.py in sync(coro, loop, timeout)
    157 
    158     if isinstance(return_result, BaseException):
--> 159         raise return_result
    160     else:
    161         return return_result

/usr/local/lib/python3.12/dist-packages/zarr/core/sync.py in _runner(coro)
    117     """
    118     try:
--> 119         return await coro
    120     except Exception as ex:
    121         return ex

/usr/local/lib/python3.12/dist-packages/zarr/core/array.py in _get_selection(self, indexer, prototype, out, fields)
   1563 
   1564             # reading chunks and decoding them
-> 1565             await self.codec_pipeline.read(
   1566                 [
   1567                     (

/usr/local/lib/python3.12/dist-packages/zarr/core/codec_pipeline.py in read(self, batch_info, out, drop_axes)
    471         drop_axes: tuple[int, ...] = (),
    472     ) -> None:
--> 473         await concurrent_map(
    474             [
    475                 (single_batch_info, out, drop_axes)

/usr/local/lib/python3.12/dist-packages/zarr/core/common.py in concurrent_map(items, func, limit)
    114                 return await func(*item)
    115 
--> 116         return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])
    117 
    118 

/usr/local/lib/python3.12/dist-packages/zarr/core/common.py in run(item)
    112         async def run(item: tuple[Any]) -> V:
    113             async with sem:
--> 114                 return await func(*item)
    115 
    116         return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])

/usr/local/lib/python3.12/dist-packages/zarr/core/codec_pipeline.py in read_batch(self, batch_info, out, drop_axes)
    268                     out[out_selection] = fill_value_or_default(chunk_spec)
    269         else:
--> 270             chunk_bytes_batch = await concurrent_map(
    271                 [(byte_getter, array_spec.prototype) for byte_getter, array_spec, *_ in batch_info],
    272                 lambda byte_getter, prototype: byte_getter.get(prototype),

/usr/local/lib/python3.12/dist-packages/zarr/core/common.py in concurrent_map(items, func, limit)
    114                 return await func(*item)
    115 
--> 116         return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])
    117 
    118 

/usr/local/lib/python3.12/dist-packages/zarr/core/common.py in run(item)
    112         async def run(item: tuple[Any]) -> V:
    113             async with sem:
--> 114                 return await func(*item)
    115 
    116         return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])

/usr/local/lib/python3.12/dist-packages/zarr/storage/_common.py in get(self, prototype, byte_range)
    166         if prototype is None:
    167             prototype = default_buffer_prototype()
--> 168         return await self.store.get(self.path, prototype=prototype, byte_range=byte_range)
    169 
    170     async def set(self, value: Buffer) -> None:

/usr/local/lib/python3.12/dist-packages/zarr/storage/_fsspec.py in get(self, key, prototype, byte_range)
    287         try:
    288             if byte_range is None:
--> 289                 value = prototype.buffer.from_bytes(await self.fs._cat_file(path))
    290             elif isinstance(byte_range, RangeByteRequest):
    291                 value = prototype.buffer.from_bytes(

/usr/local/lib/python3.12/dist-packages/fsspec/implementations/asyn_wrapper.py in wrapper(*args, **kwargs)
     25     @functools.wraps(func)
     26     async def wrapper(*args, **kwargs):
---> 27         return await asyncio.to_thread(func, *args, **kwargs)
     28 
     29     return wrapper

/usr/lib/python3.12/asyncio/threads.py in to_thread(func, *args, **kwargs)
     23     ctx = contextvars.copy_context()
     24     func_call = functools.partial(ctx.run, func, *args, **kwargs)
---> 25     return await loop.run_in_executor(None, func_call)

/usr/lib/python3.12/concurrent/futures/thread.py in run(self)
     57 
     58         try:
---> 59             result = self.fn(*self.args, **self.kwargs)
     60         except BaseException as exc:
     61             self.future.set_exception(exc)

/usr/local/lib/python3.12/dist-packages/fsspec/spec.py in cat_file(self, path, start, end, **kwargs)
    767         """
    768         # explicitly set buffering off?
--> 769         with self.open(path, "rb", **kwargs) as f:
    770             if start is not None:
    771                 if start >= 0:

/usr/local/lib/python3.12/dist-packages/fsspec/spec.py in open(self, path, mode, block_size, cache_options, compression, **kwargs)
   1308         else:
   1309             ac = kwargs.pop("autocommit", not self._intrans)
-> 1310             f = self._open(
   1311                 path,
   1312                 mode=mode,

/usr/local/lib/python3.12/dist-packages/fsspec/implementations/zip.py in _open(self, path, mode, block_size, autocommit, cache_options, **kwargs)
    128         if "r" in self.mode and "w" in mode:
    129             raise OSError("ZipFS can only be open for reading or writing, not both")
--> 130         out = self.zip.open(path, mode.strip("b"), force_zip64=self.force_zip_64)
    131         if "r" in mode:
    132             info = self.info(path)

/usr/lib/python3.12/zipfile/__init__.py in open(self, name, mode, pwd, force_zip64)
   1619         else:
   1620             # Get info object for name
-> 1621             zinfo = self.getinfo(name)
   1622 
   1623         if mode == 'w':

/usr/lib/python3.12/zipfile/__init__.py in getinfo(self, name)
   1547         info = self.NameToInfo.get(name)
   1548         if info is None:
-> 1549             raise KeyError(
   1550                 'There is no item named %r in the archive' % name)
   1551 

KeyError: "There is no item named 'foo/c/0' in the archive"
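
The final `KeyError` is raised by `zipfile.ZipFile.getinfo`, which looks up the requested name verbatim. One way to narrow this down is to list the archive's members and compare them with the key being requested. The sketch below uses only the standard library. `describe_key` is a hypothetical helper, and the prefix and suffix comparisons are illustrative guesses about how member names might differ from the key, not a confirmed diagnosis.

```python
import zipfile


def describe_key(zip_path, key):
    """Report whether ``key`` is a member of the archive, plus near matches.

    Hypothetical diagnostic: a "near match" is a member that differs from
    ``key`` but equals it once a leading "./" and leading slashes are
    removed, or that ends with "/" + key (i.e. sits under some prefix).
    """
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()

    def normalize(name):
        return name.removeprefix("./").lstrip("/")

    near = [
        name
        for name in names
        if name != key and (normalize(name) == key or name.endswith("/" + key))
    ]
    return {"exact": key in names, "near": near, "members": names}
```

For the archive in this report, the call would be `describe_key('bad.zipstore.zip', 'foo/c/0')`. An empty `near` list together with `exact` being false would suggest the chunk was never written to the archive, as opposed to being stored under a different name.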

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.12 (main, Oct 10 2025, 08:52:57) [GCC 11.4.0]
python-bits: 64
OS: Linux
OS-release: 6.6.105+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.6
libnetcdf: 4.9.3

xarray: 2025.11.1.dev8+g6e82a3afa
pandas: 2.2.2
numpy: 2.0.2
scipy: 1.16.3
netCDF4: 1.7.3
pydap: 3.5.8
h5netcdf: 1.7.3
h5py: 3.15.1
zarr: 3.1.5
cftime: 1.6.5
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.4.2
dask: 2025.9.1
distributed: 2025.9.1
matplotlib: 3.10.0
cartopy: 0.25.0
seaborn: 0.13.2
numbagg: 0.9.3
fsspec: 2025.3.0
cupy: 13.6.0
pint: None
sparse: 0.17.0
flox: 0.10.7
numpy_groupies: 0.11.3
setuptools: 75.2.0
pip: 24.1.2
conda: None
pytest: 8.4.2
mypy: None
IPython: 7.34.0
sphinx: 8.2.3

Metadata

    Labels

    bug, needs triage
