Fix Dataset.map to preserve attrs set by the applied function #11406

Open
C1-BA-B1-F3 wants to merge 1 commit into pydata:main from C1-BA-B1-F3:fix-dataset-map-attrs-update

Conversation

@C1-BA-B1-F3

Description

This PR fixes #11356 by allowing Dataset.map to preserve attributes that are explicitly set by the applied function.

Problem

Previously, when a function passed to Dataset.map explicitly set attributes on a DataArray (e.g., via .assign_attrs()), those attrs were lost because:

  • With keep_attrs=True: original attrs were copied back, overwriting func's changes
  • With keep_attrs=False: all attrs were wiped to empty dicts

This made it impossible to use Dataset.map to update variable attributes.

Solution

Now, if the function returns a DataArray with non-empty attrs that differ from the original, those attrs are preserved regardless of the keep_attrs setting. This allows users to update variable attributes through Dataset.map:

import xarray as xr

ds = xr.Dataset({
    'a': ('x', [1, 2, 3], {'units': 'kg'}),
    'b': ('x', [4, 5, 6], {'units': 'kg'}),
})

# Now works correctly - attrs are preserved
result = ds.map(lambda x: (x / x.sum()).assign_attrs(units='unitless'), keep_attrs=True)
assert result['a'].attrs == {'units': 'unitless'}

The existing behavior is preserved for all other cases:

  • keep_attrs=True still restores original attrs when func doesn't change them
  • keep_attrs=False still wipes attrs when func doesn't change them
  • Functions that explicitly drop attrs (via .drop_attrs()) still follow the keep_attrs logic
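The rule described above can be sketched in plain Python, independent of xarray. This is only an illustration of the decision logic as the description states it, not the code in this PR: `resolve_attrs` is a hypothetical helper name, and attrs are modeled as plain dicts.

```python
from typing import Any, Dict, Mapping


def resolve_attrs(
    original: Mapping[str, Any],
    returned: Mapping[str, Any],
    keep_attrs: bool,
) -> Dict[str, Any]:
    """Hypothetical helper: choose the attrs for one mapped variable."""
    # Attrs that the function set explicitly (non-empty and different from
    # the input's attrs) are kept, whatever keep_attrs says.
    if returned and dict(returned) != dict(original):
        return dict(returned)
    # Otherwise fall back to the existing keep_attrs behavior:
    # restore the original attrs, or wipe them.
    return dict(original) if keep_attrs else {}


original = {"units": "kg"}

# func sets new attrs: they survive for either keep_attrs value
assert resolve_attrs(original, {"units": "unitless"}, True) == {"units": "unitless"}
assert resolve_attrs(original, {"units": "unitless"}, False) == {"units": "unitless"}

# func leaves attrs untouched, or drops them: keep_attrs decides
assert resolve_attrs(original, original, True) == {"units": "kg"}
assert resolve_attrs(original, original, False) == {}
assert resolve_attrs(original, {}, True) == {"units": "kg"}
assert resolve_attrs(original, {}, False) == {}
```

One consequence of this sketch: a function that returns attrs equal to the originals cannot be told apart from one that left them untouched, so with keep_attrs=False those attrs would still be cleared.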

Tests Added

  • Regression test test_map_preserves_func_attrs covering all four scenarios:
    1. keep_attrs=True with func that sets attrs → func's attrs preserved
    2. keep_attrs=False with func that sets attrs → func's attrs preserved
    3. keep_attrs=True with func that doesn't set attrs → original attrs restored
    4. keep_attrs=False with func that doesn't set attrs → attrs wiped

Fixes #11356

@welcome

welcome Bot commented Jun 25, 2026

Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
If you have questions, some answers may be found in our contributing guidelines.

Previously, when a function passed to Dataset.map explicitly set
attributes on a DataArray (e.g., via .assign_attrs()), those attrs
were lost because:

- With keep_attrs=True: original attrs were copied back, overwriting func's changes
- With keep_attrs=False: all attrs were wiped to empty dicts

Now, if the function returns a DataArray with non-empty attrs that
differ from the original, those attrs are preserved regardless of the
keep_attrs setting. This allows users to update variable attributes
through Dataset.map, which was previously impossible.

Fixes pydata#11356
C1-BA-B1-F3 force-pushed the fix-dataset-map-attrs-update branch from 10ef567 to 6eecafa on June 26, 2026 13:59

Development

Successfully merging this pull request may close these issues.

Dataset.map can't update data_var attrs