Fix MultiIndex sel with tuple-valued levels (GH#11341) #11411

Open
C1-BA-B1-F3 wants to merge 1 commit into pydata:main from C1-BA-B1-F3:fix-multiindex-nested-tuple-key


Problem

When a MultiIndex level contains tuple-valued entries (e.g., (1,1)), selecting with a nested tuple key like ((1,1), 2) incorrectly preserves the dimension instead of collapsing it to a scalar result.

Example from Issue #11341

import xarray as xr
import numpy as np
import pandas as pd

# Create MultiIndex where first level contains tuples
nested_level_0 = pd.Index([(1, 1), (1, 1), (2, 2), (3, 3)], name="a")
nested_level_1 = pd.Index([1, 2, 10, 20], name="b")
nested_mi = pd.MultiIndex.from_arrays([nested_level_0, nested_level_1])

nested = xr.DataArray(
    np.arange(4),
    dims=("index",),
    coords={"index": nested_mi},
)

# This should collapse the dimension (shape=()) but returns shape=(1,)
result = nested.sel(index=((1, 1), 2))

Root Cause

The _is_nested_tuple() function was checking for tuple in addition to list and slice:

def _is_nested_tuple(possible_tuple):
    return isinstance(possible_tuple, tuple) and any(
        isinstance(value, tuple | list | slice) for value in possible_tuple
    )

This caused tuple-valued keys like ((1,1), 2) to be incorrectly identified as nested selection tuples (containing multiple selection values), when they are actually single keys with a tuple-valued first level.

Fix

Remove tuple from the isinstance check so only list and slice are treated as indicators of nested selections:

def _is_nested_tuple(possible_tuple):
    return isinstance(possible_tuple, tuple) and any(
        isinstance(value, list | slice) for value in possible_tuple
    )

This correctly distinguishes:

  • (1, 2) → single key (scalar values) → use get_loc()
  • ((1,1), 2) → single key (tuple value in first level) → use get_loc()
  • ([1,2], 3) → nested selection (list) → use get_locs()
  • (slice(1,2), 3) → nested selection (slice) → use get_locs()
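The classification above can be checked in isolation with plain Python. The sketch below copies the two predicate bodies from this description under the illustrative names `is_nested_tuple_old` and `is_nested_tuple_new` (these names, and the `choose_lookup` helper, are hypothetical and not part of xarray). It only models which lookup method would be chosen; it does not call pandas or xarray.

```python
def is_nested_tuple_old(possible_tuple):
    # Predicate before the fix: an inner tuple counts as "nested".
    return isinstance(possible_tuple, tuple) and any(
        isinstance(value, (tuple, list, slice)) for value in possible_tuple
    )


def is_nested_tuple_new(possible_tuple):
    # Predicate after the fix: only list and slice count as "nested".
    return isinstance(possible_tuple, tuple) and any(
        isinstance(value, (list, slice)) for value in possible_tuple
    )


def choose_lookup(key, predicate):
    # Hypothetical helper: names the lookup method the bullets above describe.
    return "get_locs" if predicate(key) else "get_loc"


keys = [(1, 2), ((1, 1), 2), ([1, 2], 3), (slice(1, 2), 3)]

old_choice = [choose_lookup(k, is_nested_tuple_old) for k in keys]
new_choice = [choose_lookup(k, is_nested_tuple_new) for k in keys]

for key, old, new in zip(keys, old_choice, new_choice):
    print(f"{key!r}: old={old}, new={new}")
```

Only the second key, `((1, 1), 2)`, changes classification between the two predicates; the other three are routed the same way before and after the fix.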

Tests

Added regression test test_sel_nested_tuple_key that verifies:

  1. MultiIndex with tuple-valued levels can be selected with nested tuple keys
  2. The dimension is correctly collapsed (scalar result, not length-1)
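To make the "scalar result, not length-1" distinction concrete, here is a minimal NumPy-only sketch. It assumes that the single-key path ends up indexing with one integer position while the nested-selection path ends up indexing with an array of positions; that mapping is an assumption for illustration, not a statement about xarray's internals.

```python
import numpy as np

data = np.arange(4)

# Assumed outcome of a single-key lookup: one integer position.
scalar_position = 1
# Assumed outcome of a nested-selection lookup: an array of positions,
# here of length 1 because only one row matches.
array_position = np.array([1])

collapsed = data[scalar_position]  # integer indexing drops the dimension
preserved = data[array_position]   # array indexing keeps a length-1 dimension

print(np.shape(collapsed))  # ()
print(preserved.shape)      # (1,)
```

Both results hold the same value; they differ only in shape, which is the property the regression test checks.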

Closes #11341

Problem: When a MultiIndex level contains tuple-valued entries (e.g., (1,1)),
selecting with a nested tuple key like ((1,1), 2) incorrectly preserved the
dimension instead of collapsing it to a scalar result.

Root cause: _is_nested_tuple() was checking for 'tuple' in addition to 'list'
and 'slice', which caused it to misidentify tuple-valued keys as nested
selection tuples.

Fix: Remove 'tuple' from the isinstance check in _is_nested_tuple() so that
only 'list' and 'slice' are treated as indicators of nested selections. Tuple-
valued keys in MultiIndex levels are now correctly handled as scalar key values.

Added regression test for selecting with nested tuple keys on MultiIndex with
tuple-valued levels.

Development

Successfully merging this pull request may close these issues.

A single nested tuple MultiIndex key is located correctly but preserves the dimension