test_microsim_runs fails at 2025/enhanced_cps_2024 with entity-size mismatch #8092

@MaxGhenis

Problem

test_microsim_runs[2025-hf://policyengine/policyengine-us-data/enhanced_cps_2024.h5] fails with an entity-size mismatch:

ValueError: Input [0.000000e+00 9.374027e+04 0.000000e+00 ... 7.718095e-01 0.000000e+00
 0.000000e+00] is not a valid value for the entity household (size = 41314 != 669 = count)

The failure is pre-existing on main: it reproduces when the test is run against an unmodified checkout of main (no PR changes). The size 41314 looks like a tax-unit-level array length, while 669 is the household count in the test's subsampled (1000) dataset. Somewhere in the household_net_income dependency chain, a TaxUnit-shaped array is being handed to the household entity population without first being projected to the household level.
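To make the suspected failure mode concrete, here is a minimal pure-Python sketch of a size check and of a projection step that would avoid the error. Everything in it is illustrative: `check_size` and `project_to_household` are hypothetical stand-ins, not the policyengine-core implementation, and the three-element data is made up.

```python
def check_size(values, entity_name, count):
    """Hypothetical stand-in for an entity-size check: one value per entity."""
    if len(values) != count:
        raise ValueError(
            f"Input is not a valid value for the entity {entity_name} "
            f"(size = {len(values)} != {count} = count)"
        )
    return values


def project_to_household(tax_unit_values, tax_unit_household_index, household_count):
    """Sum tax-unit values into the household each tax unit belongs to."""
    totals = [0.0] * household_count
    for value, household in zip(tax_unit_values, tax_unit_household_index):
        totals[household] += value
    return totals


# Made-up data: three tax units spread over two households.
tax_unit_values = [100.0, 50.0, 25.0]
tax_unit_household_index = [0, 0, 1]
household_count = 2

# Passing the tax-unit array straight to the household entity fails the check.
try:
    check_size(tax_unit_values, "household", household_count)
    mismatch_message = None
except ValueError as error:
    mismatch_message = str(error)

# Projecting first yields one value per household, so the check passes.
household_values = check_size(
    project_to_household(tax_unit_values, tax_unit_household_index, household_count),
    "household",
    household_count,
)
```

In the real failure the mismatch is 41314 versus 669 rather than 3 versus 2, but the shape of the problem would be the same if this diagnosis is right.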

How it surfaced

Because PolicyEngine CI runs selective tests by changed-file area, this failure only triggers on PRs that touch the SPM / net-income chain. It was exposed on #8090 (which modified spm_unit_spm_expenses.py), but the root cause pre-dates that PR.

Repro

cd policyengine-us
git checkout main
./.venv/bin/python -m pytest "policyengine_us/tests/microsimulation/test_microsim.py::test_microsim_runs[2025-hf://policyengine/policyengine-us-data/enhanced_cps_2024.h5]" -v

The other three parameter combos (2024-cps_2023, 2024-enhanced_cps_2024, 2025-cps_2023) all pass.

Possible causes

  • A variable in the household_net_income dependency chain is caching or broadcasting a TaxUnit-level array onto the household entity. This may only happen when the simulation year is later than the dataset year in the enhanced-CPS data preparation.
  • Could be in policyengine-us-data (how enhanced_cps_2024.h5 is built) rather than the model.

Suggested next step

Trace the household_net_income dependency chain with the enhanced CPS dataset at period=2025. For example, drop into a debugger just before check_array_compatible_with_entity (a breakpoint() call, which PYTHONBREAKPOINT can configure) and inspect which variable's array has 41314 entries.
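Once stopped in the debugger, one way to narrow this down is to filter whatever arrays are in scope by length. The helper below is a hypothetical sketch: `find_arrays_with_size` is not part of any PolicyEngine package, and how the name-to-array mapping is collected from the simulation is left open, since that depends on the internals being inspected.

```python
def find_arrays_with_size(arrays_by_variable, suspicious_size):
    """Return the names of variables whose array length equals suspicious_size."""
    return sorted(
        name
        for name, values in arrays_by_variable.items()
        if len(values) == suspicious_size
    )


# Made-up example: sizes 5 and 2 stand in for 41314 and 669.
# The variable names are placeholders, not real model variables.
example_arrays = {
    "variable_a": [0.0] * 2,
    "variable_b": [0.0] * 5,
    "variable_c": [0.0] * 2,
}
suspects = find_arrays_with_size(example_arrays, suspicious_size=5)
```

With the real data, the call would use suspicious_size=41314, and any household-level variable that shows up in the result is a candidate for the missing projection.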
