Fix TCTracks.from_FAST duplicate loading from year loop by simonameiler · Pull Request #1269 · CLIMADA-project/climada_python

simonameiler · 2026-03-23T20:52:33Z

Changes proposed in this PR:
This PR fixes duplicate track creation in TCTracks.from_FAST for FAST NetCDF files containing both n_trk and year dimensions.

Root cause:
from_FAST iterated over both dataset.year and dataset.n_trk, then selected dataset.sel(n_trk=i, year=year).
For FAST files where track variables are indexed by n_trk and time, this repeats each track once per year index.

Changes:
In climada/hazard/tc_tracks.py:
removed outer loop over dataset.year
iterate only over dataset.n_trk
select tracks with dataset.sel(n_trk=i)
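
The root cause and the fix can be reproduced on a toy dataset. The following is a minimal sketch, assuming xarray and NumPy are installed. The variable names mirror the FAST file layout described in this PR, but `load_buggy` and `load_fixed` are hypothetical helpers written for illustration and are not the code in climada/hazard/tc_tracks.py.

```python
import numpy as np
import xarray as xr

# Toy FAST-like dataset: the track variable depends on (n_trk, time) only,
# while seeds_per_month is the only variable carrying the year dimension.
ds = xr.Dataset(
    {
        "lon_trks": (
            ("n_trk", "time"),
            np.array([[290.0, 291.0], [300.0, 301.0]]),
        ),
        "seeds_per_month": (("year", "month"), np.zeros((3, 12))),
    },
    coords={
        "n_trk": [0, 1],
        "time": [0.0, 10800.0],
        "year": [1998, 1999, 2000],
        "month": np.arange(1, 13),
    },
)


def load_buggy(dataset):
    """Hypothetical reconstruction of the old pattern: loop over year and n_trk."""
    tracks = []
    for year in dataset.year.values:
        for i in dataset.n_trk.values:
            # lon_trks has no year dimension, so selecting a year
            # leaves the track data unchanged and the track is repeated.
            tracks.append(dataset.sel(n_trk=i, year=year))
    return tracks


def load_fixed(dataset):
    """Hypothetical reconstruction of the fixed pattern: loop over n_trk only."""
    return [dataset.sel(n_trk=i) for i in dataset.n_trk.values]


buggy = load_buggy(ds)
fixed = load_fixed(ds)
print(len(buggy), len(fixed))
```

With 2 tracks and 3 years, the nested loop yields one entry per (year, track) pair, while the single loop yields one entry per track.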

Tests:
Added regression test test_from_FAST_not_multiplied_by_year_dim in climada/hazard/test/test_tc_tracks.py.
Test builds a tiny temporary FAST-like NetCDF with both n_trk and year dims and checks that output size equals n_trk (not n_trk * n_year).
Validation

Ran:
python -m unittest climada.hazard.test.test_tc_tracks.TestIO.test_from_FAST
python -m unittest climada.hazard.test.test_tc_tracks.TestIO.test_from_FAST_not_multiplied_by_year_dim
Both passed locally.

This PR fixes #1268


NicolasColombi

Hi Simona, thanks for catching this and proposing to fix it! 🙌

As mentioned in the review, my only remark concerns the test file: I believe it would be less cumbersome to have a single file and a single test.

NicolasColombi · 2026-03-24T10:20:12Z

climada/hazard/test/test_tc_tracks.py

+        with tempfile.TemporaryDirectory() as tmpdir:
+            ds = xr.Dataset(
+                {
+                    "lon_trks": (
+                        ("n_trk", "time"),
+                        np.array(
+                            [
+                                [290.0, 291.0, 292.0],
+                                [300.0, 301.0, 302.0],
+                            ],
+                            dtype=float,
+                        ),
+                    ),
+                    "lat_trks": (
+                        ("n_trk", "time"),
+                        np.array(
+                            [
+                                [10.0, 10.5, 11.0],
+                                [15.0, 15.5, 16.0],
+                            ],
+                            dtype=float,
+                        ),
+                    ),
+                    "vmax_trks": (
+                        ("n_trk", "time"),
+                        np.array(
+                            [
+                                [20.0, 21.0, 22.0],
+                                [25.0, 26.0, 27.0],
+                            ],
+                            dtype=float,
+                        ),
+                    ),
+                    "tc_month": ("n_trk", np.array([8, 9], dtype=np.int64)),
+                    "tc_basins": ("n_trk", np.array(["NA", "NA"], dtype="<U2")),
+                    "tc_years": ("n_trk", np.array([1998, 1999], dtype=np.int64)),
+                    "seeds_per_month": (
+                        ("year", "basin", "month"),
+                        np.zeros((4, 1, 12), dtype=float),
+                    ),
+                },
+                coords={
+                    "n_trk": ("n_trk", np.array([0, 1], dtype=np.int64)),
+                    "time": ("time", np.array([0, 10800, 21600], dtype=float)),
+                    "year": (
+                        "year",
+                        np.array([1998, 1999, 2000, 2001], dtype=np.int64),
+                    ),
+                    "basin": ("basin", np.array(["NA"], dtype="<U2")),
+                    "month": ("month", np.arange(1, 13, dtype=np.int64)),
+                },
+            )
+
+            path = DATA_DIR.joinpath(tmpdir, "fast_regression.nc")
+            ds.to_netcdf(path)


Since there is already a test file named FAST_test_tracks.nc that contains only one year (which is why the existing test did not reveal the bug you are fixing), it might be a good idea to update that file to contain two years and then test the updated from_FAST function on it with a single test. I do not think this would be an issue size-wise, as the file is only 62 KB at the moment. This would require either reproducing the small file for two years or simply fabricating it by duplicating the existing file, concatenating the copies, and manually modifying the second year number. Lastly, you will need to update the original test to capture this.

This way I think we can have a single file and single test, reducing the code, since the majority of the code in your test is there to create a temporary file.

simonameiler · Mar 26, 2026

Thanks for the suggestion, this makes a lot of sense.

I’ve updated the existing fixture file (FAST_test_tracks.nc) to include two years (year = [2025, 2026]) by duplicating the seeds_per_month data along the year dimension. The track-related variables (tc_years, lon_trks, etc.) remain unchanged, so the file structure stays consistent with the original intent.

With this update, I removed the separate regression test and its temporary-file setup. The existing assertion in test_from_FAST:

self.assertEqual(len(tc_track.data), 5)

now ensures that tracks are not duplicated when a year dimension is present. With two years present, the previous buggy implementation would have returned 10 tracks instead of 5.

This keeps the test setup simpler and avoids duplicating logic for temporary file creation.
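
The fixture update described above (extending only seeds_per_month along year while leaving the track variables untouched) could look roughly like the sketch below. It works on a small in-memory stand-in rather than the real FAST_test_tracks.nc, and the exact procedure used for the fixture is not shown in this PR, so the steps here are an assumption.

```python
import numpy as np
import xarray as xr

# In-memory stand-in for a single-year fixture (not the real file contents).
one_year = xr.Dataset(
    {
        "lon_trks": (("n_trk", "time"), np.zeros((5, 3))),
        "seeds_per_month": (
            ("year", "basin", "month"),
            np.ones((1, 1, 12)),
        ),
    },
    coords={
        "n_trk": np.arange(5),
        "year": [2025],
        "basin": ["NA"],
        "month": np.arange(1, 13),
    },
)

# Copy the only year-dependent variable and relabel the copy as 2026.
seeds_2025 = one_year["seeds_per_month"]
seeds_2026 = seeds_2025.assign_coords(year=[2026])
seeds = xr.concat([seeds_2025, seeds_2026], dim="year")

# Drop everything that depends on year, then merge the extended variable back.
# Track variables keep their (n_trk, time) dimensions.
two_years = xr.merge(
    [one_year.drop_dims("year"), seeds.to_dataset(name="seeds_per_month")]
)
print(dict(two_years.sizes))
```

Because the track variables never gain a year dimension, a loader that iterates only over n_trk still sees 5 tracks in this stand-in, whereas a year-by-track loop would see 10.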

…egression test The test fixture FAST_test_tracks.nc now has year=[2025,2026] (only seeds_per_month is extended; track variables retain their n_trk dim). The existing len(tc_track.data)==5 assertion now acts as the regression check: the buggy year-loop code would return 5x2=10 tracks. The separate test_from_FAST_not_multiplied_by_year_dim (with its temporary-file scaffolding) is removed.

NicolasColombi

Thanks for fixing this Simona! 🙌 This is ready to merge from my side.

@chahank do we have your blessing?

chahank

Please update the CHANGELOG file: https://github.com/CLIMADA-project/climada_python/blob/main/CHANGELOG.md

chahank · 2026-03-27T16:12:08Z

Thanks for the bugfix!

Fix TCTracks.from_FAST duplicate loading from year loop

f8a35ce

simonameiler requested a review from NicolasColombi March 23, 2026 20:52

simonameiler requested review from chahank, emanuel-schmid and peanutfun as code owners March 23, 2026 20:52

simonameiler added bugfix and removed bugfix labels Mar 23, 2026

NicolasColombi requested changes Mar 24, 2026


NicolasColombi approved these changes Mar 27, 2026


chahank approved these changes Mar 27, 2026


simonameiler wants to merge 2 commits into develop from fix/fast-loader-duplication


simonameiler Mar 26, 2026


3 participants
