feat(calendar): per-Bundesland school-holiday features (roadmap #6) #364
Merged
Adds school-holiday calendar features for all 16 German Bundesländer, sourced from the OpenHolidays API (ODbL-1.0), covering 2022-01-01 to 2027-12-31. Requests outside the covered range raise ValueError (fail-safe; no fill/extrapolation). Only country_code="DE" is supported.

Data:
- datasets/csv/school_holidays_de.csv (648 rows, 16 states)
- datasets/csv/school_holidays_de_meta.csv (validity range metadata)
- LICENSES/ODbL-1.0.txt
- REUSE sidecar .license files for both CSVs

Code:
- calendar/holiday.py: create_school_holiday_df() + get_school_holiday_features()
- calendar/__init__.py: export both symbols
- data/fetch_data.py: load_school_holidays_de() loader
- manager/features.py: select_exogenous_features() gains include_school_holiday_features kwarg
- configurator/config_multi.py: include_school_holiday_features: bool = False field
- multitask/base.py: wires include_school_holiday_features into the concat + select pipeline

Docs:
- _quarto.yml: adds both new symbols to autosummary and sidebar; also fixes pre-existing sidebar drift by adding create_day_type_df/get_day_type_features
- quartodoc reference pages generated for create_school_holiday_df and get_school_holiday_features

Tests (tests/test_calendar_school_holiday.py, 30 tests):
- determinism, dtype/no-NaN/binary, known NW 2024 vacations (Osterferien, Sommerferien, Herbstferien), state isolation (BY vs NW on 2024-08-21), inclusive edges, hourly broadcast, fail-safe both edges, country_code validation, selector toggle, bundled-data integrity (16 states, schema, row count in [500,650], meta)

Full-repo follow-up (spotforecast2, not this repo):
- No wiring changes needed in the full repo: include_school_holiday_features is fully wired in sf2-safe's multitask/base.py concat + select blocks. ConfigEntsoe inherits include_school_holiday_features via ConfigMulti (dataclass inheritance). The full repo reads sf2-safe; no additional wiring or re-exports required there.
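The behaviour described in this commit message (one binary column per selected state, inclusive edges, hourly broadcast, fail-safe range check) can be illustrated with a minimal sketch. This is not the code in calendar/holiday.py: the function name `school_holiday_flags`, the in-memory `HOLIDAYS` table and its three rows are assumptions made for illustration only, and the column names `state`, `start_date` and `end_date` are inferred from the sort key mentioned in the follow-up fixes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the bundled CSV; rows are illustrative, not read from it.
HOLIDAYS = pd.DataFrame(
    {
        "state": ["NW", "NW", "BY"],
        "start_date": pd.to_datetime(["2024-03-25", "2024-07-08", "2024-07-29"]),
        "end_date": pd.to_datetime(["2024-04-06", "2024-08-20", "2024-09-09"]),
    }
)
# Validity range as stated in the commit message.
VALID_FROM = pd.Timestamp("2022-01-01")
VALID_TO = pd.Timestamp("2027-12-31")


def school_holiday_flags(start, end, state="NW", freq="D"):
    """Return a frame with one int column, is_school_holiday (hypothetical helper)."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    # Fail-safe: refuse anything outside the covered range instead of filling.
    if start.normalize() < VALID_FROM or end.normalize() > VALID_TO:
        raise ValueError(
            f"requested range {start} .. {end} is outside "
            f"{VALID_FROM.date()} .. {VALID_TO.date()}"
        )
    index = pd.date_range(start, end, freq=freq)
    days = index.normalize()  # hourly stamps inherit the flag of their calendar day
    values = np.zeros(len(index), dtype="int64")
    rows = HOLIDAYS[HOLIDAYS["state"] == state]
    for row in rows.itertuples(index=False):
        # Both edges are inclusive: start_date and end_date count as holiday.
        values[(days >= row.start_date) & (days <= row.end_date)] = 1
    return pd.DataFrame({"is_school_holiday": values}, index=index)


if __name__ == "__main__":
    print(school_holiday_flags("2024-08-19", "2024-08-22", state="NW"))
```

Matching on the normalized calendar day keeps the daily and hourly cases on one code path, which is one plausible way to get the "hourly broadcast" the tests mention.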
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
FIX 1 — datasets/csv/school_holidays_de.csv: add missing HE Weihnachtsferien 2027-12-23/2028-01-11 record; de-duplicate all 23 truncated-duplicate pairs in MV and SH (keep natural/longer end_date per DATA POLICY); re-sort by (state, start_date). Final row count: 626, 16 states, 0 duplicate keys.

FIX 2 — calendar/holiday.py create_school_holiday_df: normalize tz-aware timestamps in range check via .tz_convert(None).normalize() so a boundary timestamp such as 2027-12-31 23:00 UTC no longer falsely raises ValueError. Also fix pd.date_range call to localize string start/end to inferred_tz before mixing with a tz-aware counterpart.

FIX 3 — tests: pin HE Weihnachtsferien 2027 (test_he_weihnachtsferien_2027): 2027-12-22 == 0, 2027-12-23 through 2027-12-31 all == 1.

FIX 4 — tests: tz-aware boundary tests: end=2027-12-31 23:00 UTC does not raise; end=2028-01-01 00:00 UTC raises ValueError.

FIX 5 — data/fetch_data.py load_school_holidays_de: sort returned DataFrame by (state, start_date) as docstring claims.

FIX 6 — same docstring: reword datetime64 dtype claim to "resolution depends on the pandas version".

FIX 7 — same regeneration note: replace endDate-truncation rule with the natural-form DATA POLICY (keep startDate in range, verbatim endDate).

FIX 8 — configurator/config_multi.py: add include_ephemeris_features and include_day_type_features entries to Args and Attributes docblocks; place include_school_holiday_features directly after include_day_type_features to match field order.

FIX 9 — calendar/holiday.py create_school_holiday_df: rename params start_date/end_date → start/end, mirroring create_day_type_df; update call site in get_school_holiday_features and docstring.

FIX 10 — data/fetch_data.py load_school_holidays_de: remove unused data_home parameter entirely; update signature and docstring.

FIX 11 — manager/features.py: inline single-element allowlist for is_school_holiday, eliminating intermediate variable.
FIX 12 — multitask/base.py: replace both getattr fallbacks with direct self.config.include_school_holiday_features (field is declared on ConfigMulti).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
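The boundary behaviour targeted by FIX 2 and FIX 4 can be shown with a small sketch of a tz-aware range check. The helper names `to_naive_day` and `check_range` are hypothetical and the real create_school_holiday_df may be structured differently; only the `.tz_convert(None).normalize()` step and the two boundary cases are taken from the fix list.

```python
import pandas as pd

# Validity range as stated in the PR; the helpers below are illustrative only.
VALID_FROM = pd.Timestamp("2022-01-01")
VALID_TO = pd.Timestamp("2027-12-31")


def to_naive_day(value):
    """Reduce a (possibly tz-aware) timestamp to a naive midnight timestamp."""
    ts = pd.Timestamp(value)
    if ts.tzinfo is not None:
        # tz_convert(None) converts to UTC and drops the tz information.
        ts = ts.tz_convert(None)
    # normalize() floors to 00:00, so 23:00 on the last valid day stays in range.
    return ts.normalize()


def check_range(start, end):
    """Raise ValueError when the requested days fall outside the validity range."""
    first, last = to_naive_day(start), to_naive_day(end)
    if first < VALID_FROM or last > VALID_TO:
        raise ValueError(
            f"{first.date()} .. {last.date()} is outside "
            f"{VALID_FROM.date()} .. {VALID_TO.date()}"
        )


if __name__ == "__main__":
    # Last covered day at 23:00 UTC: accepted.
    check_range("2027-12-01", pd.Timestamp("2027-12-31 23:00", tz="UTC"))
    # One hour later is 2028-01-01: rejected.
    try:
        check_range("2027-12-01", pd.Timestamp("2028-01-01 00:00", tz="UTC"))
    except ValueError as exc:
        print("rejected:", exc)
```

One consequence of this approach is that the calendar day is evaluated in UTC, so a timestamp in another zone is judged by its UTC date rather than its local date.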
Implements roadmap item #6: bundled per-Bundesland school-holiday table + calendar features.
- `create_school_holiday_df` / `get_school_holiday_features` (calendar family, mirrors the day-type pair; single state-selected `is_school_holiday` int column, `state="NW"` default, DE-only)
- `datasets/csv/school_holidays_de.csv` (626 rows, 16 states, 2022-01-01 → 2027-12-31 validity from companion meta CSV; records kept in natural API form)
- `LICENSES/ODbL-1.0.txt`. ODbL is share-alike for the database only; attribution and notice are preserved via the sidecars. If ODbL is unacceptable, the fallback is a manual KMK transcription.
- Out-of-range requests raise `ValueError` (no fill parameter); `country_code != "DE"` also raises
- `include_school_holiday_features` on `ConfigMulti` (`ConfigEntsoe` inherits) + selector branch + multitask wiring, mirroring `include_day_type_features`

Full pipeline green: 2434 passed, ruff/reuse clean, full quarto render OK.
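The 626-row table results from the de-duplication described in the follow-up fixes: where two records share (state, start_date), the one with the longer, natural end_date is kept, and the result is sorted by (state, start_date). A minimal pandas sketch of that policy follows. The helper name `dedupe_keep_longest` and the sample rows are hypothetical, and this is not necessarily how the CSV was actually regenerated.

```python
import pandas as pd


def dedupe_keep_longest(df):
    """Keep one row per (state, start_date), preferring the latest end_date."""
    # Sorting by end_date last means keep="last" selects the longest record.
    ordered = df.sort_values(["state", "start_date", "end_date"])
    deduped = ordered.drop_duplicates(subset=["state", "start_date"], keep="last")
    return deduped.sort_values(["state", "start_date"]).reset_index(drop=True)


if __name__ == "__main__":
    # Illustrative rows only: one truncated/natural pair plus an unrelated record.
    raw = pd.DataFrame(
        {
            "state": ["SH", "MV", "MV"],
            "start_date": pd.to_datetime(["2027-12-23", "2027-12-22", "2027-12-22"]),
            "end_date": pd.to_datetime(["2028-01-08", "2027-12-31", "2028-01-04"]),
        }
    )
    clean = dedupe_keep_longest(raw)
    print(clean)
    # A zero here corresponds to the "0 duplicate keys" check in the fix list.
    print(clean.duplicated(subset=["state", "start_date"]).sum())
```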
🤖 Generated with Claude Code