Merged
540 changes: 540 additions & 0 deletions LICENSES/ODbL-1.0.txt

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
"hash": "2ec09d8e46e065b2f395a0504ddf62b9",
"result": {
"engine": "jupyter",
"markdown": "---\ntitle: calendar.holiday.create_school_holiday_df\n---\n\n\n\n```python\ncalendar.holiday.create_school_holiday_df(\n    start,\n    end,\n    tz='UTC',\n    freq='h',\n    country_code='DE',\n    state='NW',\n)\n```\n\nCreate a DataFrame with a binary school-holiday indicator for a German state.\n\nBuilds a tz-aware time grid over ``[start, end]`` at *freq* and marks\nevery timestamp that falls within a school-holiday period of the requested\nBundesland as ``1``; all others are ``0``. Both edges of each interval\nare inclusive.\n\nData source: OpenHolidays API (https://openholidaysapi.org), ODbL-1.0.\nCoverage: 2022-01-01 to 2027-12-31 for all 16 German Bundesländer.\n\nOnly ``country_code=\"DE\"`` is supported. Requests whose span extends\nbeyond the covered range at either edge raise ``ValueError``; there is\nno fill or extrapolation.\n\n## Parameters {.doc-section .doc-section-parameters}\n\n| Name | Type | Description | Default |\n|--------------|----------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------|------------|\n| start | [str](`str`) \\| [pd](`pandas`).[Timestamp](`pandas.Timestamp`) | Start date/datetime of the requested grid. | _required_ |\n| end | [str](`str`) \\| [pd](`pandas`).[Timestamp](`pandas.Timestamp`) | End date/datetime of the requested grid (inclusive). | _required_ |\n| tz | [str](`str`) | Timezone for the resulting index. Ignored when *start* or *end* is already a tz-aware ``pd.Timestamp``. | `'UTC'` |\n| freq | [str](`str`) | Pandas-compatible frequency string. Defaults to ``\"h\"`` (hourly). | `'h'` |\n| country_code | [str](`str`) | Must be ``\"DE\"`` (Germany). Any other value raises ``ValueError``. | `'DE'` |\n| state | [str](`str`) | ISO 3166-2 subdivision short code for the Bundesland, e.g. ``\"NW\"`` (North Rhine-Westphalia), ``\"BY\"`` (Bavaria). Defaults to ``\"NW\"``. | `'NW'` |\n\n## Returns {.doc-section .doc-section-returns}\n\n| Name | Type | Description |\n|--------|------------------------------------------------|----------------------------------------------------------------------|\n| | [pd](`pandas`).[DataFrame](`pandas.DataFrame`) | Single integer column ``is_school_holiday`` (values in ``{0, 1}``; no NaNs) with a tz-aware `DatetimeIndex` at *freq*. |\n\n## Raises {.doc-section .doc-section-raises}\n\n| Name | Type | Description |\n|--------|----------------------------|-----------------------------------------------------------------------------------------------------------------------|\n| | [ValueError](`ValueError`) | If *country_code* is not ``\"DE\"``, or if the requested span extends beyond the dataset validity range at either edge. |\n\n## Examples {.doc-section .doc-section-examples}\n\n\n::: {#e831a63e .cell execution_count=1}\n``` {.python .cell-code}\nfrom spotforecast2_safe.calendar import create_school_holiday_df\n\n# NW Sommerferien 2024: 2024-07-08 → 2024-08-20 (inclusive).\n# Day before (2024-07-07) must be 0; first day (2024-07-08) must be 1.\ndf = create_school_holiday_df(\n    \"2024-07-06\", \"2024-07-10\", freq=\"D\", state=\"NW\"\n)\nprint(df)\nassert df.loc[\"2024-07-07\", \"is_school_holiday\"] == 0\nassert df.loc[\"2024-07-08\", \"is_school_holiday\"] == 1\nassert df.loc[\"2024-07-09\", \"is_school_holiday\"] == 1\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n                           is_school_holiday\n2024-07-06 00:00:00+00:00                  0\n2024-07-07 00:00:00+00:00                  0\n2024-07-08 00:00:00+00:00                  1\n2024-07-09 00:00:00+00:00                  1\n2024-07-10 00:00:00+00:00                  1\n```\n:::\n:::\n\n\n",
"supporting": [
"calendar.holiday.create_school_holiday_df_files/figure-html"
],
"filters": [],
"includes": {}
}
}
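The inclusive-edge marking described in the frozen docstring above can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not the implementation in `spotforecast2_safe`: the helper name `mark_inclusive_periods` is hypothetical, and the holiday period is passed in by hand (the NW summer break dates are taken from the example comment) rather than loaded from the OpenHolidays dataset.

```python
import pandas as pd


def mark_inclusive_periods(start, end, periods, tz="UTC", freq="h"):
    """Return a 0/1 indicator frame on a tz-aware grid.

    Hypothetical helper for illustration only. ``periods`` is an iterable of
    (first_day, last_day) date strings; both days are treated as inclusive.
    """
    index = pd.date_range(start=start, end=end, freq=freq, tz=tz)
    flags = pd.Series(0, index=index, name="is_school_holiday", dtype="int64")
    for first_day, last_day in periods:
        lo = pd.Timestamp(first_day, tz=tz)
        # Use the start of the following day as an exclusive upper bound so
        # that every sub-daily timestamp on the last day is still marked.
        hi = pd.Timestamp(last_day, tz=tz) + pd.Timedelta(days=1)
        flags[(index >= lo) & (index < hi)] = 1
    return flags.to_frame()


# Assumed period, copied from the example comment: 2024-07-08 to 2024-08-20.
df = mark_inclusive_periods(
    "2024-07-06", "2024-07-10", [("2024-07-08", "2024-08-20")], freq="D"
)
print(df["is_school_holiday"].tolist())
```

The sketch omits the range check: per the docstring, the real function raises `ValueError` when the requested span leaves the covered range.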
@@ -0,0 +1,12 @@
{
"hash": "9f715b6a7ab541ee038cd8c9340d50b4",
"result": {
"engine": "jupyter",
"markdown": "---\ntitle: calendar.holiday.get_school_holiday_features\n---\n\n\n\n```python\ncalendar.holiday.get_school_holiday_features(\n    data,\n    start,\n    cov_end,\n    forecast_horizon,\n    tz='UTC',\n    freq='h',\n    country_code='DE',\n    state='NW',\n)\n```\n\nBuild per-Bundesland school-holiday indicators and align them to a forecast grid.\n\nGenerates the ``is_school_holiday`` binary indicator via\n`create_school_holiday_df()`, validates temporal coverage with\n`curate_holidays()`, and reindexes onto the full ``[start, cov_end]``\ngrid with ``fill_value=0``.\n\nThe requested span ``[start, cov_end]`` must lie entirely within the\ndataset validity range 2022-01-01 to 2027-12-31. If either edge falls\noutside this range a ``ValueError`` is raised immediately; there is no\nfill or extrapolation.\n\nOnly ``country_code=\"DE\"`` is supported; passing any other value raises\n``ValueError``.\n\n## Parameters {.doc-section .doc-section-parameters}\n\n| Name | Type | Description | Default |\n|------------------|-----------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------|------------|\n| data | [pd](`pandas`).[DataFrame](`pandas.DataFrame`) | Reference time series DataFrame used for temporal coverage validation inside `curate_holidays()`. | _required_ |\n| start | [Union](`typing.Union`)\\[[str](`str`), [pd](`pandas`).[Timestamp](`pandas.Timestamp`)\\] | Start timestamp. String values are parsed with ``utc=True``. | _required_ |\n| cov_end | [Union](`typing.Union`)\\[[str](`str`), [pd](`pandas`).[Timestamp](`pandas.Timestamp`)\\] | Inclusive end timestamp covering the full forecast horizon. String values are parsed with ``utc=True``. | _required_ |\n| forecast_horizon | [int](`int`) | Number of forecast steps ahead; passed to `curate_holidays()`. | _required_ |\n| tz | [str](`str`) | Timezone applied to the generated index. Defaults to ``\"UTC\"``. | `'UTC'` |\n| freq | [str](`str`) | Pandas-compatible frequency string. Defaults to ``\"h\"``. | `'h'` |\n| country_code | [str](`str`) | Must be ``\"DE\"``. Any other value raises ``ValueError``. | `'DE'` |\n| state | [str](`str`) | ISO 3166-2 subdivision short code for the Bundesland. Defaults to ``\"NW\"`` (North Rhine-Westphalia). | `'NW'` |\n\n## Returns {.doc-section .doc-section-returns}\n\n| Name | Type | Description |\n|--------|------------------------------------------------|-----------------------------------------------------------------------|\n| | [pd](`pandas`).[DataFrame](`pandas.DataFrame`) | Single integer column ``is_school_holiday`` (values in ``{0, 1}``; no NaNs). The index is a tz-aware `DatetimeIndex` at the requested *freq*; the frame has shape ``(len(data) + forecast_horizon, 1)``. |\n\n## Raises {.doc-section .doc-section-raises}\n\n| Name | Type | Description |\n|--------|----------------------------|-------------------------------------------------------------------------------------------------------------------------------------|\n| | [ValueError](`ValueError`) | If *country_code* is not ``\"DE\"``, or if the requested span extends beyond the dataset validity range ``[2022-01-01, 2027-12-31]``. |\n\n## Examples {.doc-section .doc-section-examples}\n\n\n::: {#313d9f34 .cell execution_count=1}\n``` {.python .cell-code}\nimport pandas as pd\nfrom spotforecast2_safe.calendar import get_school_holiday_features\n\nforecast_horizon = 24\nn_data = 48\ndata = pd.DataFrame(\n    {\"load\": range(n_data)},\n    index=pd.date_range(\"2024-07-06\", periods=n_data, freq=\"h\", tz=\"UTC\"),\n)\nstart = data.index[0]\ncov_end = start + pd.Timedelta(hours=(n_data + forecast_horizon - 1))\n\nfeats = get_school_holiday_features(\n    data=data,\n    start=start,\n    cov_end=cov_end,\n    forecast_horizon=forecast_horizon,\n    state=\"NW\",\n)\nprint(\"shape:\", feats.shape)\nprint(\"columns:\", feats.columns.tolist())\n# NW Sommerferien 2024: 2024-07-08 is a school holiday (is_school_holiday=1).\nprint(\"2024-07-07 00:00 UTC:\", feats.loc[\"2024-07-07 00:00:00+00:00\", \"is_school_holiday\"])\nprint(\"2024-07-08 00:00 UTC:\", feats.loc[\"2024-07-08 00:00:00+00:00\", \"is_school_holiday\"])\nassert feats.shape == (n_data + forecast_horizon, 1)\nassert feats.loc[\"2024-07-07 00:00:00+00:00\", \"is_school_holiday\"] == 0\nassert feats.loc[\"2024-07-08 00:00:00+00:00\", \"is_school_holiday\"] == 1\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nshape: (72, 1)\ncolumns: ['is_school_holiday']\n2024-07-07 00:00 UTC: 0\n2024-07-08 00:00 UTC: 1\n```\n:::\n:::\n\n\n",
"supporting": [
"calendar.holiday.get_school_holiday_features_files"
],
"filters": [],
"includes": {}
}
}
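The final alignment step named in the docstring above (reindexing onto the full `[start, cov_end]` grid with `fill_value=0`) can be illustrated in isolation. This is a rough sketch using only pandas: the helper name `align_to_grid` is hypothetical, the partial indicator frame is constructed by hand, and the coverage validation done by `curate_holidays()` is not reproduced here.

```python
import pandas as pd


def align_to_grid(indicator, start, cov_end, freq="h"):
    """Reindex an indicator frame onto the inclusive [start, cov_end] grid.

    Hypothetical helper for illustration only. Timestamps missing from
    ``indicator`` are filled with 0, so the result contains no NaNs and
    keeps its integer dtype.
    """
    grid = pd.date_range(start=start, end=cov_end, freq=freq)
    return indicator.reindex(grid, fill_value=0)


n_data, forecast_horizon = 48, 24
start = pd.Timestamp("2024-07-06", tz="UTC")
# Inclusive end of the grid: n_data + forecast_horizon hourly steps in total.
cov_end = start + pd.Timedelta(hours=n_data + forecast_horizon - 1)

# Hand-built partial indicator covering only 2024-07-08 (assumed holiday day).
partial = pd.DataFrame(
    {"is_school_holiday": [1] * 24},
    index=pd.date_range("2024-07-08", periods=24, freq="h", tz="UTC"),
)
aligned = align_to_grid(partial, start, cov_end)
print(aligned.shape)
```

Because `cov_end` is inclusive, the grid length equals `n_data + forecast_horizon`, which matches the shape stated in the Returns table.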