sequential-parameter-optimization · bartzbeielstein · Jun 14, 2026 · Jun 14, 2026
@@ -230,6 +230,16 @@ website:
                 file: docs/reference/processing.shape_check.ShapeCheckReport.qmd
               - text: "check_forecast_shape"
                 file: docs/reference/processing.shape_check.check_forecast_shape.qmd
+              - text: "LevelCheckReport"
+                file: docs/reference/processing.shape_check.LevelCheckReport.qmd
+              - text: "check_forecast_level"
+                file: docs/reference/processing.shape_check.check_forecast_level.qmd
+              - text: "apply_level_correction"
+                file: docs/reference/processing.shape_check.apply_level_correction.qmd
+              - text: "blend_with_prior"
+                file: docs/reference/processing.blend.blend_with_prior.qmd
+              - text: "score_forecasts"
+                file: docs/reference/processing.forecast_scoring.score_forecasts.qmd
 
           - section: "Forecaster"
             contents:
@@ -672,6 +682,11 @@ quartodoc:
         - processing.n2n_predict_with_covariates.n2n_predict_with_covariates
         - processing.shape_check.ShapeCheckReport
         - processing.shape_check.check_forecast_shape
+        - processing.shape_check.LevelCheckReport
+        - processing.shape_check.check_forecast_level
+        - processing.shape_check.apply_level_correction
+        - processing.blend.blend_with_prior
+        - processing.forecast_scoring.score_forecasts
 
     # ── Forecaster ────────────────────────────────────────────────────────────
     - title: "Forecaster"

@@ -88,6 +88,11 @@ Utilities for aggregated and n-to-n predictions.
 | [processing.n2n_predict_with_covariates.n2n_predict_with_covariates](processing.n2n_predict_with_covariates.n2n_predict_with_covariates.qmd#spotforecast2_safe.processing.n2n_predict_with_covariates.n2n_predict_with_covariates) | End-to-end recursive forecasting with exogenous covariates. |
 | [processing.shape_check.ShapeCheckReport](processing.shape_check.ShapeCheckReport.qmd#spotforecast2_safe.processing.shape_check.ShapeCheckReport) | Immutable result of a forecast shape plausibility check. |
 | [processing.shape_check.check_forecast_shape](processing.shape_check.check_forecast_shape.qmd#spotforecast2_safe.processing.shape_check.check_forecast_shape) | Measure correlation and daily-range ratio between a forecast and its reference. |
+| [processing.shape_check.LevelCheckReport](processing.shape_check.LevelCheckReport.qmd#spotforecast2_safe.processing.shape_check.LevelCheckReport) | Immutable result of a forecast *level* (systematic-bias) check. |
+| [processing.shape_check.check_forecast_level](processing.shape_check.check_forecast_level.qmd#spotforecast2_safe.processing.shape_check.check_forecast_level) | Measure the systematic level offset between a forecast and its reference. |
+| [processing.shape_check.apply_level_correction](processing.shape_check.apply_level_correction.qmd#spotforecast2_safe.processing.shape_check.apply_level_correction) | Shift a forecast so its central level matches a reference (debias). |
+| [processing.blend.blend_with_prior](processing.blend.blend_with_prior.qmd#spotforecast2_safe.processing.blend.blend_with_prior) | Convex-blend a model forecast with an external prior. |
+| [processing.forecast_scoring.score_forecasts](processing.forecast_scoring.score_forecasts.qmd#spotforecast2_safe.processing.forecast_scoring.score_forecasts) | Score several forecasts against a shared actual and rank them. |
 
 ## Forecaster
 

@@ -0,0 +1,56 @@
+# processing.blend.blend_with_prior { #spotforecast2_safe.processing.blend.blend_with_prior }
+
+```python
+processing.blend.blend_with_prior(model_forecast, prior, *, weight)
+```
+
+Convex-blend a model forecast with an external prior.
+
+Returns ``(1 - weight) * model_forecast + weight * prior`` on the index
+intersection of the two series.  ``weight`` is the trust placed in the
+prior: ``0.0`` returns the model forecast unchanged (prior ignored),
+``1.0`` returns the prior, and intermediate values interpolate.  This is the
+correct lever for down-weighting a near-oracle prior, whose influence on a
+tree model cannot be tuned through feature scaling.
+
+The function is **pure**: it does not mutate its inputs and emits no
+warnings.  The result carries ``model_forecast``'s name.
+
+## Parameters {.doc-section .doc-section-parameters}
+
+| Name           | Type                                     | Description                                                                             | Default    |
+|----------------|------------------------------------------|-----------------------------------------------------------------------------------------|------------|
+| model_forecast | [pd](`pandas`).[Series](`pandas.Series`) | The trained model's forecast.                                                           | _required_ |
+| prior          | [pd](`pandas`).[Series](`pandas.Series`) | The external prior to blend in (e.g. the ENTSO-E day-ahead forecast), aligned by index. | _required_ |
+| weight         | [float](`float`)                         | Blend weight in ``[0.0, 1.0]`` — the trust placed in ``prior``.                         | _required_ |
+
+## Returns {.doc-section .doc-section-returns}
+
+| Name   | Type                                     | Description                                                 |
+|--------|------------------------------------------|-------------------------------------------------------------|
+|        | [pd](`pandas`).[Series](`pandas.Series`) | A new ``pd.Series`` over the index intersection, named like ``model_forecast``. |
+
+## Raises {.doc-section .doc-section-raises}
+
+| Name   | Type                       | Description                                                                           |
+|--------|----------------------------|---------------------------------------------------------------------------------------|
+|        | [TypeError](`TypeError`)   | When ``model_forecast`` or ``prior`` is not a ``pd.Series``.                          |
+|        | [ValueError](`ValueError`) | When ``weight`` is outside ``[0.0, 1.0]`` or the two series share no index positions. |
+
+## Examples {.doc-section .doc-section-examples}
+
+```{python}
+import pandas as pd
+from spotforecast2_safe.processing.blend import blend_with_prior
+
+idx = pd.date_range("2026-06-13 00:00", periods=4, freq="h", tz="UTC")
+model = pd.Series([100.0, 110.0, 120.0, 130.0], index=idx, name="y0")
+prior = pd.Series([140.0, 140.0, 140.0, 140.0], index=idx)
+
+# weight=0 -> model unchanged; weight=1 -> prior; 0.25 -> 75/25 mix.
+print(blend_with_prior(model, prior, weight=0.0).tolist())
+print(blend_with_prior(model, prior, weight=1.0).tolist())
+print(blend_with_prior(model, prior, weight=0.25).tolist())
+assert blend_with_prior(model, prior, weight=0.0).equals(model)
+```
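
Note on `blend_with_prior`: the blend rule documented above can be illustrated without the library. The following is only a minimal sketch, not the package's implementation. It assumes plain dicts (key to value) as stand-ins for `pd.Series`, and `blend_sketch` is a hypothetical name introduced here for illustration.

```python
def blend_sketch(model_forecast, prior, weight):
    """Convex blend on the shared keys; dicts stand in for pd.Series (assumption)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0.0, 1.0]")
    # Keep only positions present in both inputs (the index intersection).
    shared = [k for k in model_forecast if k in prior]
    if not shared:
        raise ValueError("the two inputs share no index positions")
    # (1 - weight) * model + weight * prior, position by position.
    return {
        k: (1.0 - weight) * model_forecast[k] + weight * prior[k]
        for k in shared
    }


model = {0: 100.0, 1: 110.0, 2: 120.0, 3: 130.0}
prior = {1: 140.0, 2: 140.0, 3: 140.0, 4: 140.0}

blended = blend_sketch(model, prior, weight=0.25)
print(blended)  # {1: 117.5, 2: 125.0, 3: 132.5}
```

Keys `0` and `4` drop out because they appear in only one input, which mirrors the "index intersection" behaviour described in the docstring.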
@@ -0,0 +1,63 @@
+# processing.forecast_scoring.score_forecasts { #spotforecast2_safe.processing.forecast_scoring.score_forecasts }
+
+```python
+processing.forecast_scoring.score_forecasts(
+    forecasts,
+    actual,
+    *,
+    metrics=SUPPORTED_METRICS,
+)
+```
+
+Score several forecasts against a shared actual and rank them.
+
+Each forecast is aligned to ``actual`` on the index intersection and scored
+on the requested ``metrics``.  The result is a tidy table indexed by
+approach name, with one column per metric plus an ``n`` column (overlap
+length), sorted ascending by the first requested metric so the best
+approach is the top row.
+
+This is **pure**: no logging, no plotting, no mutation.  Use it to compare,
+for example, a four-zone bottom-up sum against a single combined model
+(compute each approach's forecast first, e.g. via ``backtesting_forecaster``).
+
+## Parameters {.doc-section .doc-section-parameters}
+
+| Name      | Type                                                                                           | Description                                                                                                                                                                                   | Default             |
+|-----------|------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------|
+| forecasts | [Mapping](`collections.abc.Mapping`)\[[str](`str`), [pd](`pandas`).[Series](`pandas.Series`)\] | Mapping of approach name to its forecast series.                                                                                                                                              | _required_          |
+| actual    | [pd](`pandas`).[Series](`pandas.Series`)                                                       | The ground-truth series every forecast is scored against.                                                                                                                                     | _required_          |
+| metrics   | [tuple](`tuple`)\[[str](`str`), ...\]                                                          | Subset of `SUPPORTED_METRICS` to compute, in output order. ``"mae"``, ``"rmse"``, and ``"bias"`` are in the units of the series; ``"mape"`` is a percentage. The ranking uses ``metrics[0]``. | `SUPPORTED_METRICS` |
+
+## Returns {.doc-section .doc-section-returns}
+
+| Name   | Type                                           | Description                                              |
+|--------|------------------------------------------------|----------------------------------------------------------|
+|        | [pd](`pandas`).[DataFrame](`pandas.DataFrame`) | A ``pd.DataFrame`` indexed by approach name with columns ``[*metrics, "n"]``, sorted ascending by ``metrics[0]``. |
+
+## Raises {.doc-section .doc-section-raises}
+
+| Name   | Type                       | Description                                                                                               |
+|--------|----------------------------|-----------------------------------------------------------------------------------------------------------|
+|        | [TypeError](`TypeError`)   | When ``actual`` is not a ``pd.Series`` or a forecast value is not a ``pd.Series``.                        |
+|        | [ValueError](`ValueError`) | When ``actual`` or ``forecasts`` is empty, or ``metrics`` is empty or contains an unsupported name.       |
+
+## Examples {.doc-section .doc-section-examples}
+
+```{python}
+import pandas as pd
+from spotforecast2_safe.processing.forecast_scoring import score_forecasts
+
+idx = pd.date_range("2026-06-13 00:00", periods=24, freq="h", tz="UTC")
+actual = pd.Series([43_858.0] * 24, index=idx)
+
+forecasts = {
+    "combined": actual + 300.0,        # small flat offset
+    "four_zone_sum": actual + 1_780.0,  # flat over-prediction
+}
+table = score_forecasts(forecasts, actual, metrics=("mae", "bias"))
+print(table.round(2).to_string())
+# combined ranks first (lower MAE).
+assert table.index[0] == "combined"
+```
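
Note on `score_forecasts`: the align, score, and rank steps described above can be sketched in plain Python. This is a hedged illustration, not the library's code. It assumes dicts as stand-ins for `pd.Series`, the usual textbook definitions of the four metrics, and a bias sign of forecast minus actual (positive means over-prediction); `score_sketch` is a hypothetical name.

```python
import math


def score_sketch(forecasts, actual, metrics=("mae", "rmse", "bias", "mape")):
    """Return (name, scores, n) rows sorted ascending by metrics[0]."""
    formulas = {
        "mae": lambda err, act: sum(abs(e) for e in err) / len(err),
        "rmse": lambda err, act: math.sqrt(sum(e * e for e in err) / len(err)),
        # Assumed sign convention: forecast minus actual.
        "bias": lambda err, act: sum(err) / len(err),
        "mape": lambda err, act: 100.0
        * sum(abs(e / a) for e, a in zip(err, act))
        / len(err),
    }
    rows = []
    for name, forecast in forecasts.items():
        # Align on the keys present in both the forecast and the actual.
        shared = [k for k in actual if k in forecast]
        if not shared:
            raise ValueError(f"{name!r} shares no positions with actual")
        errors = [forecast[k] - actual[k] for k in shared]
        truth = [actual[k] for k in shared]
        scores = {m: formulas[m](errors, truth) for m in metrics}
        rows.append((name, scores, len(shared)))
    # Best approach first: lowest value of the first requested metric.
    rows.sort(key=lambda row: row[1][metrics[0]])
    return rows


actual = {hour: 43_858.0 for hour in range(24)}
forecasts = {
    "four_zone_sum": {h: v + 1_780.0 for h, v in actual.items()},
    "combined": {h: v + 300.0 for h, v in actual.items()},
}
ranking = score_sketch(forecasts, actual, metrics=("mae", "bias"))
for name, scores, n in ranking:
    print(name, scores, n)
```

Because both offsets are flat and positive, each approach's bias equals its MAE, the same signature of a pure level error that `LevelCheckReport` is meant to capture.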
@@ -0,0 +1,48 @@
+# processing.shape_check.LevelCheckReport { #spotforecast2_safe.processing.shape_check.LevelCheckReport }
+
+```python
+processing.shape_check.LevelCheckReport(
+    n_overlap,
+    statistic,
+    forecast_level,
+    reference_level,
+    offset,
+    rel_offset,
+    tol,
+)
+```
+
+Immutable result of a forecast *level* (systematic-bias) check.
+
+Where `ShapeCheckReport` answers "does the forecast track the daily
+*profile*", this answers "does the forecast sit at the right *level*".  It
+captures a near-constant offset of the whole forecast against a reference —
+the failure mode behind the 2026-06-13 team_4 miss, where the forecast
+over-predicted every hour by a flat ~1.8 GW (``bias == MAE``).
+
+## Attributes {.doc-section .doc-section-attributes}
+
+| Name            | Type             | Description                                                                                                                                                  |
+|-----------------|------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| n_overlap       | [int](`int`)     | Number of aligned (overlapping) index positions used. When below the evaluable minimum, ``skipped`` is ``True`` and the numeric fields are ``float('nan')``. |
+| statistic       | [str](`str`)     | Central-tendency statistic used for both levels — either ``"median"`` (robust, the default) or ``"mean"``.                                                   |
+| forecast_level  | [float](`float`) | ``statistic`` of the forecast over the overlap.                                                                                                              |
+| reference_level | [float](`float`) | ``statistic`` of the reference over the overlap.                                                                                                             |
+| offset          | [float](`float`) | ``forecast_level - reference_level`` (signed; positive means the forecast sits high, i.e. systematic over-prediction).                                       |
+| rel_offset      | [float](`float`) | ``offset / abs(reference_level)`` (signed). ``float('nan')`` when the reference level is zero.                                                               |
+| tol             | [float](`float`) | Relative-offset tolerance for ``biased`` (passed through from `check_forecast_level`).                                                                       |
+
+## Examples {.doc-section .doc-section-examples}
+
+```{python}
+from spotforecast2_safe.processing.shape_check import LevelCheckReport
+
+# Forecast sits 4 % high vs the reference -> biased at tol=0.02.
+r = LevelCheckReport(
+    n_overlap=24, statistic="median",
+    forecast_level=45_600.0, reference_level=43_858.0,
+    offset=1_742.0, rel_offset=0.0397, tol=0.02,
+)
+print("biased:", r.biased, "rel_offset:", round(r.rel_offset, 4))
+assert r.biased and not r.skipped
+```
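
Note on `LevelCheckReport`: the relationships between the documented fields can be sketched with a frozen dataclass. This is an illustration only. The class name `LevelReportSketch` is hypothetical, and the rule `abs(rel_offset) > tol` for `biased` is an assumption: the page says only that `tol` is the relative-offset tolerance for `biased`.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelReportSketch:
    """Hypothetical stand-in that derives offset fields from the two levels."""

    n_overlap: int
    forecast_level: float
    reference_level: float
    tol: float

    @property
    def offset(self) -> float:
        # Signed: positive means the forecast sits high.
        return self.forecast_level - self.reference_level

    @property
    def rel_offset(self) -> float:
        # Undefined (NaN) when the reference level is zero.
        if self.reference_level == 0:
            return math.nan
        return self.offset / abs(self.reference_level)

    @property
    def biased(self) -> bool:
        # Assumed rule: flag when the relative offset exceeds tol in magnitude.
        # A NaN rel_offset compares False, so it is never flagged.
        return abs(self.rel_offset) > self.tol


r = LevelReportSketch(
    n_overlap=24, forecast_level=45_600.0, reference_level=43_858.0, tol=0.02
)
print(r.offset, round(r.rel_offset, 4), r.biased)  # 1742.0 0.0397 True
```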
@@ -0,0 +1,62 @@
+# processing.shape_check.apply_level_correction { #spotforecast2_safe.processing.shape_check.apply_level_correction }
+
+```python
+processing.shape_check.apply_level_correction(
+    y,
+    reference,
+    *,
+    statistic='median',
+    min_overlap=12,
+)
+```
+
+Shift a forecast so its central level matches a reference (debias).
+
+Estimates the constant offset ``statistic(y) - statistic(reference)`` over
+the index overlap and subtracts it from **every** value of ``y``, removing a
+systematic flat bias while preserving the daily shape.  This is the
+post-hoc correction for the failure `check_forecast_level` detects.
+
+The returned series keeps ``y``'s full index, name, and ordering; only the
+level is shifted.  The function is pure (no mutation of the inputs).
+
+## Parameters {.doc-section .doc-section-parameters}
+
+| Name        | Type                                     | Description                                                                                              | Default    |
+|-------------|------------------------------------------|----------------------------------------------------------------------------------------------------------|------------|
+| y           | [pd](`pandas`).[Series](`pandas.Series`) | Forecast series to correct.                                                                              | _required_ |
+| reference   | [pd](`pandas`).[Series](`pandas.Series`) | Reference whose level ``y`` should be aligned to.                                                        | _required_ |
+| statistic   | [str](`str`)                             | ``"median"`` (default) or ``"mean"`` — must match the estimator you would use in `check_forecast_level`. | `'median'` |
+| min_overlap | [int](`int`)                             | Minimum overlap required to estimate the offset.                                                         | `12`       |
+
+## Returns {.doc-section .doc-section-returns}
+
+| Name   | Type                                     | Description                                                             |
+|--------|------------------------------------------|-------------------------------------------------------------------------|
+|        | [pd](`pandas`).[Series](`pandas.Series`) | A new ``pd.Series`` equal to ``y - offset`` (same index/name as ``y``). |
+
+## Raises {.doc-section .doc-section-raises}
+
+| Name   | Type                       | Description                                                                                                                       |
+|--------|----------------------------|-----------------------------------------------------------------------------------------------------------------------------------|
+|        | [TypeError](`TypeError`)   | When ``y`` or ``reference`` is not a ``pd.Series``.                                                                               |
+|        | [ValueError](`ValueError`) | When ``y``/``reference`` is empty, ``statistic`` is invalid, or the overlap is smaller than ``min_overlap`` (no reliable offset). |
+
+## Examples {.doc-section .doc-section-examples}
+
+```{python}
+import pandas as pd
+from spotforecast2_safe.processing.shape_check import (
+    apply_level_correction, check_forecast_level,
+)
+
+idx = pd.date_range("2026-06-13 00:00", periods=24, freq="h", tz="UTC")
+actual = pd.Series([43_000.0 + 3_000.0 * (i % 12) / 12 for i in range(24)],
+                   index=idx)
+biased = actual + 1_800.0  # flat over-prediction
+
+corrected = apply_level_correction(biased, actual)
+print("offset before:", round(check_forecast_level(biased, actual).offset))
+print("offset after :", round(check_forecast_level(corrected, actual).offset))
+assert abs(check_forecast_level(corrected, actual).offset) < 1.0
+```
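
Note on `apply_level_correction`: the offset estimate and shift described above can be sketched with the standard library. This is not the package's implementation. It assumes dicts as stand-ins for `pd.Series`, and `level_correction_sketch` is a hypothetical name. The offset is estimated on the overlap only, then subtracted from every value of `y`, including positions outside the overlap.

```python
from statistics import mean, median


def level_correction_sketch(y, reference, statistic="median", min_overlap=12):
    """Shift y so its central level on the overlap matches the reference."""
    estimators = {"median": median, "mean": mean}
    if statistic not in estimators:
        raise ValueError("statistic must be 'median' or 'mean'")
    shared = [k for k in y if k in reference]
    if len(shared) < min_overlap:
        raise ValueError("overlap too small for a reliable offset")
    estimate = estimators[statistic]
    # Constant offset: statistic(y) - statistic(reference) over the overlap.
    offset = estimate([y[k] for k in shared]) - estimate(
        [reference[k] for k in shared]
    )
    # Subtract from the full series, so the daily shape is preserved.
    return {k: v - offset for k, v in y.items()}


reference = {h: 43_000.0 + 250.0 * (h % 12) for h in range(24)}
# Forecast covers two extra hours and sits a flat 1 800 above the profile.
y = {h: 43_000.0 + 250.0 * (h % 12) + 1_800.0 for h in range(26)}

corrected = level_correction_sketch(y, reference)
print(corrected[0], corrected[25])  # 43000.0 43250.0
```

Hours 24 and 25 are outside the overlap but are still shifted, matching the statement that the returned series keeps the full index of `y`.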