feat(downloader): ENTSO-E gap repair + resume/cooldown fixes (country_code default DE)#382

Merged: bartzbeielstein merged 1 commit into develop from feat/entsoe-gap-repair on Jun 13, 2026
@bartzbeielstein

Collaborator

Reconciles the long-parked fix/entsoe-gap-repair work onto current develop. Because develop had independently reworked download_new_data (the _MAX_BACKFILL_DAYS heal, timeout, collect mode), this is a careful re-apply, not a mechanical rebase.

Motivation: the 2026-06-01/02 incident — ENTSO-E published no data, leaving an interior hole in interim/energy_load.csv the pipeline could neither detect nor heal.

New API (spotforecast2_safe.downloader)

  • repair_data_gaps() — detect interior gaps, heal from already-downloaded raw CSVs first, then targeted downloads for what's still missing; never invents values, raises by default (on_unavailable='use_existing' to proceed on gapped data).
  • find_missing_intervals() — report interior gaps of a DatetimeIndex (mode-based step → hourly and 15-min).
  • download_new_data(..., on_unavailable=...) — keep operating on existing interim data when ENTSO-E stays unreachable after the retry budget.
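
The interior-gap detection described above can be illustrated with a minimal sketch. It is not the package's implementation: the function name `find_missing_intervals_sketch`, its return type, and the exact gap rule are assumptions made for illustration. The only idea taken from the description is that the expected step is the mode of the observed timestamp differences, so the same logic covers hourly and 15-minute data.

```python
import pandas as pd


def find_missing_intervals_sketch(index: pd.DatetimeIndex):
    """Hypothetical helper: return (first_missing, last_missing) pairs.

    Only interior gaps are reported; nothing before the first or after the
    last timestamp is considered missing.
    """
    index = index.sort_values().unique()
    if len(index) < 3:
        return []  # too few points to infer a step reliably

    # Differences between consecutive timestamps (a TimedeltaIndex).
    diffs = index[1:] - index[:-1]
    # Mode-based step: the most frequent difference is the expected resolution.
    step = pd.Series(diffs).mode().iloc[0]

    gaps = []
    for left, delta in zip(index[:-1], diffs):
        if delta > step:
            # First and last timestamps that should exist but do not.
            gaps.append((left + step, left + delta - step))
    return gaps


# Usage: an hourly series with two days removed, as in the incident window.
full = pd.date_range("2026-05-31", "2026-06-03 23:00",
                     freq=pd.Timedelta(hours=1), tz="UTC")
hole_start = pd.Timestamp("2026-06-01", tz="UTC")
hole_end = pd.Timestamp("2026-06-03", tz="UTC")
gapped = full[(full < hole_start) | (full >= hole_end)]
for first, last in find_missing_intervals_sketch(gapped):
    print(f"missing from {first} to {last}")
```

A repair step would then try to fill each reported interval from raw CSVs already on disk before requesting a targeted download, and would leave the interval empty rather than interpolate.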

Fixes in download_new_data() (kept develop's backfill heal + timeout)

  • Resume bug: start=None now reads the interim file via fetch_data(filename=...). The bare fetch_data() raised ValueError unconditionally, so resume silently fell into the 7-day fallback on every incremental run — which also left develop's _MAX_BACKFILL_DAYS heal as dead code. Both work now.
  • Cooldown: keyed on recency of the newest raw entsoe_load_* file, not the width of the requested window (the old check silently skipped sub-24h backfills). A window overlapping a known interim gap bypasses the cooldown.
  • end=None → "now" (not "today 00:00 UTC", which made incremental runs no-ops for most of the day once resume works).
  • end <= start with both explicit → ValueError; a derived empty window is an "already up to date" no-op.
  • Empty/all-NaN ENTSO-E response → WARNING + no raw file (instead of masking the gap as success).
  • Default country_code "FR" → "DE", matching download_renewable_forecast (DE) / download_day_ahead_price (DE_LU) and the package's focus.
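
The window and cooldown rules listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the helper names `resolve_window_sketch` and `in_cooldown_sketch`, their parameters, the use of file modification time as the measure of recency, and the cooldown length being passed in by the caller are all hypothetical. The rules themselves (end=None means "now", explicit end <= start raises, a derived empty window is a no-op, cooldown keyed on the newest raw entsoe_load_* file, bypass when the window overlaps a known gap) come from the description.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone
from pathlib import Path


def resolve_window_sketch(start, end, last_interim, now):
    """Hypothetical helper: return (start, end), or None when up to date."""
    both_explicit = start is not None and end is not None
    if start is None:
        start = last_interim  # resume from the last timestamp in the interim file
    if end is None:
        end = now  # "now", not "today 00:00 UTC"
    if end <= start:
        if both_explicit:
            raise ValueError("end must be later than start")
        return None  # derived empty window: already up to date
    return start, end


def in_cooldown_sketch(raw_dir: Path, now: datetime, cooldown: timedelta,
                       overlaps_known_gap: bool = False) -> bool:
    """Hypothetical helper: True when a download should be skipped."""
    if overlaps_known_gap:
        return False  # a window touching a known interim gap bypasses the cooldown
    raw_files = list(raw_dir.glob("entsoe_load_*"))
    if not raw_files:
        return False  # nothing downloaded yet, so nothing to cool down from
    # Assumption: recency is taken from the newest file's modification time.
    newest = max(f.stat().st_mtime for f in raw_files)
    age = now - datetime.fromtimestamp(newest, tz=timezone.utc)
    return age < cooldown


# Usage: an incremental run with neither bound given.
now = datetime.now(timezone.utc)
print(resolve_window_sketch(None, None, now - timedelta(hours=3), now))
```

Keying the cooldown on the age of the newest raw file means the width of the requested window plays no part in the decision, so a short backfill is no longer skipped merely because it spans less than a day.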

Tests / docs

  • New tests/test_entsoe_gap_repair.py (375 lines); the two width-semantics cooldown tests updated to the recency model.
  • quartodoc reference + sidebar regenerated for the two new public functions.
  • Full suite 2665 passed, 1 skipped; ruff clean.

🤖 Generated with Claude Code

… country_code DE

Reconciles the long-lived fix/entsoe-gap-repair work onto current develop
(develop had independently reworked download_new_data, so this is a re-apply,
not a mechanical rebase).

Field incident: ENTSO-E published no data for 2026-06-01/02, leaving an interior
hole in interim/energy_load.csv that the pipeline could neither detect nor heal.

New API (spotforecast2_safe.downloader):
- repair_data_gaps(): detect interior gaps, heal from already-downloaded raw
  CSVs first, then issue targeted downloads for intervals still missing; never
  invents values, raises by default (on_unavailable='use_existing' opts into
  proceeding with gapped data).
- find_missing_intervals(): report interior gaps of a DatetimeIndex (mode-based
  step, so hourly and 15-min data both work).
- download_new_data(..., on_unavailable=...): opt-in to keep operating on the
  existing interim data when ENTSO-E stays unreachable after the retry budget.

Fixes in download_new_data() (kept develop's _MAX_BACKFILL_DAYS heal + timeout):
- Resume (start=None) now reads the interim file via fetch_data(filename=...).
  The bare fetch_data() call raised ValueError ("filename must be specified")
  unconditionally, so resume silently fell into the 7-day fallback on every
  incremental run -- which also made develop's backfill heal dead code. Now both
  work.
- Cooldown is keyed on the recency of the newest raw entsoe_load_* file, not the
  WIDTH of the requested window (the old check silently skipped sub-24h
  backfills). A window overlapping a known interim gap bypasses the cooldown.
- end=None now means "now" instead of "today 00:00 UTC" (the latter made
  incremental runs no-ops for most of the day once resume works).
- end <= start with both bounds explicit raises ValueError; a derived empty
  window is an "already up to date" no-op.
- An empty/all-NaN ENTSO-E response is logged at WARNING and writes no raw file
  instead of masking the gap as a successful download.
- default country_code "FR" -> "DE", matching download_renewable_forecast (DE)
  / download_day_ahead_price (DE_LU) and the package's German-energy focus.

Tests: new tests/test_entsoe_gap_repair.py (375 lines); the two width-semantics
cooldown tests updated to the recency model. quartodoc reference + sidebar
regenerated for the two new public functions. Full suite 2665 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@bartzbeielstein bartzbeielstein merged commit a2fd985 into develop Jun 13, 2026
10 checks passed
@bartzbeielstein bartzbeielstein deleted the feat/entsoe-gap-repair branch June 13, 2026 22:43