Skip to content

LSB-dev/PADELF-PIP

Repository files navigation

A simple logo

padelf

Easy Pandas DataFrame-Access to publicly available electric load forecasting datasets

PyPI version Python License: MIT

padelf provides a minimal Python API to download, cache, and standardize electric load forecasting datasets for research. Every dataset is returned as a pandas DataFrame with a UTC DateTimeIndex and a standardized consumption_kW column.

Installation

pip install padelf

Quick Start

import padelf

# See what's available
padelf.list_datasets()
# ['AEMO', 'ELD', 'ENTSO-E', 'GEFCOM12', 'IHPC', 'ISO-NE', 'NYISO', 'OPSD', 'Pecan-Street', 'RTE-France', 'VEA']

# Load a dataset — one line, sensible defaults
df = padelf.get_dataset("OPSD")
print(df.head())

Output:

                                                     consumption_kW  DE_solar_generation_actual  DE_wind_onshore_generation_actual
datetime
2015-01-01 00:00:00+00:00       41209.0                        NaN                            7568.0
2015-01-01 01:00:00+00:00       40029.0                        NaN                            7666.0
2015-01-01 02:00:00+00:00       38891.0                        NaN                            7637.0

What You Get

Every call to get_dataset() returns a DataFrame with:

  • DateTimeIndex — UTC timezone, equidistant at the dataset's native resolution
  • consumption_kW — Load/consumption column, unit-converted to kilowatts
  • Additional columns — As available in the original dataset (e.g., temperature, solar generation)

Optional Parameters

df = padelf.get_dataset(
        "OPSD",
        resolution="15min",       # Resample to 15-minute intervals
        consumption_unit="MW",    # Keep original MW units
        interpolate_limit="4h",   # Fill gaps up to 4 hours
        cache_dir="/tmp/padelf",  # Custom cache location
)

Available Datasets

Dataset Abbreviation Resolution Region Status
Open Power System Data OPSD 60 min Europe ✅ Ready
Individual Household Power Consumption IHPC 1 min France ✅ Ready
ElectricityLoadDiagrams20112014 ELD 15 min Portugal ✅ Ready
5359 industrial VEA load profiles VEA 15 min Germany ✅ Ready
GEFCom 2012 GEFCOM12 60 min US ⏩ Planned
ENTSO-E Transparency ENTSO-E 60 min Europe ⏩ Planned
ISO New England ISO-NE 60 min US ⏩ Planned
NYISO NYISO 5 min US ⏩ Planned
AEMO AEMO 60 min Australia ⏩ Planned
RTE France RTE-France 30 min France ⏩ Planned
Pecan Street Pecan Street 15 min US ⏩ Planned

a logo banner

Repository Structure

The project uses a src layout with per-dataset YAML configs:

.
├── README.md
├── pyproject.toml                # Build config (hatchling backend)
├── mkdocs.yml                    # Documentation site config
├── LICENSE
├── src/padelf/
│   ├── __init__.py               # Public API: list_datasets(), get_dataset()
│   ├── loader.py                 # Core loader logic: download, cache, parse, standardize
│   ├── utils.py                  # Unit conversion, gap interpolation, resampling
│   └── configs/
│       ├── _template.yaml        # Template for new loader configs
│       ├── OPSD.yaml             # Ready
│       ├── IHPC.yaml             # Ready
│       ├── ELD.yaml              # Ready
│       ├── VEA.yaml              # Ready
│       ├── GEFCOM12.yaml         # Ready (source URL intermittent)
│       ├── ENTSO-E.yaml          # API placeholder
│       ├── ISO-NE.yaml           # API placeholder
│       ├── NYISO.yaml            # API placeholder
│       ├── AEMO.yaml             # API placeholder
│       ├── RTE-France.yaml       # API placeholder
│       └── Pecan-Street.yaml     # API placeholder
├── docs/                         # mkdocs source files
│   ├── index.md
│   ├── getting-started.md
│   ├── api.md
│   └── datasets.md
└── tests/
    ├── test_loader.py
    ├── test_utils.py
    └── test_smoke.py

How It Works

The loader architecture follows a per-dataset config pattern. Each YAML file in src/padelf/configs/ defines a dataset's download URL, file format, column mappings, unit, and preprocessing parameters. When get_dataset() is called, loader.py reads the corresponding config, downloads the file (or uses a local cache), parses it, and applies standardization via utils.py: the load column is renamed to consumption_kW with automatic unit conversion (MW, kWh, MWh to kW), the index is converted to an equidistant UTC DateTimeIndex, gaps up to 2 hours are interpolated by default, and optional resampling is applied if requested. Datasets flagged with requires_api: true in their config raise NotImplementedError with a descriptive message -- these are placeholders for future implementation.

Adding a New Loader

See the Loader Developer Guide for details on the loader architecture and how to add new datasets.

API Placeholder Pattern

Six datasets (ENTSO-E, ISO-NE, NYISO, AEMO, RTE-France, Pecan-Street) are currently configured as API placeholders. Their YAML configs exist with requires_api: true, and calling get_dataset() on them raises NotImplementedError. To convert a placeholder into a working loader, remove the requires_api flag and either provide a direct download URL or implement API-specific download logic in loader.py. Note that ENTSO-E and ISO-NE have direct CSV downloads available and could be implemented as file-based loaders without API integration.

Original Catalog

A simle logo

To explore more datasets, check out the original PADELF Repository.

Citation

If this work has helped you with your scientific work, we would appreciate a proper mention. ❤️

@inproceedings{baur2024datasets,
    title     = {Publicly Available Datasets For Electric Load Forecasting -- An Overview},
    author    = {Baur, Lukas and Chandramouli, Vignesh and Sauer, Alexander},
    booktitle = {6th Conference on Production Systems and Logistics (CPSL 2024)},
    year      = {2024},
    doi       = {10.15488/17659}
}

Changelog

See the Changelog File

License

MIT

About

This PyPi (pip) package gives you access to the most well-known time series datasets for electric load forecasting (and related tasks)

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages