gridfm · albanpuech · Apr 22, 2026 · Mar 6, 2026
diff --git a/README.md b/README.md
@@ -60,9 +60,9 @@ gridfm_graphkit <command> [OPTIONS]
 Available commands:
 
 * `train` - Train a new model from scratch
-* `finetune` – Fine-tune an existing pre-trained model
-* `evaluate` – Evaluate model performance on a dataset
-* `predict` – Run inference and save predictions
+* `finetune` - Fine-tune an existing pre-trained model
+* `evaluate` - Evaluate model performance on a dataset
+* `predict` - Run inference and save predictions
 
 ---
 
@@ -74,13 +74,22 @@ gridfm_graphkit train --config path/to/config.yaml
 
 ### Arguments
 
-| Argument         | Type   | Description                                                      | Default |
-| ---------------- | ------ | ---------------------------------------------------------------- | ------- |
-| `--config`       | `str`  | **Required**. Path to the training configuration YAML file.      | `None`       |
-| `--exp_name`     | `str`  | MLflow experiment name.                                           | `timestamp`  |
-| `--run_name`     | `str`  | MLflow run name.                                                  | `run`        |
-| `--log_dir`      | `str`  | MLflow tracking/logging directory.                                | `mlruns`     |
-| `--data_path`    | `str`  | Root dataset directory.                                           | `data`       |
+| Argument | Type | Description | Default |
+| -------- | ---- | ----------- | ------- |
+| `--config` | `str` | **Required**. Path to the training configuration YAML file. | `None` |
+| `--exp_name` | `str` | MLflow experiment name. | `timestamp` |
+| `--run_name` | `str` | MLflow run name. | `run` |
+| `--log_dir` | `str` | MLflow tracking/logging directory. | `mlruns` |
+| `--data_path` | `str` | Root dataset directory. | `data` |
+| `--compile [MODE]` | `str` | Enable `torch.compile` with the given mode. Valid values: `default`, `reduce-overhead`, `max-autotune`, `max-autotune-no-cudagraphs`. If the flag is passed without a value, the mode is `default`. | `None` |
+| `--bfloat16` | `flag` | Cast model to `torch.bfloat16` (`model.to(torch.bfloat16)`). | `False` |
+| `--tf32` | `flag` | Enable TF32 on Ampere+ GPUs via `torch.set_float32_matmul_precision("high")`. | `False` |
+| `--dataset_wrapper` | `str` | Registered dataset wrapper name (see `DATASET_WRAPPER_REGISTRY`), e.g. `SharedMemoryCacheDataset`. | `None` |
+| `--plugins` | `list[str]` | Python packages to import for plugin registration, e.g. `gridfm_graphkit_ee`. | `[]` |
+| `--num_workers` | `int` | Override `data.workers` from YAML. Use `0` to debug worker crashes. | `None` |
+| `--dataset_wrapper_cache_dir` | `str` | Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population. | `None` |
+| `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` |
+| `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split. | `False` |
 
 ### Examples
 
@@ -100,14 +109,23 @@ gridfm_graphkit finetune --config path/to/config.yaml --model_path path/to/model
 
 ### Arguments
 
-| Argument       | Type  | Description                                     | Default   |
-| -------------- | ----- | ----------------------------------------------- | --------- |
-| `--config`     | `str` | **Required**. Fine-tuning configuration file.   | `None`    |
-| `--model_path` | `str` | **Required**. Path to a pre-trained model state dict. | `None`    |
-| `--exp_name`   | `str` | MLflow experiment name.                         | timestamp |
-| `--run_name`   | `str` | MLflow run name.                                | `run`     |
-| `--log_dir`    | `str` | MLflow logging directory.                       | `mlruns`  |
-| `--data_path`  | `str` | Root dataset directory.                         | `data`    |
+| Argument | Type | Description | Default |
+| -------- | ---- | ----------- | ------- |
+| `--config` | `str` | **Required**. Fine-tuning configuration file. | `None` |
+| `--model_path` | `str` | **Required**. Path to a pre-trained model state dict. | `None` |
+| `--exp_name` | `str` | MLflow experiment name. | `timestamp` |
+| `--run_name` | `str` | MLflow run name. | `run` |
+| `--log_dir` | `str` | MLflow logging directory. | `mlruns` |
+| `--data_path` | `str` | Root dataset directory. | `data` |
+| `--compile [MODE]` | `str` | Enable `torch.compile` with the given mode. Valid values: `default`, `reduce-overhead`, `max-autotune`, `max-autotune-no-cudagraphs`. If the flag is passed without a value, the mode is `default`. | `None` |
+| `--bfloat16` | `flag` | Cast model to `torch.bfloat16` (`model.to(torch.bfloat16)`). | `False` |
+| `--tf32` | `flag` | Enable TF32 on Ampere+ GPUs via `torch.set_float32_matmul_precision("high")`. | `False` |
+| `--dataset_wrapper` | `str` | Registered dataset wrapper name (see `DATASET_WRAPPER_REGISTRY`), e.g. `SharedMemoryCacheDataset`. | `None` |
+| `--plugins` | `list[str]` | Python packages to import for plugin registration, e.g. `gridfm_graphkit_ee`. | `[]` |
+| `--num_workers` | `int` | Override `data.workers` from YAML. Use `0` to debug worker crashes. | `None` |
+| `--dataset_wrapper_cache_dir` | `str` | Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population. | `None` |
+| `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` |
+| `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split. | `False` |
 
 
 ---
@@ -120,17 +138,25 @@ gridfm_graphkit evaluate --config path/to/eval.yaml --model_path path/to/model.p
 
 ### Arguments
 
-| Argument              | Type  | Description                                                                                                   | Default   |
-| --------------------- | ----- | ------------------------------------------------------------------------------------------------------------- | --------- |
-| `--config`            | `str` | **Required**. Path to evaluation config.                                                                      | `None`    |
-| `--model_path`        | `str` | Path to the trained model state dict.                                                                         | `None`    |
-| `--normalizer_stats`  | `str` | Path to `normalizer_stats.pt` from a training run. Restores `fit_on_train` normalizers from saved statistics instead of re-fitting on the current data split. | `None`    |
-| `--exp_name`          | `str` | MLflow experiment name.                                                                                       | timestamp |
-| `--run_name`          | `str` | MLflow run name.                                                                                              | `run`     |
-| `--log_dir`           | `str` | MLflow logging directory.                                                                                     | `mlruns`  |
-| `--data_path`         | `str` | Dataset directory.                                                                                            | `data`    |
-| `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split.                                      | `False`   |
-| `--save_output`       | `flag` | Save predictions as `<grid_name>_predictions.parquet` under MLflow artifacts (`.../artifacts/test`).       | `False`   |
+| Argument | Type | Description | Default |
+| -------- | ---- | ----------- | ------- |
+| `--config` | `str` | **Required**. Path to evaluation config. | `None` |
+| `--model_path` | `str` | Path to the trained model state dict. | `None` |
+| `--normalizer_stats` | `str` | Path to `normalizer_stats.pt` from a training run. Restores `fit_on_train` normalizers from saved statistics instead of re-fitting on the current data split. | `None` |
+| `--exp_name` | `str` | MLflow experiment name. | `timestamp` |
+| `--run_name` | `str` | MLflow run name. | `run` |
+| `--log_dir` | `str` | MLflow logging directory. | `mlruns` |
+| `--data_path` | `str` | Dataset directory. | `data` |
+| `--compile [MODE]` | `str` | Enable `torch.compile` with the given mode. Valid values: `default`, `reduce-overhead`, `max-autotune`, `max-autotune-no-cudagraphs`. If the flag is passed without a value, the mode is `default`. | `None` |
+| `--bfloat16` | `flag` | Cast model to `torch.bfloat16` (`model.to(torch.bfloat16)`). | `False` |
+| `--tf32` | `flag` | Enable TF32 on Ampere+ GPUs via `torch.set_float32_matmul_precision("high")`. | `False` |
+| `--dataset_wrapper` | `str` | Registered dataset wrapper name (see `DATASET_WRAPPER_REGISTRY`), e.g. `SharedMemoryCacheDataset`. | `None` |
+| `--plugins` | `list[str]` | Python packages to import for plugin registration, e.g. `gridfm_graphkit_ee`. | `[]` |
+| `--num_workers` | `int` | Override `data.workers` from YAML. Use `0` to debug worker crashes. | `None` |
+| `--dataset_wrapper_cache_dir` | `str` | Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population. | `None` |
+| `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` |
+| `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split. | `False` |
+| `--save_output` | `flag` | Save predictions as `<grid_name>_predictions.parquet` under MLflow artifacts (`.../artifacts/test`). | `False` |
 
 ### Example with saved normalizer stats
 
@@ -156,16 +182,44 @@ gridfm_graphkit predict --config path/to/config.yaml --model_path path/to/model.
 
 ### Arguments
 
-| Argument              | Type  | Description                                                                                                   | Default   |
-| --------------------- | ----- | ------------------------------------------------------------------------------------------------------------- | --------- |
-| `--config`            | `str` | **Required**. Path to prediction config file.                                                                 | `None`    |
-| `--model_path`        | `str` | Path to the trained model state dict.                                                                         | `None`    |
-| `--normalizer_stats`  | `str` | Path to `normalizer_stats.pt` from a training run. Restores `fit_on_train` normalizers from saved statistics. | `None`    |
-| `--exp_name`          | `str` | MLflow experiment name.                                                                                       | timestamp |
-| `--run_name`          | `str` | MLflow run name.                                                                                              | `run`     |
-| `--log_dir`           | `str` | MLflow logging directory.                                                                                     | `mlruns`  |
-| `--data_path`         | `str` | Dataset directory.                                                                                            | `data`    |
-| `--output_path`       | `str` | Directory where predictions are saved as `<grid_name>_predictions.parquet`.                                  | `data`    |
+| Argument | Type | Description | Default |
+| -------- | ---- | ----------- | ------- |
+| `--config` | `str` | **Required**. Path to prediction config file. | `None` |
+| `--model_path` | `str` | Path to the trained model state dict. Optional; may be defined in the config instead. | `None` |
+| `--normalizer_stats` | `str` | Path to `normalizer_stats.pt` from a training run. Restores `fit_on_train` normalizers from saved statistics. | `None` |
+| `--exp_name` | `str` | MLflow experiment name. | `timestamp` |
+| `--run_name` | `str` | MLflow run name. | `run` |
+| `--log_dir` | `str` | MLflow logging directory. | `mlruns` |
+| `--data_path` | `str` | Dataset directory. | `data` |
+| `--dataset_wrapper` | `str` | Registered dataset wrapper name (see `DATASET_WRAPPER_REGISTRY`), e.g. `SharedMemoryCacheDataset`. | `None` |
+| `--plugins` | `list[str]` | Python packages to import for plugin registration, e.g. `gridfm_graphkit_ee`. | `[]` |
+| `--num_workers` | `int` | Override `data.workers` from YAML. Use `0` to debug worker crashes. | `None` |
+| `--dataset_wrapper_cache_dir` | `str` | Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population. | `None` |
+| `--output_path` | `str` | Directory where predictions are saved as `<grid_name>_predictions.parquet`. | `data` |
+| `--compile [MODE]` | `str` | Enable `torch.compile` with the given mode. Valid values: `default`, `reduce-overhead`, `max-autotune`, `max-autotune-no-cudagraphs`. If the flag is passed without a value, the mode is `default`. | `None` |
+| `--bfloat16` | `flag` | Cast model to `torch.bfloat16` (`model.to(torch.bfloat16)`). | `False` |
+| `--tf32` | `flag` | Enable TF32 on Ampere+ GPUs via `torch.set_float32_matmul_precision("high")`. | `False` |
+| `--profiler` | `str` | Enable Lightning profiler (`simple`, `advanced`, `pytorch`). | `None` |
+
+---
+
+## Benchmarking Dataloader Throughput
+
+```bash
+gridfm_graphkit benchmark --config path/to/config.yaml
+```
+
+### Arguments
+
+| Argument | Type | Description | Default |
+| -------- | ---- | ----------- | ------- |
+| `--config` | `str` | **Required**. Path to configuration YAML file. | `None` |
+| `--data_path` | `str` | Root dataset directory. | `data` |
+| `--epochs` | `int` | Number of epochs to iterate through the train dataloader. | `3` |
+| `--dataset_wrapper` | `str` | Registered dataset wrapper name (see `DATASET_WRAPPER_REGISTRY`), e.g. `SharedMemoryCacheDataset`. | `None` |
+| `--dataset_wrapper_cache_dir` | `str` | Directory for dataset wrapper disk cache. | `None` |
+| `--num_workers` | `int` | Override `data.workers` from YAML. | `None` |
+| `--plugins` | `list[str]` | Python packages to import for plugin registration. | `[]` |
 
 Use built-in help for full command details:
 

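Note on the `--compile [MODE]` and `--num_workers` rows in the README tables above: the documented behavior (no flag gives `None`, a bare flag gives `default`, an explicit value must be one of the listed modes, and `--num_workers 0` must still override the YAML value) can be expressed with standard `argparse` options. The sketch below is only an illustration of those semantics, not the project's actual parser; `build_parser`, `resolve_workers`, and the `prog` name are hypothetical.

```python
import argparse

# Modes listed in the README table for --compile.
COMPILE_MODES = [
    "default",
    "reduce-overhead",
    "max-autotune",
    "max-autotune-no-cudagraphs",
]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags discussed here."""
    parser = argparse.ArgumentParser(prog="gridfm_graphkit-sketch")
    parser.add_argument("--config", type=str, required=True)
    parser.add_argument(
        "--compile",
        nargs="?",          # the value is optional
        const="default",    # used when the flag is given without a value
        default=None,       # used when the flag is absent
        choices=COMPILE_MODES,
        metavar="MODE",
    )
    parser.add_argument("--num_workers", type=int, default=None)
    return parser


def resolve_workers(args: argparse.Namespace, config: dict) -> int:
    """Return the worker count, letting --num_workers override data.workers.

    The check is `is not None` rather than truthiness so that an explicit
    `--num_workers 0` is honored.
    """
    if args.num_workers is not None:
        return args.num_workers
    return config["data"]["workers"]


if __name__ == "__main__":
    args = build_parser().parse_args(["--config", "config.yaml", "--compile"])
    print(args.compile)
```

With `nargs="?"`, `argparse` stops consuming at the next option string, so `--compile --config config.yaml` still resolves to the `default` mode.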
diff --git a/docs/datasets/data_modules.md b/docs/datasets/data_modules.md
diff --git a/docs/datasets/data_normalization.md b/docs/datasets/data_normalization.md
diff --git a/docs/datasets/powergrid.md b/docs/datasets/powergrid.md
diff --git a/docs/datasets/transforms.md b/docs/datasets/transforms.md
diff --git a/docs/install/installation.md b/docs/install/installation.md
@@ -1,46 +1,32 @@
 # Installation
 
-You can install `gridfm-graphkit` directly from PyPI:
+The steps below mirror the [README](https://github.com/gridfm/gridfm-graphkit/blob/main/README.md#installation). Run them from the root of a local clone or source checkout of the repository.
 
-```bash
-pip install gridfm-graphkit
-```
-
-For GPU support and compatibility with PyTorch Geometric's scatter operations, install PyTorch (and optionally CUDA) first, then install the matching `torch-scatter` wheel. See [PyTorch and torch-scatter](#pytorch-and-torch-scatter-optional) below.
-
----
-
-## Development Setup
-
-To contribute or develop locally, clone the repository and install in editable mode. Use Python 3.10, 3.11, or 3.12 (3.12 is recommended).
+Create and activate a virtual environment. Use Python 3.10, 3.11, or 3.12 (3.12 is recommended).
 
 ```bash
-git clone git@github.com:gridfm/gridfm-graphkit.git
-cd gridfm-graphkit
 python -m venv venv
 source venv/bin/activate
-pip install -e .
 ```
 
-### PyTorch and torch-scatter (optional)
+Install gridfm-graphkit in editable mode
 
-If you need GPU acceleration or PyTorch Geometric scatter ops (used by the library), install PyTorch and the matching `torch-scatter` wheel:
+```bash
+pip install -e .
+```
 
-1. Install PyTorch (see [pytorch.org](https://pytorch.org/) for your platform and CUDA version).
+Get the PyTorch + CUDA version string for torch-scatter
 
-2. Get your Torch + CUDA version string:
-   ```bash
-   TORCH_CUDA_VERSION=$(python -c "import torch; print(torch.__version__ + ('+cpu' if torch.version.cuda is None else ''))")
-   ```
+```bash
+TORCH_CUDA_VERSION=$(python -c "import torch; print(torch.__version__ + ('+cpu' if torch.version.cuda is None else ''))")
+```
 
-3. Install the correct `torch-scatter` wheel:
-   ```bash
-   pip install torch-scatter -f https://data.pyg.org/whl/torch-${TORCH_CUDA_VERSION}.html
-   ```
+Install the correct torch-scatter wheel
 
----
+```bash
+pip install torch-scatter -f https://data.pyg.org/whl/torch-${TORCH_CUDA_VERSION}.html
+```
 
-## Optional extras
 
 For documentation generation and unit testing, install with the optional `dev` and `test` extras:
 

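Note on the `TORCH_CUDA_VERSION` step in the installation hunk above: the shell one-liner appends `+cpu` to `torch.__version__` when `torch.version.cuda` is `None`, and the result is substituted into the `data.pyg.org` wheel index URL. The sketch below restates that logic as plain functions so it can be checked without PyTorch installed. The function names are hypothetical, and the guard that skips the suffix when the version already carries a local tag (for example `2.3.0+cpu`) is an addition that is not in the documented one-liner.

```python
from typing import Optional

# URL pattern taken from the pip command in the installation docs.
PYG_WHEEL_INDEX = "https://data.pyg.org/whl/torch-{version}.html"


def torch_cuda_version(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the version string used to select a torch-scatter wheel index.

    torch_version corresponds to torch.__version__ and cuda_version to
    torch.version.cuda (None on CPU-only builds).
    """
    # Assumption beyond the docs: do not append "+cpu" twice if the
    # version string already has a local suffix such as "+cpu".
    if cuda_version is None and "+" not in torch_version:
        return torch_version + "+cpu"
    return torch_version


def scatter_wheel_index(torch_version: str, cuda_version: Optional[str]) -> str:
    """Return the wheel index URL to pass to `pip install torch-scatter -f`."""
    return PYG_WHEEL_INDEX.format(
        version=torch_cuda_version(torch_version, cuda_version)
    )


if __name__ == "__main__":
    # Example inputs only; with PyTorch installed, pass
    # torch.__version__ and torch.version.cuda instead.
    print(scatter_wheel_index("2.3.0", None))
```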
diff --git a/docs/models/models.md b/docs/models/models.md