Add PPO RL controller, scenario library, and evaluation pipeline by ZhiruiLiang · Pull Request #12 · gpu2grid/openg2g

ZhiruiLiang · 2026-05-04T04:29:15Z

Library code (openg2g/)

New module openg2g/controller/ppo.py — PPOBatchSizeController (single-site) and SharedPPOBatchSizeController (multi-site) that wrap a trained stable-baselines3 PPO policy as a Controller. Sits alongside the existing RuleBasedBatchSizeController and OFOBatchSizeController.
New subpackage openg2g/rl/ with env.py — a Gymnasium environment exposing the simulator as an RL training target. Provides structured observations (per-zone / per-bus / system-summary modes), composable rewards (voltage / throughput / latency / switching), and scenario sampling from a pre-built library. This is what train_ppo.py learns against.
Modified openg2g/controller/rule_based.py — tightened the default deadband for finer voltage tracking and added a zone_buses argument for zone-local observation (used in multi-DC ieee123 to give each site credit only for its own zone).
Modified openg2g/grid/opendss.py — single tiny change: downgrade a multi-bank RegControl message from info to debug to avoid log spam on ieee34.
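
To illustrate what "composable rewards" could look like in practice, here is a minimal sketch of a weighted sum over independent reward terms. It is not the code in openg2g/rl/env.py: `StepInfo`, the four term functions, `compose`, the voltage band, and the latency deadline are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


# Hypothetical snapshot of one simulator tick; field names are illustrative only.
@dataclass(frozen=True)
class StepInfo:
    voltages_pu: Sequence[float]  # per-bus voltage magnitudes in per-unit
    throughput: float             # normalized throughput in [0, 1]
    latency_s: float              # observed latency in seconds
    batch_changed: bool           # True if the batch size changed this tick


RewardTerm = Callable[[StepInfo], float]


def voltage_term(info: StepInfo, v_min: float = 0.95, v_max: float = 1.05) -> float:
    """Negative sum of band violations; 0.0 when every bus is inside the band."""
    return -sum(max(v_min - v, 0.0) + max(v - v_max, 0.0) for v in info.voltages_pu)


def throughput_term(info: StepInfo) -> float:
    """Reward throughput directly."""
    return info.throughput


def latency_term(info: StepInfo, deadline_s: float = 1.0) -> float:
    """Penalize only the latency in excess of an assumed deadline."""
    return -max(info.latency_s - deadline_s, 0.0)


def switching_term(info: StepInfo) -> float:
    """Fixed penalty whenever the batch size is changed."""
    return -1.0 if info.batch_changed else 0.0


def compose(weights: Dict[str, float], terms: Dict[str, RewardTerm]) -> RewardTerm:
    """Build a reward function as a weighted sum of the named terms."""
    def reward(info: StepInfo) -> float:
        return sum(w * terms[name](info) for name, w in weights.items())
    return reward


TERMS: Dict[str, RewardTerm] = {
    "voltage": voltage_term,
    "throughput": throughput_term,
    "latency": latency_term,
    "switching": switching_term,
}
```

With this shape, dropping a term is a matter of omitting its weight, e.g. `compose({"voltage": 100.0, "throughput": 1.0}, TERMS)` for a reward that ignores latency and switching.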

Examples (examples/offline/)

New examples/offline/train_ppo.py: PPO training entrypoint. Wraps BatchSizeEnv in VecNormalize, runs stable-baselines3 PPO, and saves the model, VecNormalize stats, per-checkpoint snapshots, and TensorBoard logs.
New examples/offline/build_scenario_library.py — generates randomized PV / TVL / inference-ramp scenarios, screens them by running baseline + OFO and rejecting cases with no learning signal, writes a library.pkl for the trainer.
New examples/offline/evaluate_controllers.py — held-out scenario eval that runs baseline / OFO / rule-based / PPO on the same scenarios and produces side-by-side voltage and throughput metrics (CSV + plots).
Modified examples/offline/systems.py — adds the PPO-side infrastructure layered on top of master's feeder constants: DCSite dataclass that bundles deployments with ReplicaSchedules, hardcoded model spec list (ALL_MODEL_SPECS), randomize_scenario / materialize_scenario helpers, ScenarioOpenDSSGrid for randomized PV/TVL, and with_ramp convenience for experiment factories.
Modified examples/offline/sweep_dc_locations.py — extends the existing 1-D and 2-D bus sweeps with a zone-constrained 3-phase sweep for ieee123 (Phase 1 screening per zone, Phase 2 combination, optional Phase 3 refinement). Also migrated to master's new grid.attach_dc(...) and Coordinator(datacenters=[...]) APIs.
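
The screening step in build_scenario_library.py can be pictured roughly as below. This is a sketch under stated assumptions, not the script's actual logic: the PR does not define "no learning signal", so the example assumes a scenario is kept only when the baseline shows a voltage violation and OFO reduces it by a measurable margin. `Scenario`, `sample_scenario`, `screen`, and both thresholds are hypothetical.

```python
import random
from typing import Callable, Dict, List

# Hypothetical scenario record; the real library stores richer PV / TVL / ramp data.
Scenario = Dict[str, float]


def sample_scenario(rng: random.Random) -> Scenario:
    """Draw one randomized scenario (illustrative parameters only)."""
    return {"pv_scale": rng.uniform(0.0, 2.0), "load_scale": rng.uniform(0.5, 1.5)}


def screen(
    scenarios: List[Scenario],
    run_baseline: Callable[[Scenario], float],
    run_ofo: Callable[[Scenario], float],
    min_violation: float = 1e-3,
    min_gain: float = 1e-3,
) -> List[Scenario]:
    """Keep scenarios where control can matter.

    Both runners return a scalar violation metric (lower is better).
    Assumption: "no learning signal" means either the baseline is already
    clean, or OFO cannot improve on the baseline.
    """
    kept = []
    for s in scenarios:
        base = run_baseline(s)
        if base < min_violation:
            continue  # nothing to fix, so the policy would learn nothing
        if base - run_ofo(s) < min_gain:
            continue  # a reference controller cannot help either
        kept.append(s)
    return kept
```

In a real run the two callables would wrap full simulations; here they are left as parameters so the filter itself stays easy to test with toy metrics.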

Documentation (docs/)

New docs/examples/rl-controller.md — end-to-end walkthrough of the 3-stage PPO workflow: build scenario library → train PPO → evaluate.
Modified docs/examples/voltage-regulation-strategies.md — adds PPO as a fourth control strategy alongside baseline / rule-based / OFO, with a cross-link to the new RL doc.
Modified _zensical.toml — nav entry for the new RL example doc.

gemini-code-assist

Code Review

This pull request introduces a comprehensive Reinforcement Learning (RL) framework for voltage regulation using Proximal Policy Optimization (PPO), including training environments, controllers, and evaluation scripts. Key additions include a Gymnasium-compatible environment, PPO-based controllers, and utilities for generating scenario libraries and benchmarking against model-based and rule-based strategies. Feedback focuses on significant code duplication between the new scripts and core modules, particularly regarding profile generation and grid definitions. Reviewers also highlighted design concerns such as fragile file-searching logic for normalization stats, cross-script dependencies, and a functional limitation in the shared multi-site controller regarding command routing. Refactoring is recommended to centralize shared logic and improve the modularity of the scenario handling and simulation setup.

jaywonchung · 2026-05-11T02:31:57Z

@gemini review

gemini-code-assist

Code Review

This pull request introduces a comprehensive reinforcement learning (RL) workflow for voltage regulation using Proximal Policy Optimization (PPO). Key additions include a Gymnasium environment, training and evaluation scripts, scenario library generation, and detailed documentation. To support this, the Coordinator was refactored to allow per-tick stepping and external command dispatch, and the rule-based controller was enhanced with zone-local observation capabilities. Feedback is provided regarding the .gitignore configuration, specifically suggesting the removal of leading slashes from Slurm and launch script patterns to ensure they are ignored across all subdirectories.
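
For readers unfamiliar with the pattern, the following toy class sketches what "per-tick stepping and external command dispatch" might mean for an RL environment driving a simulator. It does not reproduce the real Coordinator API: `ToyCoordinator`, its `step` and `run` methods, and the site name `dc0` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyCoordinator:
    """Hypothetical stand-in for a simulation coordinator (not the openg2g class)."""

    batch_sizes: Dict[str, int]
    tick: int = 0
    history: List[Dict[str, int]] = field(default_factory=list)

    def step(self, commands: Optional[Dict[str, int]] = None) -> Dict[str, int]:
        """Advance exactly one tick, applying externally supplied commands first."""
        if commands:
            for site, batch_size in commands.items():
                if site not in self.batch_sizes:
                    raise KeyError(f"unknown site: {site}")
                self.batch_sizes[site] = batch_size
        self.tick += 1
        snapshot = dict(self.batch_sizes)
        self.history.append(snapshot)
        return snapshot

    def run(self, n_ticks: int) -> None:
        """Closed-loop convenience wrapper: step repeatedly with no external commands."""
        for _ in range(n_ticks):
            self.step()
```

An environment's `step(action)` can then translate the action into a command dict, call `coordinator.step(commands)` once, and build its observation from the returned state, while batch runs keep using `run`.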

Zhirui Liang added 4 commits May 3, 2026 23:50

Add PPO RL controller, scenario library, and evaluation pipeline

3f0c9e6

[Clean] Untrack repo-root paper_plots.mplstyle and plots.py

29c14e1

[Clean] Untrack verify_post_merge_migration.sh

fa9efb9

[Docs] update rl-controller.md

a9a36bb

gemini-code-assist Bot reviewed May 4, 2026

Zhirui Liang and others added 8 commits May 4, 2026 00:47

[CI] Apply ruff format + fix lint errors

27bfb3b

[CI] Fix ty type-check errors

7235765

[Refactor] sweep_dc_locations.py: deduplicate against systems.py + utils.py

ca43a56

[Revert] sweep_dc_locations.py: restore master's version

6237be5

For ieee13, generate additional overvoltage cases for train/test

1540fea

unify output path; revise document; add tensorboard to the rl optional-dependency extra

b9f40b9

Refactors

847c111

Merge branch 'master' into feat/ppo-controller

20ccd55

Fix fmt

ac88b16

gemini-code-assist Bot reviewed May 11, 2026

Comment thread .gitignore Outdated

Remove personal stuff in .gitignore

06c3600

jaywonchung merged commit 2008715 into master May 11, 2026
5 checks passed

jaywonchung deleted the feat/ppo-controller branch May 11, 2026 02:43

jaywonchung merged 14 commits into master from feat/ppo-controller

2 participants
