commonpower.control.runners

Runners to manage training/deployment in systems with both RL and non-RL controllers.

Classes

BaseRunner

Base class for any runner for power system control with one or multiple agents.

BaseTrainer

Base class for any runner used for training one or multiple reinforcement learning (RL) agents.

DeploymentRunner

Runner for the deployment of multiple heterogeneous controllers (RL, optimal control).

MAPPOTrainer

Runner for training multiple heterogeneous agents with MAPPO/IPPO from the on-policy repository (https://github.com/marlbenchmark/on-policy/tree/main/onpolicy). Based on our BaseTrainer and our logging framework as well as the BaseRunner from the on-policy repository.

:param sys: power system to be controlled
:type sys: System
:param global_controller: instance of controller taking over control of all nodes that have not yet been assigned a controller. Mostly used to balance the system using a market node or a generator. Defaults to OptimalController("global").
:type global_controller: OptimalController
:param alg_config: configuration for the RL algorithm and policy to be trained
:type alg_config: MAPPOBaseConfig
:param wrapper: wrapper for the environment that handles the RL agents during training (used for example for single-agent RL control)
:type wrapper: gym.Wrapper
:param logger: object for handling training logs
:type logger: BaseLogger
:param horizon: amount of time that the controller looks into the future
:type horizon: timedelta
:param episode_length: number of time steps to simulate before the system is reset during RL training if continuous_control=False
:type episode_length: int
:param dt: control time interval
:type dt: timedelta
:param continuous_control: whether to use an infinite control horizon
:type continuous_control: bool
:param history: logging
:type history: ModelHistory
:param solver: solver for optimization problem
:type solver: OptSolver
:param save_path: local path to folder in which the trained policy will be stored (as .zip file) after the training is finished
:type save_path: str
:param seed: seed for the global random number generator of numpy (we use np.random.seed(seed) instead of instantiating our own generator)
:type seed: int
:param normalize_actions: whether or not to normalize the action space
:type normalize_actions: bool
:param limited_date_range: limits the system's date range such that we only train over a specific interval
:type limited_date_range: list
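
The seed parameter above is described as seeding NumPy's global random number generator via np.random.seed(seed) rather than a separately instantiated generator. The following is a minimal sketch of that distinction using plain NumPy only. The helper names sample_with_global_seed and sample_with_own_generator are hypothetical and are not part of commonpower; how the trainer actually consumes the seed internally is not shown here.

```python
import numpy as np


def sample_with_global_seed(seed: int, n: int = 3) -> np.ndarray:
    """Hypothetical helper: seed NumPy's global (legacy) RNG, then draw n samples."""
    # This affects every later call to np.random.* anywhere in the process.
    np.random.seed(seed)
    return np.random.rand(n)


def sample_with_own_generator(seed: int, n: int = 3) -> np.ndarray:
    """Hypothetical helper: the alternative approach, an independent Generator."""
    # The Generator keeps its own state and leaves the global RNG untouched.
    rng = np.random.default_rng(seed)
    return rng.random(n)


if __name__ == "__main__":
    # Re-seeding the global RNG with the same value reproduces the same draws.
    first = sample_with_global_seed(42)
    second = sample_with_global_seed(42)
    print(np.array_equal(first, second))
```

One consequence of the global approach is that any other code calling np.random functions in the same process shares, and advances, the same random state, so reproducibility depends on the order of those calls.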

SingleAgentTrainer

Runner for training a single RL agent (with algorithms from the Stable-Baselines3 repository).