commonpower.control.runners

Runners to manage training/deployment in systems with both RL and non-RL controllers.

Classes

`BaseRunner`	Base class for any runner for power system control with one or multiple agents.
`BaseTrainer`	Base class for any runner used for training one or multiple reinforcement learning (RL) agents.
`DeploymentRunner`	Runner for the deployment of multiple heterogeneous controllers (RL, optimal control).
`MAPPOTrainer`	Runner for training multiple heterogeneous agents with MAPPO/IPPO from the on-policy repository (https://github.com/marlbenchmark/on-policy/tree/main/onpolicy). Based on our BaseTrainer and our logging framework as well as the BaseRunner from the on-policy repository.
	Parameters:
	sys (System): power system to be controlled.
	global_controller (OptimalController): instance of controller taking over control of all nodes that have not yet been assigned a controller. Mostly used to balance the system using a market node or a generator. Defaults to OptimalController("global").
	alg_config (MAPPOBaseConfig): configuration for the RL algorithm and policy to be trained.
	wrapper (gym.Wrapper): wrapper for the environment that handles the RL agents during training (used, for example, for single-agent RL control).
	logger (BaseLogger): object for handling training logs.
	horizon (timedelta): amount of time that the controller looks into the future.
	episode_length (int): number of time steps to simulate before the system is reset during RL training if continuous_control=False.
	dt (timedelta): control time interval.
	continuous_control (bool): whether to use an infinite control horizon.
	history (ModelHistory): logging.
	solver (OptSolver): solver for the optimization problem.
	save_path (str): local path to the folder in which the trained policy will be stored (as a .zip file) after training is finished.
	seed (int): seed for the global random number generator of numpy (we use np.random.seed(seed) instead of instantiating our own generator).
	normalize_actions (bool): whether or not to normalize the action space.
	limited_date_range (list): limits the system's date range such that we only train over a specific interval.
`SingleAgentTrainer`	Runner for training a single RL agent (with algorithms from the Stable-Baselines3 repository).
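
The time-related parameters listed for `MAPPOTrainer` (horizon, dt, episode_length) mix two units: horizon and dt are timedelta values, while episode_length counts time steps. The sketch below uses only the Python standard library to show how these quantities relate arithmetically. The concrete values are illustrative and are not library defaults, and whether the runners derive the number of look-ahead steps in exactly this way is an assumption.

```python
from datetime import timedelta

# Illustrative values only; they are NOT defaults taken from the library.
horizon = timedelta(hours=24)    # how far the controller looks into the future
dt = timedelta(minutes=60)       # control time interval
episode_length = 48              # time steps simulated before a reset

# Number of control intervals that fit into one look-ahead horizon
# (assumed relation; timedelta // timedelta yields an int).
horizon_steps = horizon // dt

# Simulated time covered by one training episode when
# continuous_control=False (an int times a timedelta is a timedelta).
episode_duration = episode_length * dt

print(horizon_steps)     # 24
print(episode_duration)  # 2 days, 0:00:00
```

With these values, one episode spans two simulated days, and each control decision looks 24 steps ahead.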