commonpower.control.runners.BaseTrainer

class BaseTrainer(sys: commonpower.core.System, global_controller: commonpower.control.controllers.OptimalController = <commonpower.control.controllers.OptimalController object>, wrapper: gymnasium.core.Wrapper | None = None, horizon: datetime.timedelta = datetime.timedelta(days=1), episode_length: int = 24, dt: datetime.timedelta = datetime.timedelta(seconds=3600), continuous_control: bool = False, history: commonpower.modeling.history.ModelHistory | None = None, solver: pyomo.opt.base.solvers.OptSolver = <pyomo.solvers.plugins.solvers.gurobi_direct.GurobiDirect object>, save_path: str = './saved_models/test_model', seed: int | None = None, normalize_actions: bool = True, limited_date_range: typing.List[datetime.datetime] | None = None)

Bases: BaseRunner

Base class for any runner used for training one or multiple reinforcement learning (RL) agents.

Parameters:
  • sys (System) – power system to be controlled

  • global_controller (OptimalController) – controller instance that takes over all nodes that have not yet been assigned a controller. Mostly used to balance the system using a market node or a generator. Defaults to OptimalController("global").

  • wrapper (gym.Wrapper) – wrapper for the environment that handles the RL agents during training (used for example for single-agent RL control).

  • horizon (timedelta) – amount of time that the controller looks into the future

  • episode_length (int) – number of time steps to simulate before the system is reset during RL training; only applies if continuous_control=False

  • dt (timedelta) – control time interval

  • continuous_control (bool) – whether to use an infinite control horizon

  • history (ModelHistory) – logger

  • solver (OptSolver) – solver for optimization problem

  • save_path (str) – local path to the folder in which the trained policy is stored (as a .zip file) once training has finished

  • seed (int) – seed for NumPy's global random number generator (we use np.random.seed(seed) instead of instantiating our own generator)

  • normalize_actions (bool) – whether or not to normalize the action space

  • limited_date_range (list) – limits the system’s date range such that we only train over a specific interval
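
The default values of horizon, dt, and episode_length line up with each other, which a little timedelta arithmetic makes visible. This is a standalone sketch using only the standard library and the defaults copied from the signature above. It does not call commonpower, and the variable names steps_per_horizon and episode_duration are illustrative, not part of the library.

```python
from datetime import timedelta

# Defaults copied from the BaseTrainer signature.
horizon = timedelta(days=1)
dt = timedelta(seconds=3600)
episode_length = 24

# Dividing one timedelta by another with // yields an int:
# the number of control steps that fit into one horizon.
steps_per_horizon = horizon // dt

# Simulated time covered by one training episode.
episode_duration = episode_length * dt

print(steps_per_horizon)  # 24
print(episode_duration)   # 1 day, 0:00:00
```

With the defaults, one episode therefore spans exactly one horizon of simulated time. If dt is changed, episode_length may need adjusting to keep the same simulated duration.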

Returns:

BaseTrainer
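
The seed parameter above is documented as seeding NumPy's global random number generator rather than a dedicated one. The sketch below uses plain NumPy (no commonpower code) to show what that distinction means in practice. The helper name draw_with_global_seed is hypothetical and only serves the illustration.

```python
import numpy as np


def draw_with_global_seed(seed: int) -> np.ndarray:
    """Reseed NumPy's global generator, then draw three samples from it."""
    np.random.seed(seed)
    return np.random.rand(3)


# Reseeding the global generator makes every np.random.* call reproducible,
# including calls made by other code running in the same process.
first = draw_with_global_seed(42)
second = draw_with_global_seed(42)
assert np.array_equal(first, second)

# A dedicated generator, by contrast, keeps its own state and leaves
# the global generator untouched.
np.random.seed(1)
expected = np.random.rand()

np.random.seed(1)
local_rng = np.random.default_rng(5)
local_rng.random(10)  # advances only local_rng
actual = np.random.rand()
assert expected == actual
```

A consequence of the global approach is that any other code calling np.random functions in the same process shares the seeded state.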

Methods

finish_run

Terminates run.

prepare_run

In addition to the preparation in BaseRunner, we also instantiate an environment function as an API for the RL training.

run

Simulates the scenario for a given number of time steps.

set_start_time

Sets the start time externally.

system_feasible

Check whether the current system set-up is feasible.

prepare_run()

In addition to the preparation in BaseRunner, we also instantiate an environment function as an API for the RL training.

Returns: None