commonpower.control.logging_utils.callbacks.MARLBaseCallback

class MARLBaseCallback(verbose: int = 0)

Bases: object

Base class for a multi-agent callback. Adapted from the stable-baselines3 BaseCallback: https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/callbacks.py

Parameters:

verbose (int) – Verbosity level: 0 for no output, 1 for info messages, 2 for debug messages

Methods

init_callback

Initialize the callback by saving references to the RL runner and the training environment for convenience.

on_rollout_end

Any operations the callback has to perform at the end of one training episode

on_rollout_start

on_step

This method will be called by the runner after each call to env.step().

on_training_end

Any operations the callback has to perform after the training is finished

on_training_start

Any operations the callback has to perform before the training starts

update_child_locals

Update the references to the local variables on sub callbacks.

update_locals

Update the references to the local variables.

update_num_timesteps

Attributes

runner

logger

_on_rollout_end() → None

At the end of one training episode, we want to log some information about the safety shield.

Returns:

None

_on_step() → bool

Internal operation that should be performed in each step

Returns:

(bool) – If the callback returns False, training is aborted early.
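The return-value contract above can be illustrated with a minimal sketch. It does not import CommonPower: `StandInCallback` and `StopAfterNSteps` are hypothetical names that mirror only the documented hook names (`on_step`, `_on_step`). The sketch assumes, following the stable-baselines3 pattern, that the public `on_step` records the step count and then delegates to `_on_step`; the actual MARLBaseCallback may differ.

```python
class StandInCallback:
    """Hypothetical stand-in that mirrors the documented hook names only."""

    def __init__(self, verbose: int = 0) -> None:
        self.verbose = verbose
        self.num_timesteps = 0

    def _on_step(self) -> bool:
        # Default behaviour: keep training.
        return True

    def on_step(self, num_timesteps: int) -> bool:
        # Assumption: the public hook stores progress, then delegates.
        self.num_timesteps = num_timesteps
        return self._on_step()


class StopAfterNSteps(StandInCallback):
    """Requests an early stop once a step budget is reached."""

    def __init__(self, max_timesteps: int, verbose: int = 0) -> None:
        super().__init__(verbose)
        self.max_timesteps = max_timesteps

    def _on_step(self) -> bool:
        # Returning False signals that training should be aborted early.
        return self.num_timesteps < self.max_timesteps


callback = StopAfterNSteps(max_timesteps=3)
results = [callback.on_step(t) for t in range(1, 5)]
print(results)
```

Only `_on_step` is overridden here, so the bookkeeping in the public hook stays in one place.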

init_callback(runner: runners.BaseRunner) → None

Initialize the callback by saving references to the RL runner and the training environment for convenience.

on_rollout_end() → None

Any operations the callback has to perform at the end of one training episode

Returns:

None

on_step(num_timesteps: int) → bool

This method will be called by the runner after each call to env.step().

Parameters:

num_timesteps (int) – Number of environments * number of steps per env

Returns:

(bool) – If the callback returns False, training is aborted early.
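How a runner might consume this return value can be sketched as follows. The function `run_steps` is hypothetical and is not part of CommonPower; it only assumes what the documentation states, namely that `on_step` is called after each `env.step()` with `num_timesteps` equal to the number of environments times the number of steps per environment, and that a `False` result ends training early.

```python
from typing import Callable


def run_steps(on_step: Callable[[int], bool], n_envs: int, max_steps: int) -> int:
    """Hypothetical runner loop; returns the timestep count at which it stopped."""
    num_timesteps = 0
    for step in range(1, max_steps + 1):
        # ... env.step() would be called here in a real runner ...
        num_timesteps = n_envs * step
        if not on_step(num_timesteps):
            break  # the callback requested an early abort
    return num_timesteps


# A callback that allows training only while fewer than 12 timesteps have elapsed.
stopped_at = run_steps(lambda t: t < 12, n_envs=4, max_steps=10)
# A callback that never aborts lets the loop run to completion.
finished_at = run_steps(lambda t: True, n_envs=4, max_steps=10)
print(stopped_at, finished_at)
```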

on_training_end() → None

Any operations the callback has to perform after the training is finished

Returns:

None

on_training_start(locals_: Dict[str, Any], globals_: Dict[str, Any], num_timesteps: int = 0) → None

Any operations the callback has to perform before the training starts

Parameters:
  • locals_ (Dict[str, Any]) – local variables

  • globals_ (Dict[str, Any]) – global variables

  • num_timesteps (int) – current training progress

Returns:

None

update_child_locals(locals_: Dict[str, Any]) → None

Update the references to the local variables on sub callbacks.

Parameters:

locals_ (Dict[str, Any]) – the local variables during rollout collection

update_locals(locals_: Dict[str, Any]) → None

Update the references to the local variables.

Parameters:

locals_ (Dict[str, Any]) – the local variables during rollout collection
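The relationship between `update_locals` and `update_child_locals` can be sketched under the assumption that it follows the stable-baselines3 pattern this class is adapted from: the base method merges the new variables into a `locals` dictionary and then forwards them to any sub callbacks. The classes `StandInCallback` and `StandInCallbackList` and the `locals` attribute are hypothetical stand-ins, not CommonPower code.

```python
from typing import Any, Dict, List


class StandInCallback:
    """Hypothetical stand-in; mirrors the documented method names only."""

    def __init__(self) -> None:
        self.locals: Dict[str, Any] = {}

    def update_locals(self, locals_: Dict[str, Any]) -> None:
        # Assumption: merge the new references, then forward to sub callbacks.
        self.locals.update(locals_)
        self.update_child_locals(locals_)

    def update_child_locals(self, locals_: Dict[str, Any]) -> None:
        # A plain callback has no sub callbacks, so there is nothing to do.
        pass


class StandInCallbackList(StandInCallback):
    """Hypothetical container that forwards locals to its children."""

    def __init__(self, children: List[StandInCallback]) -> None:
        super().__init__()
        self.children = children

    def update_child_locals(self, locals_: Dict[str, Any]) -> None:
        for child in self.children:
            child.update_locals(locals_)


child = StandInCallback()
parent = StandInCallbackList([child])
parent.update_locals({"rewards": [0.5, 1.0], "step": 7})
print(parent.locals, child.locals)
```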

update_num_timesteps(num_timesteps: int = 0) → None

Parameters:

num_timesteps (int) – number of environments * number of time steps (training progress)

Returns:

None