commonpower.control.wrappers.SingleAgentWrapper

class SingleAgentWrapper(env)

Bases: Wrapper

Wrapper that adapts a ControlEnv to the single-agent API, so that it can be trained with any RL algorithm from the Stable-Baselines3 library.

Parameters: env (ControlEnv) – power system environment with multi-agent API
Returns: SingleAgentWrapper
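
The adapter idea behind this wrapper can be illustrated with a small, self-contained sketch. It does not use commonpower: `ToyMultiAgentEnv` and `ToySingleAgentAdapter` are hypothetical stand-ins, and the choices made here (splitting the flat action evenly between agents, summing per-agent rewards, returning `(obs, info)` from `reset` as in the Gymnasium convention) are assumptions for illustration, not the behaviour of `SingleAgentWrapper` itself.

```python
import numpy as np


class ToyMultiAgentEnv:
    """Hypothetical multi-agent environment with a dict-based API (not ControlEnv)."""

    agents = ("a0", "a1")

    def reset(self):
        self.t = 0
        obs = {aid: {"x": np.zeros(2)} for aid in self.agents}
        return obs, {}

    def step(self, actions: dict):
        self.t += 1
        # The toy observation simply echoes each agent's last action.
        obs = {aid: {"x": np.asarray(actions[aid], dtype=float)} for aid in self.agents}
        # Each agent is penalised by the squared norm of its action.
        rewards = {aid: -float(np.sum(np.square(actions[aid]))) for aid in self.agents}
        terminated = False
        truncated = self.t >= 3  # fixed horizon of three steps
        return obs, rewards, terminated, truncated, {}


class ToySingleAgentAdapter:
    """Hypothetical adapter exposing a flat array-in, array-out API."""

    def __init__(self, env):
        self.env = env

    def _flatten(self, obs: dict) -> np.ndarray:
        # Concatenate every agent's entries into one 1-D array.
        return np.concatenate(
            [np.ravel(v) for agent_obs in obs.values() for v in agent_obs.values()]
        )

    def reset(self):
        obs, info = self.env.reset()
        return self._flatten(obs), info

    def step(self, action: np.ndarray):
        # Assumption: the flat action is split evenly between the agents.
        chunks = np.split(np.asarray(action, dtype=float), len(self.env.agents))
        actions = dict(zip(self.env.agents, chunks))
        obs, rewards, terminated, truncated, info = self.env.step(actions)
        # Assumption: the single-agent reward is the sum of per-agent rewards.
        return self._flatten(obs), float(sum(rewards.values())), terminated, truncated, info


env = ToySingleAgentAdapter(ToyMultiAgentEnv())
obs, info = env.reset()
steps, total_reward, done = 0, 0.0, False
while not done:
    action = np.full(4, 0.5)  # placeholder for a policy output
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    steps += 1
    done = terminated or truncated
```

The loop unpacks the five-element tuple that matches the `step` signature documented on this page; a Stable-Baselines3 algorithm would normally drive this loop itself.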

Methods

`class_name`	Returns the class name of the wrapper.
`close`	Closes the wrapper and `env`.
`get_wrapper_attr`	Gets an attribute from the wrapper and lower environments if name doesn't exist in this object.
`render`	Uses the `render()` of the `env` that can be overwritten to change the returned data.
`reset`	Resets the environment.
`step`	Step function with the single-agent API (takes a NumPy array action and returns a NumPy array observation).
`wrapper_spec`	Generates a WrapperSpec for the wrappers.

Attributes

`action_space`	Returns the `Env` `action_space` unless it has been overwritten, in which case the wrapper's `action_space` is used.
`metadata`	Returns the `Env` `metadata`.
`np_random`	Returns the `Env` `np_random` attribute.
`observation_space`	Returns the `Env` `observation_space` unless it has been overwritten, in which case the wrapper's `observation_space` is used.
`render_mode`	Returns the `Env` `render_mode`.
`reward_range`	Returns the `Env` `reward_range` unless it has been overwritten, in which case the wrapper's `reward_range` is used.
`spec`	Returns the `Env` `spec` attribute with the WrapperSpec if the wrapper inherits from EzPickle.
`unwrapped`	Returns the base environment of the wrapper.

_unpack_obs(obs: dict) → ndarray

Converts a dictionary of {agent_id: observation_dict} into a flattened observation array.

Parameters: obs (dict) – observation dictionary {agent_id: observation_dict}
Returns: np.ndarray – flat array of observations
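
As a rough illustration of this kind of flattening, the sketch below concatenates nested observations into a single 1-D array. The function `unpack_obs_sketch`, the inner dictionary layout (entry name mapped to array-like values) and the insertion-order traversal are assumptions made for the example. The actual ordering and dtype used by `_unpack_obs` are not specified on this page.

```python
import numpy as np


def unpack_obs_sketch(obs: dict) -> np.ndarray:
    """Flatten {agent_id: {name: values}} into one 1-D array (illustrative only)."""
    parts = []
    for agent_obs in obs.values():          # agents in insertion order
        for values in agent_obs.values():   # that agent's entries in insertion order
            parts.append(np.asarray(values, dtype=np.float64).ravel())
    if not parts:
        # np.concatenate rejects an empty list, so handle that case explicitly.
        return np.empty(0, dtype=np.float64)
    return np.concatenate(parts)


# Hypothetical observation with two agents and differently sized entries.
example_obs = {
    "agent0": {"soc": [0.5], "price": [0.2, 0.3]},
    "agent1": {"load": [1.0]},
}
flat = unpack_obs_sketch(example_obs)
```

A fixed traversal order matters here: the RL policy sees only positions in the flat array, so each position should correspond to the same quantity at every step.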

reset(*, seed=None, options=None)

Resets the environment.

Parameters:

Returns:

None

step(action: ndarray) → Tuple[ndarray, float, bool, bool, dict]

Step function with the single-agent API: takes a NumPy array action and returns a NumPy array observation.

Parameters:

action (np.ndarray) – action selected by the RL policy

Returns:

Tuple –

tuple containing: