commonpower.control.wrappers.SingleAgentWrapper
- class SingleAgentWrapper(env)
Bases: Wrapper
Wrapper that adapts a ControlEnv to the single-agent API, so that it can be trained with any RL algorithm from the Stable-Baselines3 repository.
- Parameters:
env (ControlEnv) – power system environment with multi-agent API
- Returns:
SingleAgentWrapper
Methods
class_name – Returns the class name of the wrapper.
close – Closes the wrapper and env.
get_wrapper_attr – Gets an attribute from the wrapper and lower environments if name doesn't exist in this object.
render – Uses the render() of the env that can be overwritten to change the returned data.
reset – Reset the environment.
step – Step function with the single-agent API (takes numpy array action and outputs numpy array observation).
wrapper_spec – Generates a WrapperSpec for the wrappers.
Attributes
action_space – Returns the Env action_space unless it is overwritten, in which case the wrapper action_space is used.
metadata – Returns the Env metadata.
np_random – Returns the Env np_random attribute.
observation_space – Returns the Env observation_space unless it is overwritten, in which case the wrapper observation_space is used.
render_mode – Returns the Env render_mode.
reward_range – Returns the Env reward_range unless it is overwritten, in which case the wrapper reward_range is used.
spec – Returns the Env spec attribute with the WrapperSpec if the wrapper inherits from EzPickle.
unwrapped – Returns the base environment of the wrapper.
- _unpack_obs(obs: dict) → ndarray
Convert dictionary of {agent_id: observation_dict} to flattened observation array.
- Parameters:
obs (dict) – observation dictionary {agent_id: observation_dict}
- Returns:
np.ndarray – flat array of observations
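The flattening described above can be sketched as follows. This is only an illustration, not the library's implementation: the function name flatten_obs, the example agent and observation names, the float32 dtype, and the concatenation order (agents first, then each agent's observation entries, both in insertion order) are all assumptions.

```python
import numpy as np


def flatten_obs(obs: dict) -> np.ndarray:
    """Hypothetical stand-in for _unpack_obs.

    Flattens {agent_id: {obs_name: value}} into one 1-D array.
    Ordering and dtype are assumptions made for this sketch.
    """
    parts = []
    for agent_id in obs:  # agents in dict insertion order
        for value in obs[agent_id].values():  # each agent's observation entries
            parts.append(np.asarray(value, dtype=np.float32).ravel())
    if not parts:
        # No agents or no observations: return an empty vector
        return np.zeros(0, dtype=np.float32)
    return np.concatenate(parts)


# Made-up observation dictionary with two agents
example = {
    "agent0": {"soc": np.array([0.5]), "price": np.array([0.1, 0.2])},
    "agent1": {"load": np.array([1.0, 2.0])},
}
flat = flatten_obs(example)
```

Whatever ordering is used, it has to stay fixed across calls, since the RL policy relies on each position of the flat array always carrying the same quantity.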
- reset(*, seed=None, options=None)
Reset the environment
- Parameters:
seed – seed for the random number generator
options – not used by this wrapper
- Returns:
None
- step(action: ndarray) → Tuple[ndarray, float, bool, bool, dict]
Step function with the single-agent API (takes numpy array action and outputs numpy array observation)
- Parameters:
action (np.ndarray) – action selected by the RL policy
- Returns:
Tuple – tuple containing:
single-agent observation (np.ndarray)
single-agent reward (float)
whether the environment is terminated (bool)
whether the environment is truncated; in our case, this is the same as terminated (bool)
additional information (dict)
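A rollout loop that consumes this five-element return value might look like the sketch below. StubSingleAgentEnv is a hypothetical stand-in that only mimics the documented return shape (including truncated being equal to terminated); it is not the real wrapper, and its observation size, reward, and horizon are made up for illustration.

```python
import numpy as np


class StubSingleAgentEnv:
    """Hypothetical stand-in mimicking the documented step() return tuple."""

    def __init__(self, horizon: int = 3):
        self.horizon = horizon
        self.t = 0

    def step(self, action: np.ndarray):
        self.t += 1
        obs = np.zeros(4, dtype=np.float32)   # flat observation array (size is made up)
        reward = -float(np.sum(action ** 2))  # made-up scalar reward
        terminated = self.t >= self.horizon
        truncated = terminated                # documented: same as terminated
        return obs, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Step the environment until it reports terminated or truncated."""
    total, steps, done = 0.0, 0, False
    while not done:
        action = policy()
        obs, reward, terminated, truncated, info = env.step(action)
        total += reward
        steps += 1
        done = terminated or truncated
    return total, steps


# Constant policy, used only to exercise the loop
total_reward, n_steps = run_episode(
    StubSingleAgentEnv(horizon=3), lambda: np.array([1.0, 2.0])
)
```

Because truncated mirrors terminated here, checking either flag would end the loop at the same step; checking both keeps the loop valid for environments that distinguish the two.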