Mujoco Environment
MujocoEnv Interface
Documentation
- class omnisafe.envs.mujoco_env.MujocoEnv(env_id, num_envs=1, device='cpu', **kwargs)
Gymnasium Mujoco environment.
- Variables:
need_auto_reset_wrapper (bool) – Whether to use auto reset wrapper.
need_time_limit_wrapper (bool) – Whether to use time limit wrapper.
Initialize the environment.
- Parameters:
env_id (str) – Environment id.
num_envs (int, optional) – Number of environments. Defaults to 1.
device (torch.device, optional) – Device to store the data. Defaults to ‘cpu’.
- Keyword Arguments:
render_mode (str, optional) – The render mode, one of human, rgb_array, or rgb_array_list. Defaults to rgb_array.
camera_name (str, optional) – The camera name.
camera_id (int, optional) – The camera id.
width (int, optional) – The width of the rendered image. Defaults to 256.
height (int, optional) – The height of the rendered image. Defaults to 256.
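The keyword arguments above can be collected in a plain dictionary before they are passed to the constructor. The following is a minimal sketch and not part of OmniSafe: render_kwargs is a hypothetical helper, and leaving out camera_name and camera_id when they are None is an assumption made so that the environment's own defaults apply.

```python
from typing import Any, Dict, Optional

# Render modes listed in the documentation above.
RENDER_MODES = ('human', 'rgb_array', 'rgb_array_list')


def render_kwargs(
    render_mode: str = 'rgb_array',
    width: int = 256,
    height: int = 256,
    camera_name: Optional[str] = None,
    camera_id: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the documented rendering keyword arguments."""
    if render_mode not in RENDER_MODES:
        raise ValueError(f'render_mode must be one of {RENDER_MODES}, got {render_mode!r}')
    kwargs: Dict[str, Any] = {
        'render_mode': render_mode,
        'width': width,
        'height': height,
    }
    # Assumption: only forward camera settings that were given explicitly.
    if camera_name is not None:
        kwargs['camera_name'] = camera_name
    if camera_id is not None:
        kwargs['camera_id'] = camera_id
    return kwargs
```

With OmniSafe installed, the result would be forwarded as MujocoEnv(env_id, num_envs=1, device='cpu', **render_kwargs()), following the signature shown above.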
- property max_episode_steps: int
The max steps per episode.
- reset(seed=None, options=None)
Reset the environment.
- Parameters:
seed (int, optional) – The random seed. Defaults to None.
options (dict[str, Any], optional) – The options for the environment. Defaults to None.
- Returns:
observation – Agent’s observation of the current environment.
info – Auxiliary diagnostic information (helpful for debugging, and sometimes learning).
- Return type:
tuple[torch.Tensor, dict]
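Because reset returns an (observation, info) pair and accepts a seed, it is easy to check whether a fixed seed reproduces the same starting observation. The sketch below assumes env is any object that follows the reset signature documented above and that the observation is a flat sequence of numbers; resets_match is a hypothetical helper. Whether a particular environment is deterministic under a fixed seed is something to verify this way rather than assume.

```python
from typing import Any


def resets_match(env: Any, seed: int) -> bool:
    """Hypothetical helper: reset twice with one seed and compare observations."""
    obs_a, _info_a = env.reset(seed=seed)
    obs_b, _info_b = env.reset(seed=seed)
    # Convert element by element so the comparison works for 1-D tensors
    # and plain Python sequences alike.
    return [float(x) for x in obs_a] == [float(x) for x in obs_b]
```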
- set_seed(seed)
Set the seed for the environment.
- Parameters:
seed (int) – Seed to set.
- Return type:
None
- step(action)
Step the environment.
Note
OmniSafe uses an auto reset wrapper to reset the environment when an episode ends, so the returned obs will be the first observation of the next episode. The true final observation of the finished episode is stored under the final_observation key of info.
- Parameters:
action (torch.Tensor) – Action to take.
- Returns:
observation – Agent’s observation of the current environment.
reward – Amount of reward returned after previous action.
cost – Amount of cost returned after previous action.
terminated – Whether the episode has ended.
truncated – Whether the episode has been truncated due to a time limit.
info – Auxiliary diagnostic information (helpful for debugging, and sometimes learning).
- Return type:
tuple[Tensor, Tensor, Tensor, Tensor, Tensor, dict[str, Any]]
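The six-element return value and the auto reset behaviour described in the note can be handled together in a rollout loop. This is a minimal sketch under stated assumptions, not OmniSafe's own training loop: rollout and policy are hypothetical names, env is any object following the reset and step signatures documented above, a single environment is used so that reward, cost, and the two flags are scalar-like, and auto reset is already applied to env.

```python
from typing import Any, Callable, Dict, List


def rollout(
    env: Any,
    policy: Callable[[Any], Any],
    num_steps: int,
    seed: int = 0,
) -> Dict[str, List[Any]]:
    """Hypothetical helper: step `env` and collect per-episode statistics."""
    obs, info = env.reset(seed=seed)
    ep_return, ep_cost = 0.0, 0.0
    stats: Dict[str, List[Any]] = {'returns': [], 'costs': [], 'final_observations': []}
    for _ in range(num_steps):
        obs, reward, cost, terminated, truncated, info = env.step(policy(obs))
        ep_return += float(reward)
        ep_cost += float(cost)
        if bool(terminated) or bool(truncated):
            # With auto reset, `obs` already belongs to the next episode;
            # the last observation of the finished episode is kept in `info`.
            stats['final_observations'].append(info.get('final_observation'))
            stats['returns'].append(ep_return)
            stats['costs'].append(ep_cost)
            ep_return, ep_cost = 0.0, 0.0
    return stats
```

Reading info['final_observation'] instead of obs at an episode boundary matters for anything that needs the true last state, such as bootstrapping a value estimate when an episode is truncated by the time limit.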