

class MujocoSimEnv(model_path: str, frame_skip: int = 1, dt: Optional[float] = None, max_steps: int = inf, task_args: Optional[dict] = None)

Bases: SimEnv, ABC, Serializable

Base class for MuJoCo environments. Uses Serializable to facilitate proper serialization.


  • model_path – path to the MuJoCo XML model config file

  • frame_skip – number of simulation frames for which the same action is held; this multiplies the time step size dt

  • dt – by default, the time step size is the one from the MuJoCo config file multiplied by the number of frame skips (legacy from OpenAI environments). Passing an explicit dt value overrides this default, e.g., if you know that a trajectory was recorded with a specific dt.

  • max_steps – max number of simulation time steps

  • task_args – arguments for the task construction, e.g., dict(fwd_rew_weight=1.)
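The interplay of frame_skip and dt described in the parameter list can be sketched as follows. This is a minimal illustration, not Pyrado's implementation: effective_dt and model_timestep are hypothetical names, and the sketch assumes that an explicit dt simply replaces the derived value.

```python
from typing import Optional


def effective_dt(model_timestep: float, frame_skip: int = 1, dt: Optional[float] = None) -> float:
    """Return the control time step (hypothetical helper, not part of Pyrado)."""
    if frame_skip < 1:
        raise ValueError("frame_skip must be a positive integer")
    if dt is not None:
        # Assumption: an explicitly passed dt overrides the value derived from the model file
        return dt
    # Default: the same action is held for frame_skip simulation frames
    return model_timestep * frame_skip


print(effective_dt(0.002, frame_skip=5))
print(effective_dt(0.002, frame_skip=5, dt=0.004))
```

With a model time step of 0.002 s and frame_skip=5, the agent would act roughly every 0.01 s unless dt is given.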

abstract property act_space: Space

Get the space of the actions.


Configure the camera when the viewer is initialized. You need to set self.camera_config before.

property domain_param: dict

Get the environment’s domain parameters. If there are none, this method should return an empty dict. The domain parameters are synonymous with the parameters used by the simulator to run the physics simulation (e.g., masses, extents, or friction coefficients). This must include all parameters that can be randomized, but there might also be additional parameters that depend on the domain parameters.
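To illustrate the difference between randomizable parameters and parameters that depend on them, here is a small sketch. The keys pole_mass, pole_length, and pole_inertia and the helper with_derived_params are made up for this example and are not taken from any Pyrado environment.

```python
def with_derived_params(domain_param: dict) -> dict:
    """Add parameters that depend on the randomizable ones (illustrative only)."""
    full = dict(domain_param)
    # Moment of inertia of a thin rod about one end: m * l^2 / 3
    full["pole_inertia"] = domain_param["pole_mass"] * domain_param["pole_length"] ** 2 / 3.0
    return full


# Hypothetical randomizable parameters
randomizable = {"pole_mass": 3.0, "pole_length": 2.0}
print(with_derived_params(randomizable))
```

Here only the mass and the length would be randomized, while the inertia is recomputed from them whenever they change.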

property init_space: Space

Get the initial state space.

abstract property obs_space: Space

Get the space of the observations (agent’s perception of the environment).

render(mode: RenderMode = RenderMode(text=False, video=False, render=False), render_step: int = 1)

Visualize one time step of the simulation. The base version prints to console when the state exceeds its boundaries.

  • mode – render mode: console, video, or both

  • render_step – interval for rendering

reset(init_state: Optional[ndarray] = None, domain_param: Optional[dict] = None) → ndarray

Reset the environment to its initial state and optionally set different domain parameters.

  • init_state – set explicit initial state if not None. Must match init_space if any.

  • domain_param – set explicit domain parameters if not None

Return obs:

initial observation of the state.
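The handling of init_state can be sketched like this. It is a simplified stand-in that uses plain lists instead of ndarray and a box-shaped initial state space. The helper choose_init_state is hypothetical, and whether Pyrado raises, warns, or clips when the state does not match init_space is not stated here, so raising is an assumption.

```python
import random
from typing import List, Optional, Sequence


def choose_init_state(
    lower: Sequence[float],
    upper: Sequence[float],
    init_state: Optional[Sequence[float]] = None,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Pick the state an episode starts from (hypothetical helper, not Pyrado code)."""
    if init_state is None:
        # No explicit state given: sample uniformly from the box-shaped initial state space
        rng = rng or random.Random()
        return [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    if len(init_state) != len(lower):
        raise ValueError("init_state has the wrong dimension")
    if not all(lo <= s <= hi for s, lo, hi in zip(init_state, lower, upper)):
        # Assumption: a state outside of the initial state space is rejected
        raise ValueError("init_state lies outside of the initial state space")
    return list(init_state)


print(choose_init_state([-1.0, -0.1], [1.0, 0.1], init_state=[0.5, 0.0]))
```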

abstract property state_space: Space

Get the space of the states (used for describing the environment).

step(act: ndarray) → tuple

Perform one time step of the simulation or on the real-world device. When a terminal condition is met, the reset function is called.


This function is responsible for limiting the actions, i.e. has to call limit_act().


act – action to be taken in the step

Return obs:

current observation of the environment

Return reward:

reward depending on the selected reward function

Return done:

indicates whether the episode has ended

Return env_info:

contains diagnostic information about the environment
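The step contract described above (limit the action, advance the simulation, return obs, reward, done, and env_info) can be illustrated with a toy environment. ToyEnv, its reward, and its scalar state are invented for this sketch. It does not use MuJoCo, and it leaves out the reset that happens when a terminal condition is met.

```python
from typing import Tuple


class ToyEnv:
    """Hypothetical 1-D point environment, used only to illustrate the step contract."""

    def __init__(self, act_bound: float = 1.0, max_steps: int = 3):
        self.act_bound = act_bound
        self.max_steps = max_steps
        self.state = 0.0
        self.curr_step = 0

    def limit_act(self, act: float) -> float:
        # Clip the action to the bounds of the action space
        return max(-self.act_bound, min(self.act_bound, act))

    def step(self, act: float) -> Tuple[float, float, bool, dict]:
        act = self.limit_act(act)  # step() is responsible for limiting the action
        self.state += act
        self.curr_step += 1
        obs = self.state
        reward = -abs(self.state - 2.0)  # made-up reward: stay close to position 2
        done = self.curr_step >= self.max_steps
        env_info = {"t": self.curr_step}
        return obs, reward, done, env_info


env = ToyEnv()
obs, reward, done, env_info = env.step(5.0)  # the action is clipped to 1.0
print(obs, reward, done, env_info)
```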

property task: Task

Get the task describing what the agent should do in the environment.


class AntSim(frame_skip: int = 5, dt: Optional[float] = None, max_steps: Optional[int] = 1000, task_args: Optional[dict] = None)

Bases: MujocoSimEnv, Serializable

The Ant (v3) MuJoCo simulation environment where a four-legged creature walks as fast as possible.


The OpenAI Gym variant considers this task solved at a reward over 6000.


  • frame_skip – number of simulation frames for which the same action is held; this multiplies the time step size dt

  • dt – by default, the time step size is the one from the MuJoCo config file multiplied by the number of frame skips (legacy from OpenAI environments). Passing an explicit dt value overrides this default, e.g., if you know that a trajectory was recorded with a specific dt.

  • max_steps – max number of simulation time steps

  • task_args – arguments for the task construction, e.g., dict(fwd_rew_weight=1.)

property act_space: Space

Get the space of the actions.

property contact_forces

classmethod get_nominal_domain_param() → dict

Get the nominal a.k.a. default domain parameters.


This function is used to check which domain parameters exist.
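Since the nominal domain parameters define which parameters exist, a check of user-supplied values against them might look like the sketch below. The helper merge_domain_param and the keys in nominal are hypothetical and are not AntSim's actual domain parameters.

```python
def merge_domain_param(nominal: dict, domain_param: dict) -> dict:
    """Overwrite nominal values with user-supplied ones, rejecting unknown keys (illustrative only)."""
    unknown = set(domain_param) - set(nominal)
    if unknown:
        raise KeyError(f"Unknown domain parameters: {sorted(unknown)}")
    merged = dict(nominal)
    merged.update(domain_param)
    return merged


# Made-up nominal values, standing in for the result of get_nominal_domain_param()
nominal = {"gravity": 9.81, "friction_coeff": 0.5}
print(merge_domain_param(nominal, {"friction_coeff": 0.8}))
```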

name: str = 'ant'
property obs_space: Space

Get the space of the observations (agent’s perception of the environment).

observe(state: ndarray) → ndarray

Compute the (noise-free) observation from the current state.


This method should be overwritten if the environment has a distinct observation space.


state – current state of the environment


observation perceived by the agent
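Overriding observe for an environment with a distinct observation space could look like the following sketch. Both classes are hypothetical, use plain lists instead of ndarray, and the choice to hide the first state entry is an assumption made for illustration, not a description of AntSim.

```python
from typing import List


class BaseEnvSketch:
    """Hypothetical base class: the observation equals the state."""

    def observe(self, state: List[float]) -> List[float]:
        # Default: the observation is a noise-free copy of the state
        return list(state)


class WalkerEnvSketch(BaseEnvSketch):
    """Hypothetical environment whose observation space is smaller than its state space."""

    def observe(self, state: List[float]) -> List[float]:
        # Hide the first state entry (assumed here to be an absolute position)
        return list(state[1:])


state = [3.2, 0.1, -0.4]
print(BaseEnvSketch().observe(state))
print(WalkerEnvSketch().observe(state))
```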

property state_space: Space

Get the space of the states (used for describing the environment).


class HalfCheetahSim(frame_skip: int = 5, dt: Optional[float] = None, max_steps: int = 1000, task_args: Optional[dict] = None)

Bases: MujocoSimEnv, Serializable

The Half-Cheetah (v3) MuJoCo simulation environment where a planar cheetah-like robot tries to run forward.


The OpenAI Gym variant considers this task solved at a reward over 4800.


  • frame_skip – number of simulation frames for which the same action is held; this multiplies the time step size dt

  • dt – by default, the time step size is the one from the MuJoCo config file multiplied by the number of frame skips (legacy from OpenAI environments). Passing an explicit dt value overrides this default, e.g., if you know that a trajectory was recorded with a specific dt.

  • max_steps – max number of simulation time steps

  • task_args – arguments for the task construction, e.g., dict(fwd_rew_weight=1.)

property act_space: Space

Get the space of the actions.

classmethod get_nominal_domain_param() → dict

Get the nominal a.k.a. default domain parameters.


This function is used to check which domain parameters exist.

name: str = 'cth'
property obs_space: Space

Get the space of the observations (agent’s perception of the environment).

observe(state: ndarray) → ndarray

Compute the (noise-free) observation from the current state.


This method should be overwritten if the environment has a distinct observation space.


state – current state of the environment


observation perceived by the agent

property state_space: Space

Get the space of the states (used for describing the environment).


class HopperSim(frame_skip: int = 4, dt: Optional[float] = None, max_steps: int = 1000, task_args: Optional[dict] = None)

Bases: MujocoSimEnv, Serializable

The Hopper (v3) MuJoCo simulation environment where a planar simplified one-legged robot tries to run forward.


The OpenAI Gym variant considers this task solved at a reward over 3800.


In contrast to the OpenAI Gym MuJoCo environments, Pyrado enables the randomization of the hopper’s “healthy” state range. Moreover, the state space is constrained to this part of the state space. In the original environment, terminate_when_unhealthy is True by default.


  • frame_skip – number of simulation frames for which the same action is held; this multiplies the time step size dt

  • dt – by default, the time step size is the one from the MuJoCo config file multiplied by the number of frame skips (legacy from OpenAI environments). Passing an explicit dt value overrides this default, e.g., if you know that a trajectory was recorded with a specific dt.

  • max_steps – max number of simulation time steps

  • task_args – arguments for the task construction, e.g., dict(fwd_rew_weight=1.)
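The idea of a “healthy” state range from the class description can be illustrated with a simple bounds check. This is only a sketch: is_healthy and the bounds are hypothetical, plain sequences stand in for ndarray, and how HopperSim represents or randomizes the range internally is not shown here.

```python
from typing import Sequence


def is_healthy(state: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> bool:
    """Check whether every state entry lies within its healthy bounds (illustrative only)."""
    if not (len(state) == len(lower) == len(upper)):
        raise ValueError("state and bounds must have the same length")
    return all(lo <= s <= hi for s, lo, hi in zip(state, lower, upper))


# Made-up bounds for a two-dimensional state
lower, upper = [0.7, -0.2], [2.0, 0.2]
print(is_healthy([1.2, 0.05], lower, upper))
print(is_healthy([0.5, 0.05], lower, upper))
```

Constraining the state space to such a range means that leaving the range ends the episode, and randomizing the range amounts to treating the bounds as domain parameters.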

property act_space: Space

Get the space of the actions.

classmethod get_nominal_domain_param() → dict

Get the nominal a.k.a. default domain parameters.


This function is used to check which domain parameters exist.

name: str = 'hop'
property obs_space: Space

Get the space of the observations (agent’s perception of the environment).

observe(state: ndarray) → ndarray

Compute the (noise-free) observation from the current state.


This method should be overwritten if the environment has a distinct observation space.


state – current state of the environment


observation perceived by the agent

property state_space: Space

Get the space of the states (used for describing the environment).


class HumanoidSim(frame_skip: int = 5, dt: Optional[float] = None, max_steps: Optional[int] = 1000, task_args: Optional[dict] = None)

Bases: MujocoSimEnv, Serializable

The Humanoid (v3) MuJoCo simulation environment where a humanoid robot tries to run forward.


  • frame_skip – number of simulation frames for which the same action is held; this multiplies the time step size dt

  • dt – by default, the time step size is the one from the MuJoCo config file multiplied by the number of frame skips (legacy from OpenAI environments). Passing an explicit dt value overrides this default, e.g., if you know that a trajectory was recorded with a specific dt.

  • max_steps – max number of simulation time steps

  • task_args – arguments for the task construction, e.g., dict(fwd_rew_weight=1.)

property act_space: Space

Get the space of the actions.

classmethod get_nominal_domain_param() → dict

Get the nominal a.k.a. default domain parameters.


This function is used to check which domain parameters exist.

name: str = 'hum'
property obs_space: Space

Get the space of the observations (agent’s perception of the environment).

observe(state: ndarray) → ndarray

Compute the (noise-free) observation from the current state.


This method should be overwritten if the environment has a distinct observation space.


state – current state of the environment


observation perceived by the agent

property state_space: Space

Get the space of the states (used for describing the environment).


class QQubeMjSim(frame_skip: int = 4, dt: Optional[float] = None, max_steps: int = inf, task_args: Optional[dict] = None)

Bases: MujocoSimEnv, Serializable


  • frame_skip – number of simulation frames for which the same action is held; this multiplies the time step size dt

  • dt – by default, the time step size is the one from the MuJoCo config file multiplied by the number of frame skips (legacy from OpenAI environments). Passing an explicit dt value overrides this default, e.g., if you know that a trajectory was recorded with a specific dt.

  • max_steps – max number of simulation time steps

  • task_args – arguments for the task construction

property act_space: Space

Get the space of the actions.

classmethod get_nominal_domain_param() → dict

Get the nominal a.k.a. default domain parameters.


This function is used to check which domain parameters exist.

property obs_space: Space

Get the space of the observations (agent’s perception of the environment).

observe(state: ndarray) → ndarray

Compute the (noise-free) observation from the current state.


This method should be overwritten if the environment has a distinct observation space.


state – current state of the environment


observation perceived by the agent

class QQubeStabMjSim(frame_skip: int = 4, dt: Optional[float] = None, max_steps: int = inf, task_args: Optional[dict] = None)

Bases: QQubeMjSim


  • frame_skip – number of simulation frames for which the same action is held; this multiplies the time step size dt

  • dt – by default, the time step size is the one from the MuJoCo config file multiplied by the number of frame skips (legacy from OpenAI environments). Passing an explicit dt value overrides this default, e.g., if you know that a trajectory was recorded with a specific dt.

  • max_steps – max number of simulation time steps

  • task_args – arguments for the task construction

property init_space: Space

Get the initial state space.

name: str = 'qq-mj-st'
property state_space: Space

Get the space of the states (used for describing the environment).

class QQubeSwingUpMjSim(frame_skip: int = 4, dt: Optional[float] = None, max_steps: int = inf, task_args: Optional[dict] = None)

Bases: QQubeMjSim


  • frame_skip – number of simulation frames for which the same action is held; this multiplies the time step size dt

  • dt – by default, the time step size is the one from the MuJoCo config file multiplied by the number of frame skips (legacy from OpenAI environments). Passing an explicit dt value overrides this default, e.g., if you know that a trajectory was recorded with a specific dt.

  • max_steps – max number of simulation time steps

  • task_args – arguments for the task construction

property init_space: Space

Get the initial state space.

name: str = 'qq-mj-su'
property state_space: Space

Get the space of the states (used for describing the environment).


class WAMSim(num_dof: int, model_path: str, frame_skip: int = 4, dt: Optional[float] = None, max_steps: int = inf, task_args: Optional[dict] = None)

Bases: MujocoSimEnv, ABC, Serializable

Base class for the WAM robotic arm from Barrett Technology.


  • num_dof – number of degrees of freedom (4 or 7), depending on which Barrett WAM setup is being used

  • model_path – path to the MuJoCo XML model config file

  • frame_skip – number of simulation frames for which the same action is held; this multiplies the time step size dt

  • dt – by default, the time step size is the one from the MuJoCo config file multiplied by the number of frame skips (legacy from OpenAI environments). Passing an explicit dt value overrides this default, e.g., if you know that a trajectory was recorded with a specific dt.

  • max_steps – max number of simulation time steps

  • task_args – arguments for the task construction

abstract property act_space: Space

Get the space of the actions.

classmethod get_nominal_domain_param(num_dof: int = 7) → dict

Get the nominal a.k.a. default domain parameters.


This function is used to check which domain parameters exist.

property num_dof: int

Get the number of degrees of freedom.

abstract property obs_space: Space

Get the space of the observations (agent’s perception of the environment).

abstract property state_space: Space

Get the space of the states (used for describing the environment).

property torque_space: Space

Get the space of joint torques.



Module contents