environments

base

class Env(dt: Union[int, float], max_steps: Union[int, float] = inf)[source]

Bases: ABC, Serializable

Base class of all environments in Pyrado. Uses Serializable to facilitate proper serialization.

Constructor

Parameters:

dt – integration step size in seconds, default value is used for one-step environments
max_steps – max number of simulation time steps

abstract property act_space: Space: Get the space of the actions.

close()[source]: Disconnect from the device.

property curr_step: int: Get the number of the current simulation step (0 for the initial step).

property dt: float: Get the time step size.

limit_act(act: ndarray) → ndarray[source]

Clip the actions according to the environment’s action space. Note that this also affects exploration.

Parameters: act – unbounded action
Returns: bounded action
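
The clipping described above can be sketched with NumPy as follows. This is a minimal illustration, not Pyrado's implementation: the function `clip_to_bounds` and the explicit `lower`/`upper` arrays are assumptions standing in for whatever bounds the environment's action space provides.

```python
import numpy as np


def clip_to_bounds(act: np.ndarray, lower: np.ndarray, upper: np.ndarray) -> np.ndarray:
    """Return a copy of `act` with every element clipped to [lower, upper]."""
    return np.clip(act, lower, upper)


# Hypothetical bounds of a two-dimensional action space
lower = np.array([-1.0, -0.5])
upper = np.array([1.0, 0.5])

unbounded = np.array([3.0, -0.2])  # e.g. a policy output plus exploration noise
bounded = clip_to_bounds(unbounded, lower, upper)
```

Exploration noise is typically added before the action reaches the environment, so any part of the noise that pushes the action outside the bounds is cut off at this point.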

property max_steps: Union[int, float]

Get the maximum number of simulation steps.

Note

The step count should always be an integer. Some environments have no maximum number of steps. For these, float('Inf') should be used, since it is the only value larger than any int.

Returns: maximum number of time steps before the environment terminates
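
The reason for allowing a float here can be seen in a small sketch. The helper `is_out_of_steps` is hypothetical and only illustrates how an integer step counter compares against either a finite limit or infinity.

```python
from math import inf
from typing import Union


def is_out_of_steps(curr_step: int, max_steps: Union[int, float]) -> bool:
    """Hypothetical helper: True once the step budget is used up."""
    return curr_step >= max_steps


# A finite budget is exhausted once the counter reaches it
finite = is_out_of_steps(curr_step=500, max_steps=500)

# With an infinite budget, no integer step count ever reaches the limit
unbounded = is_out_of_steps(curr_step=10**12, max_steps=inf)
```

The same comparison works for both cases, so code that checks the step limit needs no special branch for environments without one.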

name: str = None

abstract property obs_space: Space: Get the space of the observations (agent’s perception of the environment).

observe(state: ndarray) → ndarray[source]

Compute the (noise-free) observation from the current state.

Note

This method should be overridden if the environment has a distinct observation space.

Parameters: state – current state of the environment
Returns: observation perceived by the agent

abstract render(mode: RenderMode, render_step: int = 1)[source]

Visualize one time step.

Parameters:

mode – render mode: console, video, or both
render_step – interval for rendering

abstract reset(init_state: Optional[ndarray] = None, domain_param: Optional[dict] = None) → ndarray[source]

Reset the environment to its initial state and optionally set different domain parameters.

Parameters:

init_state – set explicit initial state if not None. Must match init_space if any.
domain_param – set explicit domain parameters if not None

Return obs: initial observation of the state

property spec: EnvSpec: Get the environment specification (generated on call).

abstract property state_space: Space: Get the space of the states (used for describing the environment).

abstract step(act: ndarray) → tuple[source]

Perform one time step of the simulation or on the real-world device. When a terminal condition is met, the reset function is called.

Note

This function is responsible for limiting the actions, i.e. has to call limit_act().

Parameters: act – action to be taken in the step
Return obs: current observation of the environment
Return reward: reward depending on the selected reward function
Return done: indicates whether the episode has ended
Return env_info: contains diagnostic information about the environment
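
The interplay of reset(), step(), and limit_act() can be sketched with a toy class. `CountdownEnv` is hypothetical and does not inherit from Env; it uses a scalar state instead of an ndarray, assumes action bounds of [-1, 1], and, unlike the behavior described above, does not reset itself when the episode ends. It only illustrates the four-element return value and where the action gets limited.

```python
from math import inf


class CountdownEnv:
    """Hypothetical toy environment mimicking the reset/step interface (not a Pyrado class)."""

    def __init__(self, max_steps=inf):
        self.max_steps = max_steps
        self.curr_step = 0
        self.state = 0.0

    def limit_act(self, act):
        # Clip the scalar action to the assumed bounds [-1, 1]
        return max(-1.0, min(1.0, act))

    def reset(self, init_state=None):
        self.state = 3.0 if init_state is None else init_state
        self.curr_step = 0
        return self.state  # the observation equals the state in this toy example

    def step(self, act):
        act = self.limit_act(act)  # step() is responsible for limiting the action
        self.state += act
        self.curr_step += 1
        reward = -abs(self.state)
        done = self.state <= 0.0 or self.curr_step >= self.max_steps
        env_info = {"t": self.curr_step}
        return self.state, reward, done, env_info


env = CountdownEnv(max_steps=10)
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    obs, reward, done, env_info = env.step(-5.0)  # clipped to -1.0 inside step()
    total_reward += reward
```

The caller passes an out-of-range action on purpose; because the clipping happens inside step(), the loop itself never needs to know the action bounds.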

abstract property task: Task: Get the task describing what the agent should do in the environment.

real_base

class RealEnv(dt: Union[int, float], max_steps: Union[int, float] = inf)[source]

Bases: Env, ABC

Base class of all real-world environments in Pyrado.

Note

So far, there is no difference from the Env class. However, this might change, and the class is also useful for checking an environment’s type.

Constructor

Parameters:

dt – integration step size in seconds, default value is used for one-step environments
max_steps – max number of simulation time steps

sim_base

class SimEnv(dt: Union[int, float], max_steps: Union[int, float] = inf)[source]

Bases: Env, ABC, Serializable

Base class of all simulated environments in Pyrado. Uses Serializable to facilitate proper serialization. The domain parameters are automatically part of the serialized state.

Constructor

Parameters:

dt – integration step size in seconds, default value is used for one-step environments
max_steps – max number of simulation time steps

close()[source]: For compatibility to RealEnv with the wrappers which are subclasses of Env.

abstract property domain_param: dict: Get the environment’s domain parameters. If there are none, this should return an empty dict. The domain parameters are synonymous with the parameters used by the simulator to run the physics simulation (e.g., masses, extents, or friction coefficients). This must include all parameters that can be randomized, but there might also be additional parameters that depend on the domain parameters.

abstract classmethod get_nominal_domain_param() → dict[source]: Get the nominal a.k.a. default domain parameters.

Note

This function is used to check which domain parameters exist.

abstract property init_space: Space: Get the initial state space.

abstract render(mode: RenderMode, render_step: int = 1)[source]

Visualize one time step of the simulation. The base version prints to console when the state exceeds its boundaries.

Parameters:

mode – render mode: console, video, or both
render_step – interval for rendering

property supported_domain_param: Iterable: Get an iterable of all supported domain parameters. The default implementation takes the keys of get_nominal_domain_param(). The domain parameters are automatically stored in attributes prefixed with ‘_’.
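
The relationship between get_nominal_domain_param(), supported_domain_param, and domain_param can be sketched as follows. `ToySim` is a hypothetical stand-in that does not inherit from SimEnv, and the parameter names and values are made up; the sketch only mirrors the description above (keys of the nominal dict, attributes prefixed with '_') and is not Pyrado's implementation.

```python
class ToySim:
    """Hypothetical stand-in for a simulated environment (not a Pyrado class)."""

    @classmethod
    def get_nominal_domain_param(cls) -> dict:
        # Nominal (default) physics parameters; names and values are illustrative
        return {"mass": 1.0, "friction": 0.1}

    def __init__(self):
        # Store every domain parameter in an attribute prefixed with "_"
        for name, value in self.get_nominal_domain_param().items():
            setattr(self, f"_{name}", value)

    @property
    def supported_domain_param(self):
        # Take the keys of the nominal parameters
        return self.get_nominal_domain_param().keys()

    @property
    def domain_param(self) -> dict:
        # Read the current values back from the prefixed attributes
        return {name: getattr(self, f"_{name}") for name in self.supported_domain_param}


sim = ToySim()
sim._mass = 2.5  # e.g. a value set by domain randomization
current = sim.domain_param
```

Since get_nominal_domain_param() is a class method, the set of supported parameters can be queried without instantiating the environment.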
