pysim

ball_on_beam

class BallOnBeamDiscSim(*args, **kwargs)[source]

Bases: BallOnBeamSim, Serializable

Ball-on-beam simulation environment with discrete actions

Constructor

Parameters:
  • args – forwarded to BallOnBeamSim’s constructor

  • kwargs – forwarded to BallOnBeamSim’s constructor

name: str = 'bob-d'
class BallOnBeamSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]

Bases: SimPyEnv, Serializable

Environment in which a ball rolls on a beam (1 dim). The ball is randomly initialized on the beam and is to be stabilized on the center of the beam. In this setup, the agent can control the torque applied to the beam.

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

classmethod get_nominal_domain_param() dict[source]

Get the nominal a.k.a. default domain parameters.

Note

This function is used to check which domain parameters exist.

name: str = 'bob'

base

class SimPyEnv(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]

Bases: SimEnv, Serializable

Base class for simulated environments implemented in pure Python

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

property act_space: Space

Get the space of the actions.

property domain_param: dict

Get the environment’s domain parameters. If there are none, this method should return an emtpy dict. The domain parameters are synonymous to the parameters used by the simulator to run the physics simulation (e.g., masses, extents, or friction coefficients). This must include all parameters that can be randomized, but there might also be additional parameters that depend on the domain parameters.

property init_space: Space

Get the initial state space.

property obs_space: Space

Get the space of the observations (agent’s perception of the environment).

render(mode: RenderMode, render_step: int = 1)[source]

Visualize one time step of the simulation. The base version prints to console when the state exceeds its boundaries.

Parameters:
  • mode – render mode: console, video, or both

  • render_step – interval for rendering

reset(init_state: Optional[ndarray] = None, domain_param: Optional[dict] = None) ndarray[source]

Reset the environment to its initial state and optionally set different domain parameters.

Parameters:
  • init_state – set explicit initial state if not None. Must match init_space if any.

  • domain_param – set explicit domain parameters if not None

Return obs:

initial observation of the state.

property state_space: Space

Get the space of the states (used for describing the environment).

step(act: ndarray) tuple[source]

Perform one time step of the simulation or on the real-world device. When a terminal condition is met, the reset function is called.

Note

This function is responsible for limiting the actions, i.e. has to call limit_act().

Parameters:

act – action to be taken in the step

Return obs:

current observation of the environment

Return reward:

reward depending on the selected reward function

Return done:

indicates whether the episode has ended

Return env_info:

contains diagnostic information about the environment

property task: Task

Get the task describing what the agent should do in the environment.

gym_wrapper

class OpenAIGymWrapper(env: SimPyEnv)[source]

Bases: Env

A wrapper for simulation environments exposing an OpenAI Gym environment. Do not instantiate this yourself but rather use gym.make() like this:

sim_env = QQubeSwingUpSim(dt=1 / 100.0, max_steps=600)
gym.make("SimuRLacraSimEnv-v0", env=sim_env)

Initialize the environment. You are not supposed to call this function directly, but use gym.make(“SimuRLacraSimEnv-v0”, env=sim_env) where sim_env is your Pyrado environment.

Parameters:

env – Pyrado environment to wrap

action_space: Space[ActType]
close()[source]

Perform any necessary cleanup for the environment. Environments will automatically close() themselves when garbage collected or when the program exits.

static conv_space_from_pyrado(space: Space) Space[source]

Convert a Pyrado space to a gym space.

Parameters:

space – Pyrado space to convert

Returns:

OpenAI gym space

metadata: Dict[str, Any] = {'render.modes': ['human']}
observation_space: Space[ObsType]
render(mode: str = 'human')[source]

Renders the environment. The set of supported modes varies per environment (some environments do not support rendering). By convention, if mode is: human: render to the current display or terminal and return nothing. Usually for human consumption. rgb_array: return an numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image,

suitable for turning into a video.

ansi: return a string (str) or StringIO.StringIO containing a terminal-style text representation.

The text can include newlines and ANSI escape sequences (e.g. for colors).

Parameters:

mode – the mode to render with (currently ignored until other modes besides human are implemented)

reset() ndarray[source]

Resets the environment to an initial state and returns an initial observation.

Note

This function does not reset the environment’s random number generator(s); random variables in the environment’s state are sampled independently between multiple calls to reset(). In other words, each call of reset() yields an environment suitable for a new episode, independent of previous episodes.

Returns:

the initial observation

seed(seed: Optional[int] = None) Optional[List[int]][source]

Sets the seed for this environment’s random number generator(s).

Returns:

list of seeds used in this environment’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to seed. Often, the main seed equals the provided seed, but this won’t be true if seed=None, for example.

step(act: ndarray) Tuple[ndarray, float, bool, dict][source]

Run one time step of the environment’s dynamics. When the end of episode is reached, you are responsible for calling reset() to reset this environment’s state.

Parameters:

act – action provided by the agent

Returns:

observation: agent’s current observation of the environment reward: reward returned by the environment done: whether the episode has ended, in which case further step() calls will return undefined results info: contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)

one_mass_oscillator

class OneMassOscillatorDomainParamEstimator(dt: float, dp_init: dict, num_epoch: int, batch_size: int)[source]

Bases: Module

Class to estimate the domain parameters of the OneMassOscillator environment

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(state: Tensor, act: Tensor) Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
update(rollouts: Sequence[StepSequence])[source]
class OneMassOscillatorDyn(dt: float)[source]

Bases: Serializable

Constructor

Parameters:

dt – simulation step size [s]

class OneMassOscillatorSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]

Bases: SimPyEnv, Serializable

Model of a linear one-mass-oscillator (spring-mass-damper system) without gravity influence

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

classmethod get_nominal_domain_param() dict[source]

Get the nominal a.k.a. default domain parameters.

Note

This function is used to check which domain parameters exist.

name: str = 'omo'

openai_classical_control

class GymEnv(env_name: str)[source]

Bases: SimEnv, Serializable

A Wrapper to use the classical control environments of OpenAI Gym like Pyrado environments

Constructor

Note

Pyrado only supports the classical control environments from OpenAI Gym. See https://github.com/openai/gym/tree/master/gym/envs/classic_control

Parameters:

env_name – name of the OpenAI Gym environment, e.g. ‘MountainCar-v0’, ‘CartPole-v1’, ‘Acrobot-v1’, ‘MountainCarContinuous-v0’,’Pendulum-v0’

property act_space: Space

Get the space of the actions.

close()[source]

For compatibility to RealEnv with the wrappers which are subclasses of Env.

property domain_param: dict

Get the environment’s domain parameters. If there are none, this method should return an emtpy dict. The domain parameters are synonymous to the parameters used by the simulator to run the physics simulation (e.g., masses, extents, or friction coefficients). This must include all parameters that can be randomized, but there might also be additional parameters that depend on the domain parameters.

classmethod get_nominal_domain_param()[source]

Get the nominal a.k.a. default domain parameters.

Note

This function is used to check which domain parameters exist.

property init_space: None

Get the initial state space.

name: str = 'gym-cc'
property obs_space: Space

Get the space of the observations (agent’s perception of the environment).

render(mode: RenderMode = RenderMode(text=False, video=False, render=False), render_step: int = 1)[source]

Visualize one time step of the simulation. The base version prints to console when the state exceeds its boundaries.

Parameters:
  • mode – render mode: console, video, or both

  • render_step – interval for rendering

reset(init_state=None, domain_param=None)[source]

Reset the environment to its initial state and optionally set different domain parameters.

Parameters:
  • init_state – set explicit initial state if not None. Must match init_space if any.

  • domain_param – set explicit domain parameters if not None

Return obs:

initial observation of the state.

property state_space: Space

Get the space of the states (used for describing the environment).

step(act) tuple[source]

Perform one time step of the simulation or on the real-world device. When a terminal condition is met, the reset function is called.

Note

This function is responsible for limiting the actions, i.e. has to call limit_act().

Parameters:

act – action to be taken in the step

Return obs:

current observation of the environment

Return reward:

reward depending on the selected reward function

Return done:

indicates whether the episode has ended

Return env_info:

contains diagnostic information about the environment

property task

Get the task describing what the agent should do in the environment.

pandavis

class BallOnBeamVis(env: SimEnv, rendering: bool)[source]

Bases: PandaVis

Visualisation for the BallOnBeamSim class using panda3d

Constructor

Parameters:

env – environment to visualize

:param rendering

update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]

Updates the visualization with every call.

Parameters:

task – Needed by panda3d task manager.

Return Task.cont:

indicates that task should be called again next frame.

class OneMassOscillatorVis(env: SimEnv, rendering: bool)[source]

Bases: PandaVis

Visualisation for the OneMassOscillatorSim class using panda3d

Constructor

Parameters:

env – environment to visualize

update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]

Updates the visualization with every call.

Parameters:

task – Needed by panda3d task manager.

Return Task.cont:

indicates that task should be called again next frame.

class PandaVis(rendering: bool)[source]

Bases: ShowBase

Base class for all visualizations with panda3d

Constructor

Parameters:

rendering – boolean indicating whether to use RenderPipeline or default Panda3d as visualization-module.

draw_trace(point)[source]

Draws a line from the last point to the current point

Parameters:

point – Current position of pen. Needs 3 value vector.

reset()[source]

Resets the the visualization to a certain state, so that in can be run again. Removes the trace.

update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]

Updates the visualization with every call.

Parameters:

task – Needed by panda3d task manager.

Return Task.cont:

indicates that task should be called again next frame.

class PendulumVis(env: SimEnv, rendering: bool)[source]

Bases: PandaVis

Visualisation for the PendulumSim class using panda3d

Constructor

Parameters:

env – environment to visualize

update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]

Updates the visualization with every call.

Parameters:

task – Needed by panda3d task manager.

Return Task.cont:

indicates that task should be called again next frame.

class QBallBalancerVis(env: SimEnv, rendering: bool)[source]

Bases: PandaVis

Visualisation for the QBallBalancerSim class using panda3d

Constructor

Parameters:

env – environment to visualize

update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]

Updates the visualization with every call.

Parameters:

task – Needed by panda3d task manager.

Return Task.cont:

indicates that task should be called again next frame.

class QCartPoleVis(env: SimEnv, rendering: bool)[source]

Bases: PandaVis

Visualisation for the QCartPoleSim class using panda3d

Constructor

Parameters:

env – environment to visualize

update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]

Updates the visualization with every call.

Parameters:

task – Needed by panda3d task manager.

Return Task.cont:

indicates that task should be called again next frame.

class QQubeVis(env: SimEnv, rendering: bool)[source]

Bases: PandaVis

Visualisation for the QQubeSim class using panda3d

Constructor

Parameters:

env – environment to visualize

update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]

Updates the visualization with every call.

Parameters:

task – Needed by panda3d task manager.

Return Task.cont:

indicates that task should be called again next frame.

pendulum

class PendulumSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None, init_state: Optional[ndarray] = None)[source]

Bases: SimPyEnv, Serializable

Under-actuated inverted pendulum environment similar to the one from OpenAI Gym

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

  • init_state – set an pole angle and pole angular velocity for the SingularStateSpace

classmethod get_nominal_domain_param() dict[source]

Get the nominal a.k.a. default domain parameters.

Note

This function is used to check which domain parameters exist.

name: str = 'pend'
observe(state) ndarray[source]

Compute the (noise-free) observation from the current state.

Note

This method should be overwritten if the environment has a distinct observation space.

Parameters:

state – current state of the environment

Returns:

observation perceived to the agent

quanser_ball_balancer

class QBallBalancerKin(qbb, num_opt_iter=100, render_mode=RenderMode(text=False, video=False, render=False))[source]

Bases: Serializable

Calculates and visualizes the kinematics from the servo shaft angles (th_x, th_x) to the plate angles (a, b).

Constructor

Parameters:
  • qbb – QBallBalancerSim object

  • num_opt_iter – number of optimizer iterations for the IK

  • mode – the render mode: a for animating (pyplot), or `` for no animation

plate_ang(tip)[source]

Compute plate angle (alpha or beta) from the rod tip position which has been calculated from servo shaft angle (th_x or th_y) before. :return tip: 2D position of the rod tip in the sagittal plane (from the optimizer)

render(th, tip)[source]

Visualize using pyplot

Parameters:
  • th – angle of the servo

  • tip – 2D position of the rod tip in the sagittal plane (from the optimizer)

rod_tip(th)[source]

Get Cartesian coordinates of the rod tip for one servo.

Parameters:

th – current value of the respective servo shaft angle

Return tip:

2D position of the rod tip in the sagittal plane

class QBallBalancerSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None, simple_dynamics: bool = False, load_experimental_tholds: bool = True)[source]

Bases: SimPyEnv, Serializable

Environment in which a ball rolls on an actuated plate. The ball is randomly initialized on the plate and is to be stabilized on the center of the plate. The problem formulation treats this setup as 2 independent ball-on-beam problems. The plate is actuated via 2 servo motors that lift the plate.

Note

The dynamics are not the same as in the Quanser Workbook (2 DoF Ball-Balancer - Instructor). Here, we added the coriolis forces and linear-viscous friction. However, the 2 dim system is still modeled to be decoupled. This is the case, since the two rods (connected to the servos) are pushing the plate at the center lines. As a result, the angles alpha and beta are w.r.t. to the inertial frame, i.e. they are not 2 sequential rations.

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

  • simple_dynamics – if `True, use a dynamics model without Coriolis forces and without friction effects

  • load_experimental_tholds – use the voltage thresholds determined from experiments

classmethod get_nominal_domain_param() dict[source]

Get the nominal a.k.a. default domain parameters.

Note

This function is used to check which domain parameters exist.

classmethod get_voltage_tholds(load_experiments: bool = True) dict[source]

If available, the voltage thresholds computed from measurements, else use default values.

measured_tholds = {'voltage_thold_x_neg': -0.1, 'voltage_thold_x_pos': 0.28, 'voltage_thold_y_neg': -0.074, 'voltage_thold_y_pos': 0.28}
name: str = 'qbb'
reset(init_state: Optional[ndarray] = None, domain_param: Optional[dict] = None) ndarray[source]

Reset the environment to its initial state and optionally set different domain parameters.

Parameters:
  • init_state – set explicit initial state if not None. Must match init_space if any.

  • domain_param – set explicit domain parameters if not None

Return obs:

initial observation of the state.

quanser_cartpole

class QCartPoleSim(dt: float, max_steps: int, task_args: Optional[dict], long: bool, simple_dynamics: bool, wild_init: bool)[source]

Bases: SimPyEnv, Serializable

Base Environment for the Quanser Cart-Pole swing-up and stabilization task

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

  • long – set to True if using the long pole, else False

  • simple_dynamics – if True, use the simpler dynamics model from Quanser. If `False, use a dynamics model which includes friction

  • wild_init – if True the init state space is increased drastically, e.g. the initial pendulum angle can be in \([-\pi, +\pi]\). Only applicable to QCartPoleSwingUpSim.

classmethod get_nominal_domain_param(long: bool = False) dict[source]

Get the nominal a.k.a. default domain parameters.

Note

This function is used to check which domain parameters exist.

observe(state) ndarray[source]

Compute the (noise-free) observation from the current state.

Note

This method should be overwritten if the environment has a distinct observation space.

Parameters:

state – current state of the environment

Returns:

observation perceived to the agent

reset(init_state: Optional[ndarray] = None, domain_param: Optional[dict] = None) ndarray[source]

Reset the environment to its initial state and optionally set different domain parameters.

Parameters:
  • init_state – set explicit initial state if not None. Must match init_space if any.

  • domain_param – set explicit domain parameters if not None

Return obs:

initial observation of the state.

class QCartPoleStabSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None, long: bool = True, simple_dynamics: bool = True)[source]

Bases: QCartPoleSim, Serializable

Environment in which a pole has to be stabilized in the upright position (inverted pendulum) by moving a cart on a rail. The pole can rotate around an axis perpendicular to direction in which the cart moves.

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

  • long – set to True if using the long pole, else False

  • simple_dynamics – if True, use the simpler dynamics model from Quanser. If `False, use a dynamics model which includes friction

name: str = 'qcp-st'
class QCartPoleSwingUpSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None, long: bool = False, simple_dynamics: bool = False, wild_init: bool = True)[source]

Bases: QCartPoleSim, Serializable

Environment in which a pole has to be swung up and stabilized in the upright position (inverted pendulum) by moving a cart on a rail. The pole can rotate around an axis perpendicular to direction in which the cart moves.

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

  • long – set to True if using the long pole, else False

  • simple_dynamics – if True, use the simpler dynamics model from Quanser. If False, use a dynamics model which includes friction

  • wild_init – if True the init state space is increased drastically, e.g. the initial pendulum angle can be in \([-\pi, +\pi]\)

name: str = 'qcp-su'

quanser_qube

class QQubeSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]

Bases: SimPyEnv, Serializable

Base Environment for the Quanser Qube swing-up and stabilization task

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

classmethod get_nominal_domain_param() dict[source]

Get the nominal a.k.a. default domain parameters.

Note

This function is used to check which domain parameters exist.

observe(state: ndarray) ndarray[source]

Compute the (noise-free) observation from the current state.

Note

This method should be overwritten if the environment has a distinct observation space.

Parameters:

state – current state of the environment

Returns:

observation perceived to the agent

class QQubeStabSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]

Bases: QQubeSim

Environment which models Quanser’s Furuta pendulum called Quanser Qube. The goal is to stabilize the pendulum at the upright position (alpha = +-pi).

Note

This environment is only for testing purposes, or to find the PD gains for stabilizing the pendulum at the top.

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

name: str = 'qq-st'
class QQubeSwingUpSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]

Bases: QQubeSim

Environment which models Quanser’s Furuta pendulum called Quanser Qube. The goal is to swing up the pendulum and stabilize at the upright position (alpha = +-pi).

Constructor

Parameters:
  • dt – simulation step size [s]

  • max_steps – maximum number of simulation steps

  • task_args – arguments for the task construction

name: str = 'qq-su'

Module contents