pysim
ball_on_beam
- class BallOnBeamDiscSim(*args, **kwargs)[source]
Bases:
BallOnBeamSim
,Serializable
Ball-on-beam simulation environment with discrete actions
Constructor
- Parameters:
args – forwarded to BallOnBeamSim’s constructor
kwargs – forwarded to BallOnBeamSim’s constructor
- name: str = 'bob-d'
- class BallOnBeamSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]
Bases:
SimPyEnv
,Serializable
Environment in which a ball rolls on a beam (1 dim). The ball is randomly initialized on the beam and is to be stabilized on the center of the beam. In this setup, the agent can control the torque applied to the beam.
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
- classmethod get_nominal_domain_param() dict [source]
Get the nominal a.k.a. default domain parameters.
Note
This function is used to check which domain parameters exist.
- name: str = 'bob'
base
- class SimPyEnv(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]
Bases:
SimEnv
,Serializable
Base class for simulated environments implemented in pure Python
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
- property domain_param: dict
Get the environment’s domain parameters. If there are none, this method should return an emtpy dict. The domain parameters are synonymous to the parameters used by the simulator to run the physics simulation (e.g., masses, extents, or friction coefficients). This must include all parameters that can be randomized, but there might also be additional parameters that depend on the domain parameters.
- property obs_space: Space
Get the space of the observations (agent’s perception of the environment).
- render(mode: RenderMode, render_step: int = 1)[source]
Visualize one time step of the simulation. The base version prints to console when the state exceeds its boundaries.
- Parameters:
mode – render mode: console, video, or both
render_step – interval for rendering
- reset(init_state: Optional[ndarray] = None, domain_param: Optional[dict] = None) ndarray [source]
Reset the environment to its initial state and optionally set different domain parameters.
- Parameters:
init_state – set explicit initial state if not None. Must match init_space if any.
domain_param – set explicit domain parameters if not None
- Return obs:
initial observation of the state.
- step(act: ndarray) tuple [source]
Perform one time step of the simulation or on the real-world device. When a terminal condition is met, the reset function is called.
Note
This function is responsible for limiting the actions, i.e. has to call limit_act().
- Parameters:
act – action to be taken in the step
- Return obs:
current observation of the environment
- Return reward:
reward depending on the selected reward function
- Return done:
indicates whether the episode has ended
- Return env_info:
contains diagnostic information about the environment
gym_wrapper
- class OpenAIGymWrapper(env: SimPyEnv)[source]
Bases:
Env
A wrapper for simulation environments exposing an OpenAI Gym environment. Do not instantiate this yourself but rather use gym.make() like this:
sim_env = QQubeSwingUpSim(dt=1 / 100.0, max_steps=600) gym.make("SimuRLacraSimEnv-v0", env=sim_env)
Initialize the environment. You are not supposed to call this function directly, but use gym.make(“SimuRLacraSimEnv-v0”, env=sim_env) where sim_env is your Pyrado environment.
- Parameters:
env – Pyrado environment to wrap
- action_space: Space[ActType]
- close()[source]
Perform any necessary cleanup for the environment. Environments will automatically close() themselves when garbage collected or when the program exits.
- static conv_space_from_pyrado(space: Space) Space [source]
Convert a Pyrado space to a gym space.
- Parameters:
space – Pyrado space to convert
- Returns:
OpenAI gym space
- metadata: Dict[str, Any] = {'render.modes': ['human']}
- observation_space: Space[ObsType]
- render(mode: str = 'human')[source]
Renders the environment. The set of supported modes varies per environment (some environments do not support rendering). By convention, if mode is: human: render to the current display or terminal and return nothing. Usually for human consumption. rgb_array: return an numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image,
suitable for turning into a video.
- ansi: return a string (str) or StringIO.StringIO containing a terminal-style text representation.
The text can include newlines and ANSI escape sequences (e.g. for colors).
- Parameters:
mode – the mode to render with (currently ignored until other modes besides human are implemented)
- reset() ndarray [source]
Resets the environment to an initial state and returns an initial observation.
Note
This function does not reset the environment’s random number generator(s); random variables in the environment’s state are sampled independently between multiple calls to reset(). In other words, each call of reset() yields an environment suitable for a new episode, independent of previous episodes.
- Returns:
the initial observation
- seed(seed: Optional[int] = None) Optional[List[int]] [source]
Sets the seed for this environment’s random number generator(s).
- Returns:
list of seeds used in this environment’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to seed. Often, the main seed equals the provided seed, but this won’t be true if seed=None, for example.
- step(act: ndarray) Tuple[ndarray, float, bool, dict] [source]
Run one time step of the environment’s dynamics. When the end of episode is reached, you are responsible for calling reset() to reset this environment’s state.
- Parameters:
act – action provided by the agent
- Returns:
observation: agent’s current observation of the environment reward: reward returned by the environment done: whether the episode has ended, in which case further step() calls will return undefined results info: contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
one_mass_oscillator
- class OneMassOscillatorDomainParamEstimator(dt: float, dp_init: dict, num_epoch: int, batch_size: int)[source]
Bases:
Module
Class to estimate the domain parameters of the OneMassOscillator environment
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(state: Tensor, act: Tensor) Tensor [source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- update(rollouts: Sequence[StepSequence])[source]
- class OneMassOscillatorDyn(dt: float)[source]
Bases:
Serializable
Constructor
- Parameters:
dt – simulation step size [s]
- class OneMassOscillatorSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]
Bases:
SimPyEnv
,Serializable
Model of a linear one-mass-oscillator (spring-mass-damper system) without gravity influence
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
- classmethod get_nominal_domain_param() dict [source]
Get the nominal a.k.a. default domain parameters.
Note
This function is used to check which domain parameters exist.
- name: str = 'omo'
openai_classical_control
- class GymEnv(env_name: str)[source]
Bases:
SimEnv
,Serializable
A Wrapper to use the classical control environments of OpenAI Gym like Pyrado environments
Constructor
Note
Pyrado only supports the classical control environments from OpenAI Gym. See https://github.com/openai/gym/tree/master/gym/envs/classic_control
- Parameters:
env_name – name of the OpenAI Gym environment, e.g. ‘MountainCar-v0’, ‘CartPole-v1’, ‘Acrobot-v1’, ‘MountainCarContinuous-v0’,’Pendulum-v0’
- property domain_param: dict
Get the environment’s domain parameters. If there are none, this method should return an emtpy dict. The domain parameters are synonymous to the parameters used by the simulator to run the physics simulation (e.g., masses, extents, or friction coefficients). This must include all parameters that can be randomized, but there might also be additional parameters that depend on the domain parameters.
- classmethod get_nominal_domain_param()[source]
Get the nominal a.k.a. default domain parameters.
Note
This function is used to check which domain parameters exist.
- property init_space: None
Get the initial state space.
- name: str = 'gym-cc'
- property obs_space: Space
Get the space of the observations (agent’s perception of the environment).
- render(mode: RenderMode = RenderMode(text=False, video=False, render=False), render_step: int = 1)[source]
Visualize one time step of the simulation. The base version prints to console when the state exceeds its boundaries.
- Parameters:
mode – render mode: console, video, or both
render_step – interval for rendering
- reset(init_state=None, domain_param=None)[source]
Reset the environment to its initial state and optionally set different domain parameters.
- Parameters:
init_state – set explicit initial state if not None. Must match init_space if any.
domain_param – set explicit domain parameters if not None
- Return obs:
initial observation of the state.
- step(act) tuple [source]
Perform one time step of the simulation or on the real-world device. When a terminal condition is met, the reset function is called.
Note
This function is responsible for limiting the actions, i.e. has to call limit_act().
- Parameters:
act – action to be taken in the step
- Return obs:
current observation of the environment
- Return reward:
reward depending on the selected reward function
- Return done:
indicates whether the episode has ended
- Return env_info:
contains diagnostic information about the environment
- property task
Get the task describing what the agent should do in the environment.
pandavis
- class BallOnBeamVis(env: SimEnv, rendering: bool)[source]
Bases:
PandaVis
Visualisation for the BallOnBeamSim class using panda3d
Constructor
- Parameters:
env – environment to visualize
:param rendering
- update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]
Updates the visualization with every call.
- Parameters:
task – Needed by panda3d task manager.
- Return Task.cont:
indicates that task should be called again next frame.
- class OneMassOscillatorVis(env: SimEnv, rendering: bool)[source]
Bases:
PandaVis
Visualisation for the OneMassOscillatorSim class using panda3d
Constructor
- Parameters:
env – environment to visualize
- update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]
Updates the visualization with every call.
- Parameters:
task – Needed by panda3d task manager.
- Return Task.cont:
indicates that task should be called again next frame.
- class PandaVis(rendering: bool)[source]
Bases:
ShowBase
Base class for all visualizations with panda3d
Constructor
- Parameters:
rendering – boolean indicating whether to use RenderPipeline or default Panda3d as visualization-module.
- draw_trace(point)[source]
Draws a line from the last point to the current point
- Parameters:
point – Current position of pen. Needs 3 value vector.
- reset()[source]
Resets the the visualization to a certain state, so that in can be run again. Removes the trace.
- update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]
Updates the visualization with every call.
- Parameters:
task – Needed by panda3d task manager.
- Return Task.cont:
indicates that task should be called again next frame.
- class PendulumVis(env: SimEnv, rendering: bool)[source]
Bases:
PandaVis
Visualisation for the PendulumSim class using panda3d
Constructor
- Parameters:
env – environment to visualize
- update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]
Updates the visualization with every call.
- Parameters:
task – Needed by panda3d task manager.
- Return Task.cont:
indicates that task should be called again next frame.
- class QBallBalancerVis(env: SimEnv, rendering: bool)[source]
Bases:
PandaVis
Visualisation for the QBallBalancerSim class using panda3d
Constructor
- Parameters:
env – environment to visualize
- update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]
Updates the visualization with every call.
- Parameters:
task – Needed by panda3d task manager.
- Return Task.cont:
indicates that task should be called again next frame.
- class QCartPoleVis(env: SimEnv, rendering: bool)[source]
Bases:
PandaVis
Visualisation for the QCartPoleSim class using panda3d
Constructor
- Parameters:
env – environment to visualize
- update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]
Updates the visualization with every call.
- Parameters:
task – Needed by panda3d task manager.
- Return Task.cont:
indicates that task should be called again next frame.
- class QQubeVis(env: SimEnv, rendering: bool)[source]
Bases:
PandaVis
Visualisation for the QQubeSim class using panda3d
Constructor
- Parameters:
env – environment to visualize
- update(task: <module 'direct.task.Task' from '/home/user/miniconda3/envs/pyrado/lib/python3.7/site-packages/direct/task/Task.py'>)[source]
Updates the visualization with every call.
- Parameters:
task – Needed by panda3d task manager.
- Return Task.cont:
indicates that task should be called again next frame.
pendulum
- class PendulumSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None, init_state: Optional[ndarray] = None)[source]
Bases:
SimPyEnv
,Serializable
Under-actuated inverted pendulum environment similar to the one from OpenAI Gym
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
init_state – set an pole angle and pole angular velocity for the SingularStateSpace
- classmethod get_nominal_domain_param() dict [source]
Get the nominal a.k.a. default domain parameters.
Note
This function is used to check which domain parameters exist.
- name: str = 'pend'
quanser_ball_balancer
- class QBallBalancerKin(qbb, num_opt_iter=100, render_mode=RenderMode(text=False, video=False, render=False))[source]
Bases:
Serializable
Calculates and visualizes the kinematics from the servo shaft angles (th_x, th_x) to the plate angles (a, b).
Constructor
- Parameters:
qbb – QBallBalancerSim object
num_opt_iter – number of optimizer iterations for the IK
mode – the render mode: a for animating (pyplot), or `` for no animation
- plate_ang(tip)[source]
Compute plate angle (alpha or beta) from the rod tip position which has been calculated from servo shaft angle (th_x or th_y) before. :return tip: 2D position of the rod tip in the sagittal plane (from the optimizer)
- class QBallBalancerSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None, simple_dynamics: bool = False, load_experimental_tholds: bool = True)[source]
Bases:
SimPyEnv
,Serializable
Environment in which a ball rolls on an actuated plate. The ball is randomly initialized on the plate and is to be stabilized on the center of the plate. The problem formulation treats this setup as 2 independent ball-on-beam problems. The plate is actuated via 2 servo motors that lift the plate.
Note
The dynamics are not the same as in the Quanser Workbook (2 DoF Ball-Balancer - Instructor). Here, we added the coriolis forces and linear-viscous friction. However, the 2 dim system is still modeled to be decoupled. This is the case, since the two rods (connected to the servos) are pushing the plate at the center lines. As a result, the angles alpha and beta are w.r.t. to the inertial frame, i.e. they are not 2 sequential rations.
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
simple_dynamics – if `True, use a dynamics model without Coriolis forces and without friction effects
load_experimental_tholds – use the voltage thresholds determined from experiments
- classmethod get_nominal_domain_param() dict [source]
Get the nominal a.k.a. default domain parameters.
Note
This function is used to check which domain parameters exist.
- classmethod get_voltage_tholds(load_experiments: bool = True) dict [source]
If available, the voltage thresholds computed from measurements, else use default values.
- measured_tholds = {'voltage_thold_x_neg': -0.1, 'voltage_thold_x_pos': 0.28, 'voltage_thold_y_neg': -0.074, 'voltage_thold_y_pos': 0.28}
- name: str = 'qbb'
- reset(init_state: Optional[ndarray] = None, domain_param: Optional[dict] = None) ndarray [source]
Reset the environment to its initial state and optionally set different domain parameters.
- Parameters:
init_state – set explicit initial state if not None. Must match init_space if any.
domain_param – set explicit domain parameters if not None
- Return obs:
initial observation of the state.
quanser_cartpole
- class QCartPoleSim(dt: float, max_steps: int, task_args: Optional[dict], long: bool, simple_dynamics: bool, wild_init: bool)[source]
Bases:
SimPyEnv
,Serializable
Base Environment for the Quanser Cart-Pole swing-up and stabilization task
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
long – set to True if using the long pole, else False
simple_dynamics – if True, use the simpler dynamics model from Quanser. If `False, use a dynamics model which includes friction
wild_init – if True the init state space is increased drastically, e.g. the initial pendulum angle can be in \([-\pi, +\pi]\). Only applicable to QCartPoleSwingUpSim.
- classmethod get_nominal_domain_param(long: bool = False) dict [source]
Get the nominal a.k.a. default domain parameters.
Note
This function is used to check which domain parameters exist.
- observe(state) ndarray [source]
Compute the (noise-free) observation from the current state.
Note
This method should be overwritten if the environment has a distinct observation space.
- Parameters:
state – current state of the environment
- Returns:
observation perceived to the agent
- reset(init_state: Optional[ndarray] = None, domain_param: Optional[dict] = None) ndarray [source]
Reset the environment to its initial state and optionally set different domain parameters.
- Parameters:
init_state – set explicit initial state if not None. Must match init_space if any.
domain_param – set explicit domain parameters if not None
- Return obs:
initial observation of the state.
- class QCartPoleStabSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None, long: bool = True, simple_dynamics: bool = True)[source]
Bases:
QCartPoleSim
,Serializable
Environment in which a pole has to be stabilized in the upright position (inverted pendulum) by moving a cart on a rail. The pole can rotate around an axis perpendicular to direction in which the cart moves.
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
long – set to True if using the long pole, else False
simple_dynamics – if True, use the simpler dynamics model from Quanser. If `False, use a dynamics model which includes friction
- name: str = 'qcp-st'
- class QCartPoleSwingUpSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None, long: bool = False, simple_dynamics: bool = False, wild_init: bool = True)[source]
Bases:
QCartPoleSim
,Serializable
Environment in which a pole has to be swung up and stabilized in the upright position (inverted pendulum) by moving a cart on a rail. The pole can rotate around an axis perpendicular to direction in which the cart moves.
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
long – set to True if using the long pole, else False
simple_dynamics – if True, use the simpler dynamics model from Quanser. If False, use a dynamics model which includes friction
wild_init – if True the init state space is increased drastically, e.g. the initial pendulum angle can be in \([-\pi, +\pi]\)
- name: str = 'qcp-su'
quanser_qube
- class QQubeSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]
Bases:
SimPyEnv
,Serializable
Base Environment for the Quanser Qube swing-up and stabilization task
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
- class QQubeStabSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]
Bases:
QQubeSim
Environment which models Quanser’s Furuta pendulum called Quanser Qube. The goal is to stabilize the pendulum at the upright position (alpha = +-pi).
Note
This environment is only for testing purposes, or to find the PD gains for stabilizing the pendulum at the top.
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
- name: str = 'qq-st'
- class QQubeSwingUpSim(dt: float, max_steps: int = inf, task_args: Optional[dict] = None)[source]
Bases:
QQubeSim
Environment which models Quanser’s Furuta pendulum called Quanser Qube. The goal is to swing up the pendulum and stabilize at the upright position (alpha = +-pi).
Constructor
- Parameters:
dt – simulation step size [s]
max_steps – maximum number of simulation steps
task_args – arguments for the task construction
- name: str = 'qq-su'