regression

nonlin_regression

class NonlinRegression(save_dir: ~pyrado.PathLike, inputs: ~torch.Tensor, targets: ~torch.Tensor, policy: ~pyrado.policies.base.Policy, max_iter: int, max_iter_no_improvement: int = 30, optim_class=<class 'torch.optim.adam.Adam'>, optim_hparam: ~typing.Optional[dict] = None, loss_fcn=MSELoss(), batch_size: int = 256, ratio_train: float = 0.8, max_grad_norm: ~typing.Optional[float] = None, lr_scheduler=None, lr_scheduler_hparam: ~typing.Optional[dict] = None, logger: ~typing.Optional[~pyrado.logger.step.StepLogger] = None)[source]

Bases: Algorithm

Train a policy using stochastic gradient descent to approximate the given data.

Constructor

Parameters:
  • save_dir – directory in which to save the snapshots, i.e., the results

  • inputs – input data set, where the samples are along the first dimension

  • targets – target data set, where the samples are along the first dimension

  • policy – Pyrado policy (subclass of PyTorch’s Module) to train

  • max_iter – maximum number of iterations

  • max_iter_no_improvement – if the performance on the validation set did not improve for this many iterations, the policy is considered to have converged, i.e. training stops

  • optim_class – PyTorch optimizer class

  • optim_hparam – hyper-parameters for the PyTorch optimizer

  • loss_fcn – loss function for training, by default torch.nn.MSELoss()

  • batch_size – number of samples per policy update batch

  • ratio_train – ratio of the training samples w.r.t. the total sample count

  • max_grad_norm – maximum L2 norm of the gradients for clipping, set to None to disable gradient clipping

  • lr_scheduler – learning rate scheduler that does one step per epoch (pass through the whole data set)

  • lr_scheduler_hparam – hyper-parameters for the learning rate scheduler

  • logger – logger for every step of the algorithm, if None the default logger will be created
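
The constructor parameters above map directly onto a short training script. A minimal sketch follows, assuming the import path pyrado.algorithms.regression.nonlin_regression, the FNNPolicy/EnvSpec/BoxSpace helpers for building a small feed-forward policy, and the train() loop inherited from the Algorithm base class; all of these names are assumptions and may need to be adapted to the installed pyrado version.

import torch as to
from torch.optim import Adam

from pyrado.algorithms.regression.nonlin_regression import NonlinRegression  # assumed import path
from pyrado.policies.feed_forward.fnn import FNNPolicy  # assumed; any Pyrado Policy works
from pyrado.spaces import BoxSpace  # assumed helper for the policy's input/output spaces
from pyrado.utils.data_types import EnvSpec  # assumed helper

# Hypothetical data set: 1000 samples with 3-dim inputs and 1-dim targets (samples along dim 0)
inputs = to.rand(1000, 3)
targets = to.sum(inputs**2, dim=1, keepdim=True)

# A feed-forward policy whose input and output dimensions match the data
spec = EnvSpec(obs_space=BoxSpace(-1, 1, shape=(3,)), act_space=BoxSpace(-1, 1, shape=(1,)))
policy = FNNPolicy(spec=spec, hidden_sizes=[32, 32], hidden_nonlin=to.tanh)

algo = NonlinRegression(
    save_dir="/tmp/regr_demo",
    inputs=inputs,
    targets=targets,
    policy=policy,
    max_iter=500,
    max_iter_no_improvement=30,
    optim_class=Adam,
    optim_hparam=dict(lr=1e-3),
    batch_size=256,
    ratio_train=0.8,
)
algo.train(snapshot_mode="best")  # train() comes from the Algorithm base class and calls step()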

name: str = 'regr'
reset(seed: Optional[int] = None)[source]

Reset the algorithm to its initial state. This should NOT reset learned policy parameters. By default, this resets the iteration count and the exploration strategy. Be sure to call this function if you override it.

Parameters:

seed – seed value for the random number generators, pass None for no seeding

save_snapshot(meta_info: Optional[dict] = None)[source]

Save the algorithm information (e.g., environment, policy, etc.). Subclasses should call the base method to save the policy.

Parameters:

meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm

step(snapshot_mode: str, meta_info: Optional[dict] = None)[source]

Perform a single iteration of the algorithm. This includes collecting the data, updating the parameters, and adding the metrics of interest to the logger. Does not update the curr_iter attribute.

Parameters:
  • snapshot_mode – determines when the snapshots are stored (e.g. on every iteration or on new highscore)

  • meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm

timeseries_prediction

class TSPred(save_dir: ~pyrado.PathLike, dataset: ~pyrado.utils.data_sets.TimeSeriesDataSet, policy: ~pyrado.policies.base.Policy, max_iter: int, windowed: bool = False, cascaded: bool = False, optim_class=<class 'torch.optim.adam.Adam'>, optim_hparam: ~typing.Optional[dict] = None, loss_fcn=MSELoss(), lr_scheduler=None, lr_scheduler_hparam: ~typing.Optional[dict] = None, logger: ~typing.Optional[~pyrado.logger.step.StepLogger] = None)[source]

Bases: Algorithm

Train a policy to predict a time series of data.

Constructor

Parameters:
  • save_dir – directory in which to save the snapshots, i.e., the results

  • dataset – complete data set, where the samples are along the first dimension

  • policy – Pyrado policy (subclass of PyTorch’s Module) to train

  • max_iter – maximum number of iterations

  • windowed – if True, one fixed-length (short) input sequence is provided to the policy, which then predicts one sample; else the complete (long) input sequence is fed to the policy, which then predicts a sequence of samples of equal length

  • cascaded – if True, the predictions are made based on previous predictions instead of the current input

  • optim_class – PyTorch optimizer class

  • optim_hparam – hyper-parameters for the PyTorch optimizer

  • loss_fcn – loss function for training, by default torch.nn.MSELoss()

  • lr_scheduler – learning rate scheduler that does one step per epoch (pass through the whole data set)

  • lr_scheduler_hparam – hyper-parameters for the learning rate scheduler

  • logger – logger for every step of the algorithm, if None the default logger will be created
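
As for NonlinRegression above, the constructor parameters can be wired up in a few lines. The sketch below is illustrative only: the import path pyrado.algorithms.regression.timeseries_prediction, the TimeSeriesDataSet constructor arguments, and the recurrent RNNPolicy are assumptions to be checked against the installed pyrado version.

import torch as to

from pyrado.algorithms.regression.timeseries_prediction import TSPred  # assumed import path
from pyrado.policies.recurrent.rnn import RNNPolicy  # assumed; any recurrent Pyrado Policy works
from pyrado.spaces import BoxSpace  # assumed
from pyrado.utils.data_sets import TimeSeriesDataSet
from pyrado.utils.data_types import EnvSpec  # assumed

# Hypothetical 1-dim time series with the samples along the first dimension
data = to.sin(to.linspace(0, 20, 1000)).unsqueeze(1)
dataset = TimeSeriesDataSet(data=data, window_size=50, ratio_train=0.8)  # assumed constructor arguments

# A recurrent policy that maps one (1-dim) sample to the next
spec = EnvSpec(obs_space=BoxSpace(-1, 1, shape=(1,)), act_space=BoxSpace(-1, 1, shape=(1,)))
policy = RNNPolicy(spec=spec, hidden_size=20, num_recurrent_layers=1)

algo = TSPred(
    save_dir="/tmp/tspred_demo",
    dataset=dataset,
    policy=policy,
    max_iter=1000,
    windowed=True,   # feed fixed-length windows and predict one sample per window
    cascaded=False,  # predict from the true inputs, not from previous predictions
)
algo.train(snapshot_mode="best")  # train() comes from the Algorithm base class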

static evaluate(policy: Policy, inps: Tensor, targs: Tensor, windowed: bool, cascaded: bool, num_init_samples: int, hidden: Optional[Tensor] = None, loss_fcn=MSELoss(), verbose: bool = True)[source]
load_snapshot(parsed_args) → Tuple[Env, Policy, dict][source]

Load the state of an experiment, which is specific to the algorithm.

Parameters:

parsed_args – arguments parsed by the argparser

Returns:

environment, policy, and (optional) algorithm-specific output, e.g. value function

name: str = 'tspred'
static predict(policy: ~pyrado.policies.base.Policy, inp_seq: ~torch.Tensor, windowed: bool, cascaded: bool, hidden: ~typing.Optional[~torch.Tensor] = None) -> (<class 'torch.Tensor'>, <class 'torch.Tensor'>)[source]

Reset the hidden states and predict one output given an arbitrarily long sequence of inputs.

Parameters:
  • policy – policy used to make the predictions

  • inp_seq – input sequence

  • hidden – initial hidden states, pass None to let the network pick its default hidden state

  • windowed – if True, one fixed-length (short) input sequence is provided to the policy, which then predicts one sample; else the complete (long) input sequence is fed to the policy, which then predicts a sequence of samples of equal length

  • cascaded – if True, the predictions are made based on previous predictions instead of the current input

Returns:

predicted output and latest hidden state
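
Because predict() is a static method, it can also be used stand-alone after training, e.g. to roll a trained recurrent policy over a new input sequence and to chain calls via the returned hidden state. A brief sketch, reusing the TSPred import and the assumed policy from the constructor example above, with a hypothetical input sequence:

import torch as to

# Hypothetical new input sequence, samples along the first dimension
inp_seq = to.sin(to.linspace(0, 5, 200)).unsqueeze(1)

# Full-sequence (non-windowed), non-cascaded prediction from the policy's default hidden state
pred, hidden = TSPred.predict(policy, inp_seq, windowed=False, cascaded=False, hidden=None)

# Continue from where the previous call left off by passing the returned hidden state back in
next_pred, hidden = TSPred.predict(policy, inp_seq, windowed=False, cascaded=False, hidden=hidden)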

save_snapshot(meta_info: Optional[dict] = None)[source]

Save the algorithm information (e.g., environment, policy, etc.). Subclasses should call the base method to save the policy.

Parameters:

meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm

step(snapshot_mode: str, meta_info: Optional[dict] = None)[source]

Perform a single iteration of the algorithm. This includes collecting the data, updating the parameters, and adding the metrics of interest to the logger. Does not update the curr_iter attribute.

Parameters:
  • snapshot_mode – determines when the snapshots are stored (e.g. on every iteration or on new highscore)

  • meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm

Module contents