regression

nonlin_regression

class NonlinRegression(save_dir: ~pyrado.PathLike, inputs: ~torch.Tensor, targets: ~torch.Tensor, policy: ~pyrado.policies.base.Policy, max_iter: int, max_iter_no_improvement: int = 30, optim_class=<class 'torch.optim.adam.Adam'>, optim_hparam: ~typing.Optional[dict] = None, loss_fcn=MSELoss(), batch_size: int = 256, ratio_train: float = 0.8, max_grad_norm: ~typing.Optional[float] = None, lr_scheduler=None, lr_scheduler_hparam: ~typing.Optional[dict] = None, logger: ~typing.Optional[~pyrado.logger.step.StepLogger] = None)[source]

Bases: Algorithm

Train a policy using stochastic gradient descent to approximate the given data.

Constructor

Parameters:
  • save_dir – directory in which to save the snapshots, i.e., the results

  • inputs – input data set, where the samples are along the first dimension

  • targets – target data set, where the samples are along the first dimension

  • policy – Pyrado policy (subclass of PyTorch’s Module) to train

  • max_iter – maximum number of iterations

  • max_iter_no_improvement – if the performance on the validation set did not improve for this many iterations, the policy is considered to have converged, i.e. training stops

  • optim_class – PyTorch optimizer class

  • optim_hparam – hyper-parameters for the PyTorch optimizer

  • loss_fcn – loss function for training, by default torch.nn.MSELoss()

  • batch_size – number of samples per policy update batch

  • ratio_train – ratio of the training samples w.r.t. the total sample count

  • max_grad_norm – maximum L2 norm of the gradients for clipping, set to None to disable gradient clipping

  • lr_scheduler – learning rate scheduler that does one step per epoch (pass through the whole data set)

  • lr_scheduler_hparam – hyper-parameters for the learning rate scheduler

  • logger – logger for every step of the algorithm, if None the default logger will be created
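
The constructor parameters above map directly onto a short training script. A minimal sketch follows, assuming the import path pyrado.algorithms.regression.nonlin_regression, the FNNPolicy/EnvSpec/BoxSpace helpers for building a small feed-forward policy, and the train() loop inherited from the Algorithm base class; all of these names are assumptions and may need to be adapted to the installed pyrado version.

import torch as to
from torch.optim import Adam

from pyrado.algorithms.regression.nonlin_regression import NonlinRegression  # assumed import path
from pyrado.policies.feed_forward.fnn import FNNPolicy  # assumed; any Pyrado Policy works
from pyrado.spaces import BoxSpace  # assumed helper for the policy's input/output spaces
from pyrado.utils.data_types import EnvSpec  # assumed helper

# Hypothetical data set: 1000 samples with 3-dim inputs and 1-dim targets (samples along dim 0)
inputs = to.rand(1000, 3)
targets = to.sum(inputs**2, dim=1, keepdim=True)

# A feed-forward policy whose input and output dimensions match the data
spec = EnvSpec(obs_space=BoxSpace(-1, 1, shape=(3,)), act_space=BoxSpace(-1, 1, shape=(1,)))
policy = FNNPolicy(spec=spec, hidden_sizes=[32, 32], hidden_nonlin=to.tanh)

algo = NonlinRegression(
    save_dir="/tmp/regr_demo",
    inputs=inputs,
    targets=targets,
    policy=policy,
    max_iter=500,
    max_iter_no_improvement=30,
    optim_class=Adam,
    optim_hparam=dict(lr=1e-3),
    batch_size=256,
    ratio_train=0.8,
)
algo.train(snapshot_mode="best")  # train() comes from the Algorithm base class and calls step()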

name: str = 'regr'
reset(seed: Optional[int] = None)[source]

Reset the algorithm to its initial state. This should NOT reset learned policy parameters. By default, this resets the iteration count and the exploration strategy. Be sure to call this function if you override it.

Parameters:

seed – seed value for the random number generators, pass None for no seeding

save_snapshot(meta_info: Optional[dict] = None)[source]

Save the algorithm information (e.g., environment, policy, etc.). Subclasses should call the base method to save the policy.

Parameters:

meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm

step(snapshot_mode: str, meta_info: Optional[dict] = None)[source]

Perform a single iteration of the algorithm. This includes collecting the data, updating the parameters, and adding the metrics of interest to the logger. Does not update the curr_iter attribute.

Parameters:
  • snapshot_mode – determines when the snapshots are stored (e.g. on every iteration or on new highscore)

  • meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm

timeseries_prediction

class TSPred(save_dir: ~pyrado.PathLike, dataset: ~pyrado.utils.data_sets.TimeSeriesDataSet, policy: ~pyrado.policies.base.Policy, max_iter: int, windowed: bool = False, cascaded: bool = False, optim_class=<class 'torch.optim.adam.Adam'>, optim_hparam: ~typing.Optional[dict] = None, loss_fcn=MSELoss(), lr_scheduler=None, lr_scheduler_hparam: ~typing.Optional[dict] = None, logger: ~typing.Optional[~pyrado.logger.step.StepLogger] = None)[source]

Bases: Algorithm

Train a policy to predict a time series of data.

Constructor

Parameters:
  • save_dir – directory in which to save the snapshots, i.e., the results

  • dataset – complete data set, where the samples are along the first dimension

  • policy – Pyrado policy (subclass of PyTorch’s Module) to train

  • max_iter – maximum number of iterations

  • windowed – if True, one fixed-length (short) input sequence is provided to the policy, which then predicts one sample; else the complete (long) input sequence is fed to the policy, which then predicts a sequence of samples of equal length

  • cascaded – if True, the predictions are made based on previous predictions instead of the current input

  • optim_class – PyTorch optimizer class

  • optim_hparam – hyper-parameters for the PyTorch optimizer

  • loss_fcn – loss function for training, by default torch.nn.MSELoss()

  • lr_scheduler – learning rate scheduler that does one step per epoch (pass through the whole data set)

  • lr_scheduler_hparam – hyper-parameters for the learning rate scheduler

  • logger – logger for every step of the algorithm, if None the default logger will be created
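
As for NonlinRegression above, the constructor parameters can be wired up in a few lines. The sketch below is illustrative only: the import path pyrado.algorithms.regression.timeseries_prediction, the TimeSeriesDataSet constructor arguments, and the recurrent RNNPolicy are assumptions to be checked against the installed pyrado version.

import torch as to

from pyrado.algorithms.regression.timeseries_prediction import TSPred  # assumed import path
from pyrado.policies.recurrent.rnn import RNNPolicy  # assumed; any recurrent Pyrado Policy works
from pyrado.spaces import BoxSpace  # assumed
from pyrado.utils.data_sets import TimeSeriesDataSet
from pyrado.utils.data_types import EnvSpec  # assumed

# Hypothetical 1-dim time series with the samples along the first dimension
data = to.sin(to.linspace(0, 20, 1000)).unsqueeze(1)
dataset = TimeSeriesDataSet(data=data, window_size=50, ratio_train=0.8)  # assumed constructor arguments

# A recurrent policy that maps one (1-dim) sample to the next
spec = EnvSpec(obs_space=BoxSpace(-1, 1, shape=(1,)), act_space=BoxSpace(-1, 1, shape=(1,)))
policy = RNNPolicy(spec=spec, hidden_size=20, num_recurrent_layers=1)

algo = TSPred(
    save_dir="/tmp/tspred_demo",
    dataset=dataset,
    policy=policy,
    max_iter=1000,
    windowed=True,   # feed fixed-length windows and predict one sample per window
    cascaded=False,  # predict from the true inputs, not from previous predictions
)
algo.train(snapshot_mode="best")  # train() comes from the Algorithm base class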

static evaluate(policy: Policy, inps: Tensor, targs: Tensor, windowed: bool, cascaded: bool, num_init_samples: int, hidden: Optional[Tensor] = None, loss_fcn=MSELoss(), verbose: bool = True)[source]
load_snapshot(parsed_args) → Tuple[Env, Policy, dict][source]

Load the state of an experiment, which is specific to the algorithm.

Parameters:

parsed_args – arguments parsed by the argparser

Returns:

environment, policy, and (optional) algorithm-specific output, e.g. value function

name: str = 'tspred'
static predict(policy: ~pyrado.policies.base.Policy, inp_seq: ~torch.Tensor, windowed: bool, cascaded: bool, hidden: ~typing.Optional[~torch.Tensor] = None) -> (<class 'torch.Tensor'>, <class 'torch.Tensor'>)[source]

Reset the hidden states and predict one output given an arbitrarily long sequence of inputs.

Parameters:
  • policy – policy used to make the predictions

  • inp_seq – input sequence

  • hidden – initial hidden states, pass None to let the network pick its default hidden state

  • windowed – if True, one fixed-length (short) input sequence is provided to the policy, which then predicts one sample; else the complete (long) input sequence is fed to the policy, which then predicts a sequence of samples of equal length

  • cascaded – if True, the predictions are made based on previous predictions instead of the current input

Returns:

predicted output and latest hidden state
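
Because predict() is a static method, it can also be used stand-alone after training, e.g. to roll a trained recurrent policy over a new input sequence and to chain calls via the returned hidden state. A brief sketch, reusing the TSPred import and the assumed policy from the constructor example above, with a hypothetical input sequence:

import torch as to

# Hypothetical new input sequence, samples along the first dimension
inp_seq = to.sin(to.linspace(0, 5, 200)).unsqueeze(1)

# Full-sequence (non-windowed), non-cascaded prediction from the policy's default hidden state
pred, hidden = TSPred.predict(policy, inp_seq, windowed=False, cascaded=False, hidden=None)

# Continue from where the previous call left off by passing the returned hidden state back in
next_pred, hidden = TSPred.predict(policy, inp_seq, windowed=False, cascaded=False, hidden=hidden)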

save_snapshot(meta_info: Optional[dict] = None)[source]

Save the algorithm information (e.g., environment, policy, etc.). Subclasses should call the base method to save the policy.

Parameters:

meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm

step(snapshot_mode: str, meta_info: Optional[dict] = None)[source]

Perform a single iteration of the algorithm. This includes collecting the data, updating the parameters, and adding the metrics of interest to the logger. Does not update the curr_iter attribute.

Parameters:
  • snapshot_mode – determines when the snapshots are stored (e.g. on every iteration or on new highscore)

  • meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm

Module contents