regression
nonlin_regression
- class NonlinRegression(save_dir: pyrado.PathLike, inputs: torch.Tensor, targets: torch.Tensor, policy: pyrado.policies.base.Policy, max_iter: int, max_iter_no_improvement: int = 30, optim_class=torch.optim.Adam, optim_hparam: Optional[dict] = None, loss_fcn=MSELoss(), batch_size: int = 256, ratio_train: float = 0.8, max_grad_norm: Optional[float] = None, lr_scheduler=None, lr_scheduler_hparam: Optional[dict] = None, logger: Optional[pyrado.logger.step.StepLogger] = None)[source]
Bases: Algorithm
Train a policy using stochastic gradient descent to approximate the given data.
Constructor
- Parameters:
save_dir – directory to save the snapshots, i.e. the results, in
inputs – input data set, where the samples are along the first dimension
targets – target data set, where the samples are along the first dimension
policy – Pyrado policy (subclass of PyTorch’s Module) to train
max_iter – maximum number of iterations
max_iter_no_improvement – if the performance on the validation set did not improve for this many iterations, the policy is considered to have converged, i.e. training stops
optim_class – PyTorch optimizer class
optim_hparam – hyper-parameters for the PyTorch optimizer
loss_fcn – loss function for training, by default torch.nn.MSELoss()
batch_size – number of samples per policy update batch
ratio_train – ratio of the training samples w.r.t. the total sample count
max_grad_norm – maximum L2 norm of the gradients for clipping, set to None to disable gradient clipping
lr_scheduler – learning rate scheduler that does one step per epoch (i.e. one pass through the whole data set)
lr_scheduler_hparam – hyper-parameters for the learning rate scheduler
logger – logger for every step of the algorithm, if None the default logger will be created
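A minimal usage sketch follows. Only the NonlinRegression constructor signature above is taken from this documentation; the import paths, the FNNPolicy/EnvSpec/BoxSpace construction, and the train() call inherited from Algorithm are assumptions about the surrounding Pyrado API.

import torch
import torch.nn as tnn

# Assumed import paths; adjust to the actual Pyrado module layout if they differ
from pyrado.algorithms.regression.nonlin_regression import NonlinRegression
from pyrado.policies.feed_forward.fnn import FNNPolicy
from pyrado.spaces import BoxSpace
from pyrado.utils.data_types import EnvSpec

# Synthetic 1-dim regression data; samples are along the first dimension
inputs = torch.linspace(-1.0, 1.0, 1000).unsqueeze(1)
targets = torch.sin(3.0 * inputs) + 0.1 * torch.randn_like(inputs)

# A small feed-forward Pyrado policy mapping 1-dim inputs to 1-dim outputs
# (policy class and constructor arguments are assumptions)
spec = EnvSpec(obs_space=BoxSpace(-1.0, 1.0, shape=1), act_space=BoxSpace(-1.0, 1.0, shape=1))
policy = FNNPolicy(spec, hidden_sizes=[32, 32], hidden_nonlin=torch.tanh)

algo = NonlinRegression(
    save_dir="/tmp/regr_demo",
    inputs=inputs,
    targets=targets,
    policy=policy,
    max_iter=1000,
    max_iter_no_improvement=30,
    optim_hparam=dict(lr=1e-3),
    loss_fcn=tnn.MSELoss(),
    batch_size=256,
    ratio_train=0.8,
)
algo.train(snapshot_mode="best")  # assumed entry point inherited from Algorithm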
- name: str = 'regr'
- reset(seed: Optional[int] = None)[source]
Reset the algorithm to its initial state. This should NOT reset learned policy parameters. By default, this resets the iteration count and the exploration strategy. Be sure to call this function if you override it.
- Parameters:
seed – seed value for the random number generators, pass None for no seeding
- save_snapshot(meta_info: Optional[dict] = None)[source]
Save the algorithm information (e.g., environment, policy, etc.). Subclasses should call the base method to save the policy.
- Parameters:
meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm
- step(snapshot_mode: str, meta_info: Optional[dict] = None)[source]
Perform a single iteration of the algorithm. This includes collecting the data, updating the parameters, and adding the metrics of interest to the logger. Does not update the curr_iter attribute.
- Parameters:
snapshot_mode – determines when the snapshots are stored (e.g. on every iteration or only on a new high score)
meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm
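In practice, step() is driven by the train() loop inherited from Algorithm. The following rough sketch only illustrates that relationship; the early-stopping behaviour and the handling of curr_iter are assumptions, only step() itself is documented above.

# Hedged sketch of the outer loop that train() is assumed to run; note that step()
# itself does not advance curr_iter, so train() is assumed to do that in addition
for it in range(algo.max_iter):
    algo.step(snapshot_mode="best")  # one iteration: update parameters and log metrics
    # early stopping based on max_iter_no_improvement is assumed to be handled
    # by the algorithm / the inherited train() loop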
timeseries_prediction
- class TSPred(save_dir: pyrado.PathLike, dataset: pyrado.utils.data_sets.TimeSeriesDataSet, policy: pyrado.policies.base.Policy, max_iter: int, windowed: bool = False, cascaded: bool = False, optim_class=torch.optim.Adam, optim_hparam: Optional[dict] = None, loss_fcn=MSELoss(), lr_scheduler=None, lr_scheduler_hparam: Optional[dict] = None, logger: Optional[pyrado.logger.step.StepLogger] = None)[source]
Bases: Algorithm
Train a policy to predict a time series of data.
Constructor
- Parameters:
save_dir – directory to save the snapshots, i.e. the results, in
dataset – complete data set, where the samples are along the first dimension
policy – Pyrado policy (subclass of PyTorch’s Module) to train
max_iter – maximum number of iterations
windowed – if True, a fixed-length (short) input sequence is provided to the policy, which then predicts one sample; else, the complete (long) input sequence is fed to the policy, which then predicts a sequence of samples of equal length
cascaded – if True, the predictions are made based on the previous predictions instead of the current input
optim_class – PyTorch optimizer class
optim_hparam – hyper-parameters for the PyTorch optimizer
loss_fcn – loss function for training, by default torch.nn.MSELoss()
lr_scheduler – learning rate scheduler that does one step per epoch (i.e. one pass through the whole data set)
lr_scheduler_hparam – hyper-parameters for the learning rate scheduler
logger – logger for every step of the algorithm, if None the default logger will be created
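A minimal setup sketch, analogous to the one for NonlinRegression above. The TimeSeriesDataSet and RNNPolicy constructor arguments, the import paths, and the train() call are assumptions; only the TSPred constructor signature is documented here.

import torch

# Assumed import paths; adjust to the actual Pyrado module layout if they differ
from pyrado.algorithms.regression.timeseries_prediction import TSPred
from pyrado.policies.recurrent.rnn import RNNPolicy
from pyrado.spaces import BoxSpace
from pyrado.utils.data_sets import TimeSeriesDataSet
from pyrado.utils.data_types import EnvSpec

# A scalar time series with samples along the first dimension
data = torch.sin(torch.linspace(0.0, 20.0, 2000)).unsqueeze(1)
dataset = TimeSeriesDataSet(data, window_size=50, ratio_train=0.8)  # assumed constructor arguments

# A small recurrent Pyrado policy matching the data dimension (assumed constructor arguments)
spec = EnvSpec(obs_space=BoxSpace(-1.0, 1.0, shape=1), act_space=BoxSpace(-1.0, 1.0, shape=1))
policy = RNNPolicy(spec, hidden_size=32, num_recurrent_layers=1)

algo = TSPred(
    save_dir="/tmp/tspred_demo",
    dataset=dataset,
    policy=policy,
    max_iter=500,
    windowed=True,   # predict one sample from a fixed-length window
    cascaded=False,  # condition on the true inputs, not on previous predictions
    optim_hparam=dict(lr=1e-3),
)
algo.train(snapshot_mode="best")  # assumed entry point inherited from Algorithm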
- static evaluate(policy: Policy, inps: Tensor, targs: Tensor, windowed: bool, cascaded: bool, num_init_samples: int, hidden: Optional[Tensor] = None, loss_fcn=MSELoss(), verbose: bool = True)[source]
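A hedged usage sketch for evaluate(); its behaviour is not described here, so the return value is assumed to be the scalar loss of the predictions w.r.t. the targets under loss_fcn, and num_init_samples is assumed to only warm up the hidden state.

test_inps = torch.randn(200, 1)   # placeholder held-out inputs
test_targs = torch.randn(200, 1)  # placeholder held-out targets

test_loss = TSPred.evaluate(
    policy,
    inps=test_inps,
    targs=test_targs,
    windowed=False,
    cascaded=False,
    num_init_samples=5,  # assumed: samples used only to initialize the hidden state
    verbose=True,
)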
- load_snapshot(parsed_args) -> Tuple[Env, Policy, dict][source]
Load the state of an experiment, which is specific to the algorithm.
- Parameters:
parsed_args – arguments parsed by the argparser
- Returns:
environment, policy, and (optionally) algorithm-specific output, e.g. a value function
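A brief usage sketch; the get_argparser helper is an assumption about Pyrado's experiment scripts (e.g. providing the experiment directory), only load_snapshot itself is documented above.

from pyrado.utils.argparser import get_argparser  # assumed import path

parsed_args = get_argparser().parse_args()
env, policy, extra = algo.load_snapshot(parsed_args)  # extra may hold algorithm-specific outputs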
- name: str = 'tspred'
- static predict(policy: pyrado.policies.base.Policy, inp_seq: torch.Tensor, windowed: bool, cascaded: bool, hidden: Optional[torch.Tensor] = None) -> Tuple[torch.Tensor, torch.Tensor][source]
Reset the hidden states and predict one output given an arbitrarily long sequence of inputs.
- Parameters:
policy – policy used to make the predictions
inp_seq – input sequence
hidden – initial hidden states, pass None to let the network pick its default hidden state
windowed – if True, a fixed-length (short) input sequence is provided to the policy, which then predicts one sample; else, the complete (long) input sequence is fed to the policy, which then predicts a sequence of samples of equal length
cascaded – if True, the predictions are made based on the previous predictions instead of the current input
- Returns:
predicted output and latest hidden state
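A short usage sketch for predict(); only the signature and parameter semantics above are documented, the input tensors are placeholders.

inp_seq = torch.randn(500, 1)  # placeholder test sequence
inp_window = inp_seq[-50:]     # placeholder fixed-length window

# Predict over the full (long) input sequence in one call (windowed=False);
# the returned hidden state can seed a subsequent call for streaming prediction
pred_seq, hidden = TSPred.predict(policy, inp_seq, windowed=False, cascaded=False, hidden=None)

# Windowed, cascaded prediction: feed a fixed-length window and let the policy
# condition on its own previous predictions instead of the current input
pred_next, hidden = TSPred.predict(policy, inp_window, windowed=True, cascaded=True, hidden=hidden)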
- save_snapshot(meta_info: Optional[dict] = None)[source]
Save the algorithm information (e.g., environment, policy, etc.). Subclasses should call the base method to save the policy.
- Parameters:
meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm
- step(snapshot_mode: str, meta_info: Optional[dict] = None)[source]
Perform a single iteration of the algorithm. This includes collecting the data, updating the parameters, and adding the metrics of interest to the logger. Does not update the curr_iter attribute.
- Parameters:
snapshot_mode – determines when the snapshots are stored (e.g. on every iteration or only on a new high score)
meta_info – is not None if this algorithm is run as a subroutine of a meta-algorithm, contains a dict of information about the current iteration of the meta-algorithm