Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pytorch schedule learning rate

I am trying to re-implement one paper, which suggests to adjust the learning rate as below:

The learning rate is decreased by a factor of the regression value with patience epochs 10 on the change value of 0.0001.

Should I use the torch.optim.lr_scheduler.ReduceLROnPlateau()?

I am not sure what value should I pass to each parameter.

  1. Is the change value in the statement denotes to the parameter threshold?

  2. Is the factor in the statement denotes to the parameter factor?

like image 409
Yan-Jen Huang Avatar asked Apr 15 '26 16:04

Yan-Jen Huang


2 Answers

torch.optim.lr_scheduler.ReduceLROnPlateau is indeed what you are looking for. I summarized all of the important stuff for you.

mode=min: lr will be reduced when the quantity monitored has stopped decreasing

factor: factor by which the learning rate will be reduced

patience: number of epochs with no improvement after which learning rate will be reduced

threshold: threshold for measuring the new optimum, to only focus on significant changes (change value). Say we have threshold=0.0001, if loss is 18.0 on epoch n and loss is 17.9999 on epoch n+1 then we have met our criteria to multiply the current learning rate by the factor.

criterion = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min',
    factor=0.1, patience=10, threshold=0.0001, threshold_mode='abs')

for epoch in range(20):
    # training loop stuff
    loss = criterion(...)
    scheduler.step(loss)

You can check more details in the documentation: https://pytorch.org/docs/stable/optim.html#torch.optim.lr_scheduler.ReduceLROnPlateau

Pytorch has many ways to let you reduce the learning rate. It is quite well explained here:

https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate

@Antonino DiMaggio explained ReduceOnPlateau quite well. I just want to complement the answer to reply to the comment of @Yan-JenHuang:

Is it possible to decrease the learning_rate by minus a constant value instead by a factor?

First of all, you should be very careful to avoid negative values of lr! Second, subtracting a value of the learning rate is not common practice. But in any case...

You have first to make a custom lr scheduler (I modified the code of LambdaLR https://pytorch.org/docs/stable/_modules/torch/optim/lr_scheduler.html#LambdaLR):

torch.optim.lr_scheduler import _LRScheduler

class SubtractLR(_LRScheduler):
    def __init__(self, optimizer, lr_lambda, last_epoch=-1, min_lr=e-6):
        self.optimizer = optimizer
        self.min_lr = min_lr  # min learning rate > 0 

        if not isinstance(lr_lambda, list) and not isinstance(lr_lambda, tuple):
            self.lr_lambdas = [lr_lambda] * len(optimizer.param_groups)
        else:
            if len(lr_lambda) != len(optimizer.param_groups):
                raise ValueError("Expected {} lr_lambdas, but got {}".format(
                    len(optimizer.param_groups), len(lr_lambda)))
            self.lr_lambdas = list(lr_lambda)
        self.last_epoch = last_epoch
        super(LambdaLR, self).__init__(optimizer, last_epoch)

    def get_lr(self):
        if not self._get_lr_called_within_step:
            warnings.warn("To get the last learning rate computed by the scheduler, "
                          "please use `get_last_lr()`.")

        return [(max(base_lr - lmbda(self.last_epoch), self.min_lr)
                for lmbda, base_lr in zip(self.lr_lambdas, self.base_lrs)] # reduces the learning rate

Than you can use it in your training.

 lambda1 = lambda epoch: e-4 # constant to subtract from lr
 scheduler = SubtractLR(optimizer, lr_lambda=[lambda1])
 for epoch in range(100):
     train(...)
     validate(...)
     scheduler.step()
 lambda1 = lambda epoch: epoch * e-6 # increases the value to subtract lr proportionally to the epoch
 scheduler = SubtractLR(optimizer, lr_lambda=[lambda1])
 for epoch in range(100):
     train(...)
     validate(...)
     scheduler.step()

You can also modify the code of ReduceLROnPlateau to subtract the learning rate instead of mutiplying it. Your should change this line new_lr = max(old_lr * self.factor, self.min_lrs[i]) to something like new_lr = max(old_lr - self.factor, self.min_lrs[i]). You can take a look at the code yourself: https://pytorch.org/docs/stable/_modules/torch/optim/lr_scheduler.html#ReduceLROnPlateau

like image 33
Victor Zuanazzi Avatar answered Apr 19 '26 16:04

Victor Zuanazzi



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!