 

Learning rate of custom training loop for tensorflow 2.0

Are there any functions or methods that can show the learning rate when I use a TensorFlow 2.0 custom training loop?

Here is an example from the TensorFlow guide:

def train_step(images, labels):
  with tf.GradientTape() as tape:
    predictions = model(images)
    loss = loss_object(labels, predictions)
  gradients = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))

  train_loss(loss)
  train_accuracy(labels, predictions)

How can I retrieve the current learning rate from the optimizer when the model is training?

I will be grateful for any help you can provide. :)

asked Sep 28 '19 by yun

People also ask

How long does it take to train a TensorFlow model?

Training usually takes between 2 and 8 hours, depending on the number of files and on how many models are queued for training.

How do I find the best learning rate in TensorFlow?

The technique can be described as follows: start with a very low learning rate, e.g. 1e-7. After each batch, increase the learning rate and record the loss and the learning rate. Stop when a very high learning rate (10 or more) is reached, or when the loss explodes. A value just before the loss starts to diverge is usually a good choice.
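Below is a minimal sketch of that sweep in a custom training loop, assuming model, loss_object, and a dataset of (images, labels) batches are already defined; the start value and growth factor are only illustrative choices.

import numpy as np
import tensorflow as tf

# Learning-rate range test sketch: sweep the LR upward and record the loss.
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-7)
lrs, losses = [], []
lr, growth = 1e-7, 1.1  # start very low, multiply after each batch

for images, labels in dataset:
  optimizer.learning_rate.assign(lr)
  with tf.GradientTape() as tape:
    loss = loss_object(labels, model(images, training=True))
  gradients = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))

  lrs.append(lr)
  losses.append(float(loss))
  lr *= growth
  if lr > 10 or np.isnan(losses[-1]):  # stop at a very high LR or when the loss explodes
    break

# Plot losses against lrs on a log scale and pick a value just before the loss diverges.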

What is learning rate in TensorFlow?

The learning rate is a hyperparameter that controls how much to change the model in response to the estimated error each time the model weights are updated.
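For instance, in Keras the learning rate is typically set when the optimizer is constructed; the value below is only an illustration.

import tensorflow as tf

# 1e-3 is an example value, not a recommendation.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)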




2 Answers

In TensorFlow 2.1, the Optimizer class has an undocumented method _decayed_lr (see definition here), which you can call in the training loop by supplying the dtype to cast the result to:

current_learning_rate = optimizer._decayed_lr(tf.float32)

Here's a more complete example with TensorBoard too.

train_step_count = 0
summary_writer = tf.summary.create_file_writer('logs/')

def train_step(images, labels):
  global train_step_count  # needed because we assign to it inside this function
  train_step_count += 1
  with tf.GradientTape() as tape:
    predictions = model(images)
    loss = loss_object(labels, predictions)
  gradients = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))

  # optimizer._decayed_lr(tf.float32) is the current Learning Rate.
  # You can save it to TensorBoard like so:
  with summary_writer.as_default():
    tf.summary.scalar('learning_rate',
                      optimizer._decayed_lr(tf.float32),
                      step=train_step_count)
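You can then watch the learning_rate curve during training by pointing TensorBoard at the log directory, e.g. tensorboard --logdir logs/.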
answered by P Shved

In a custom training loop, you can call print(optimizer.lr.numpy()) to get the current learning rate (this works when the learning rate is a plain value; if you passed a LearningRateSchedule, use the _decayed_lr approach above instead).

If you are using keras api, you can define your own callback that records the current learning rate.

from tensorflow.keras.callbacks import Callback

class LRRecorder(Callback):
    """Record current learning rate. """
    def on_epoch_begin(self, epoch, logs=None):
        lr = self.model.optimizer.lr
        print("The current learning rate is {}".format(lr.numpy()))

# your other callbacks 
callbacks.append(LRRecorder())
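You can then pass it to fit along with your other callbacks; a minimal usage sketch, where x_train and y_train are placeholders:

model.fit(x_train, y_train, epochs=5, callbacks=callbacks)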

Update

Adam's update is roughly

w := w - base_lr * m / (sqrt(v) + eps) = w - act_lr * m,   where act_lr = base_lr / (sqrt(v) + eps)

The learning rate retrieved above is base_lr, but the effective step size act_lr changes adaptively during training. Taking the Adam optimizer as an example, act_lr depends on base_lr and v, where m and v are the first and second moments of the gradients, and every parameter has its own m and v. So if you want to know act_lr, you need to look at the optimizer slots of a specific variable. For example, to inspect the m and v slots of the variable Adam/dense/kernel, you can access them like this:

for var in optimizer.variables():
  if 'Adam/dense/kernel/m' in var.name:
    print(var.name, var.numpy())

  if 'Adam/dense/kernel/v' in var.name:
    print(var.name, var.numpy())

Then you can calculate act_lr with the formula above.
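Here is a rough sketch of that calculation, following the slot names from the loop above; it uses Adam's default epsilon of 1e-7 and ignores bias correction, so treat it as an approximation.

import tensorflow as tf

m = v = None
for var in optimizer.variables():
  if 'Adam/dense/kernel/m' in var.name:
    m = var
  if 'Adam/dense/kernel/v' in var.name:
    v = var

base_lr = optimizer._decayed_lr(tf.float32)
# Per-element effective learning rate for this variable (act_lr in the formula above).
act_lr = base_lr / (tf.sqrt(v) + 1e-7)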

answered by zihaozhihao