 

Tensorflow Adam optimizer vs Keras Adam optimizer

I originally developed a classifier in Keras, where it was very easy to apply decay to my optimizer.

adam = keras.optimizers.Adam(decay=0.001)

Recently I tried to port the entire code to pure TensorFlow, but I cannot figure out how to correctly apply the same decay mechanism to my optimizer.

optimizer = tf.train.AdamOptimizer()
train_op = optimizer.minimize(loss=loss,global_step=tf.train.get_global_step())

How do I apply the same learning rate decay seen in my Keras code snippet to my Tensorflow snippet?

asked Jan 08 '19 by chattrat423

People also ask

What is Adam Optimizer in Tensorflow?

Adam optimization is a stochastic gradient descent method that is based on adaptive estimation of first-order and second-order moments.
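
Concretely, here is a minimal plain-Python sketch of one Adam update (illustrative only; the function name is made up, and the defaults mirror the usual textbook description, not any particular library's internals):

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # t is the 1-based step count; m and v start at 0
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero init
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (v_hat ** 0.5 + eps)
    return param, m, v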

Which Optimizer is best for CNN?

In one reported benchmark, the Adam optimizer achieved the best accuracy, 99.2%, at enhancing a CNN's ability in classification and segmentation.

Is Adam the best optimizer?

Adam is the best among the adaptive optimizers in most cases. It is also good with sparse data: the adaptive learning rate is well suited to this type of dataset.

What are the different optimizers in Tensorflow?

TensorFlow Keras optimizer classes include:

  • Adagrad: Optimizer that implements the Adagrad algorithm.
  • Adam: Optimizer that implements the Adam algorithm.
  • Adamax: Optimizer that implements the Adamax algorithm.
  • Ftrl: Optimizer that implements the FTRL algorithm.
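
All four live under tf.keras.optimizers; a minimal instantiation sketch (the learning rates shown are illustrative, not tuned values):

import tensorflow as tf

adagrad = tf.keras.optimizers.Adagrad(learning_rate=0.001)
adam = tf.keras.optimizers.Adam(learning_rate=0.001)
adamax = tf.keras.optimizers.Adamax(learning_rate=0.001)
ftrl = tf.keras.optimizers.Ftrl(learning_rate=0.001)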




1 Answer

You can find decent documentation on learning rate decay in TensorFlow:

import tensorflow as tf

...
global_step = tf.Variable(0, trainable=False)
starter_learning_rate = 0.1
# decay the rate by a factor of 0.96 every 100000 steps;
# staircase=True makes the decay a discrete step function
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,
                                           100000, 0.96, staircase=True)

# passing global_step to minimize() increments it once per training
# step, which is what advances the schedule
learning_step = (
    tf.train.GradientDescentOptimizer(learning_rate)
    .minimize(...my loss..., global_step=global_step)
)

tf.train.exponential_decay applies exponential decay to the learning rate.
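
With the numbers from the snippet above, the schedule it computes works out to (written as plain arithmetic):

# with staircase=True the exponent is floored to an integer,
# i.e. global_step // 100000
decayed_lr = 0.1 * 0.96 ** (global_step / 100000)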

Other decays (a sketch applying one of them to AdamOptimizer follows the list):

  • inverse_time_decay
  • polynomial_decay
  • linear_cosine_decay
  • exponential_decay
  • cosine_decay
  • cosine_decay_restarts
  • natural_exp_decay
  • noisy_linear_cosine_decay
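
Each of these returns a learning-rate tensor that can be handed straight to AdamOptimizer. A sketch for the question's setup (TF 1.x API; the decay numbers are illustrative and loss is assumed to be defined elsewhere in your graph):

import tensorflow as tf

global_step = tf.train.get_or_create_global_step()
learning_rate = tf.train.exponential_decay(0.001, global_step,
                                           100000, 0.96, staircase=True)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
# passing global_step here increments it every training step
train_op = optimizer.minimize(loss=loss, global_step=global_step)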

Keras implements decay in its Adam optimizer similarly to the line below, which is very close to inverse_time_decay in TensorFlow:

lr = self.lr * (1. / (1. + self.decay * self.iterations))
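
So, to reproduce the asker's Keras behavior, inverse_time_decay with decay_steps=1 gives the same schedule. A sketch (0.001 matches both Keras's default learning rate and the decay value from the question):

import tensorflow as tf

global_step = tf.train.get_or_create_global_step()
# lr / (1 + decay * iterations) -- the same schedule as the Keras line above
learning_rate = tf.train.inverse_time_decay(0.001, global_step,
                                            decay_steps=1, decay_rate=0.001)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)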
answered Nov 03 '22 by Amir