Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to set adaptive learning rate for GradientDescentOptimizer?

I am using TensorFlow to train a neural network. This is how I am initializing the GradientDescentOptimizer:

init = tf.initialize_all_variables() sess = tf.Session() sess.run(init)  mse        = tf.reduce_mean(tf.square(out - out_)) train_step = tf.train.GradientDescentOptimizer(0.3).minimize(mse) 

The thing here is that I don't know how to set an update rule for the learning rate or a decay value for that.

How can I use an adaptive learning rate here?

like image 500
Stefan Falk Avatar asked Nov 25 '15 15:11

Stefan Falk


People also ask

How do I make my learning rate adaptive?

Adaptive Learning Rates This is called an adaptive learning rate. Perhaps the simplest implementation is to make the learning rate smaller once the performance of the model plateaus, such as by decreasing the learning rate by a factor of two or an order of magnitude.

What is adaptive learning rate?

The challenge of using learning rate schedules is that their hyperparameters have to be defined in advance and they depend heavily on the type of model and problem. Another problem is that the same learning rate is applied to all parameter updates.


1 Answers

First of all, tf.train.GradientDescentOptimizer is designed to use a constant learning rate for all variables in all steps. TensorFlow also provides out-of-the-box adaptive optimizers including the tf.train.AdagradOptimizer and the tf.train.AdamOptimizer, and these can be used as drop-in replacements.

However, if you want to control the learning rate with otherwise-vanilla gradient descent, you can take advantage of the fact that the learning_rate argument to the tf.train.GradientDescentOptimizer constructor can be a Tensor object. This allows you to compute a different value for the learning rate in each step, for example:

learning_rate = tf.placeholder(tf.float32, shape=[]) # ... train_step = tf.train.GradientDescentOptimizer(     learning_rate=learning_rate).minimize(mse)  sess = tf.Session()  # Feed different values for learning rate to each training step. sess.run(train_step, feed_dict={learning_rate: 0.1}) sess.run(train_step, feed_dict={learning_rate: 0.1}) sess.run(train_step, feed_dict={learning_rate: 0.01}) sess.run(train_step, feed_dict={learning_rate: 0.01}) 

Alternatively, you could create a scalar tf.Variable that holds the learning rate, and assign it each time you want to change the learning rate.

like image 68
mrry Avatar answered Nov 09 '22 23:11

mrry