
How does one set different learning rates for different layers or variables in TensorFlow?

I know that one can simply do it for all of them using something as in the tutorials:

opt = tf.train.GradientDescentOptimizer(learning_rate)

However, it would be nice if one could pass a dictionary mapping each variable name to its corresponding learning rate. Is that possible?

I know that one could simply use compute_gradients() followed by apply_gradients() and do it manually, but that seems clumsy. Is there a smarter way to assign specific learning rates to specific variables?

Is the only way to do this to create a separate optimizer for each group of variables, as in:

# Create an optimizer with the desired parameters.
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
# Add Ops to the graph to minimize a cost by updating a list of variables.
# "cost" is a Tensor, and the list of variables contains tf.Variable
# objects.
opt_op = opt.minimize(cost, var_list=<list of variables>)

and simply give each optimizer its own learning rate? But that would mean keeping a list of optimizers, and hence applying the update rule with sess.run to each optimizer separately. Right?

Charlie Parker asked Feb 04 '26


1 Answer

As far as I can tell this is not possible, mostly because it would no longer be valid gradient descent. There are plenty of optimizers that learn their own per-variable scaling factors (such as Adam or AdaGrad). Specifying a constant per-variable learning rate would mean you no longer follow the gradient, and while that makes sense for mathematically well-formulated methods, simply setting the rates to pre-defined values is just a heuristic, which I believe is the reason it is not implemented in core TF.

As you said, you can always do it on your own: define your own optimizer and iterate over the variables between computing the gradients and applying them. That is around 3-4 lines of code (one to compute the gradients, one to iterate and add multiplication ops, and one to apply them back), and as far as I know this is the simplest way to achieve your goal.

lejlot answered Feb 05 '26


