
What is `lr_policy` in Caffe?

I am trying to find out how to use Caffe. To do so, I looked at the different .prototxt files in the examples folder. There is one option I don't understand:

# The learning rate policy
lr_policy: "inv"

Possible values seem to be:

  • "fixed"
  • "inv"
  • "step"
  • "multistep"
  • "stepearly"
  • "poly"

Could somebody please explain those options?

Martin Thoma asked May 04 '15 14:05




1 Answer

It is a common practice to decrease the learning rate (lr) as the optimization/learning process progresses. However, it is not clear how exactly the learning rate should be decreased as a function of the iteration number.

If you use DIGITS as an interface to Caffe, you can see a plot of how the different choices affect the learning rate.

fixed: the learning rate is kept fixed throughout the learning process.


inv: the learning rate decays as ~1/T, where T is the iteration number. Caffe computes it as base_lr * (1 + gamma * iter) ^ (-power).


step: the learning rate is piecewise constant, dropping by a factor of gamma every stepsize iterations.


multistep: the learning rate is piecewise constant, dropping at arbitrary iterations listed as stepvalue entries.


You can see exactly how the learning rate is computed in the function SGDSolver<Dtype>::GetLearningRate (solvers/sgd_solver.cpp line ~30).
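To get a feel for these schedules without reading the C++ source, here is a minimal Python sketch of the policies discussed above. It is not Caffe code: the function name `learning_rate` and its default values are made up for illustration, and the formulas follow the ones documented in the comments of Caffe's SGD solver as I understand them, so check them against your Caffe version. The parameter names (`base_lr`, `gamma`, `power`, `stepsize`, `stepvalue`, `max_iter`) mirror the solver .prototxt fields.

```python
def learning_rate(policy, it, base_lr, gamma=0.1, power=1.0,
                  stepsize=1000, stepvalues=(), max_iter=10000):
    """Return the learning rate at iteration `it` for the given policy.

    Illustrative re-implementation only; names and defaults are assumptions.
    """
    if policy == "fixed":
        # Constant learning rate for the whole run.
        return base_lr
    if policy == "inv":
        # Smooth decay: base_lr * (1 + gamma * iter) ^ (-power).
        return base_lr * (1.0 + gamma * it) ** (-power)
    if policy == "step":
        # Multiply by gamma once every `stepsize` iterations.
        return base_lr * gamma ** (it // stepsize)
    if policy == "multistep":
        # Multiply by gamma each time a listed stepvalue has been reached.
        drops = sum(1 for s in stepvalues if it >= s)
        return base_lr * gamma ** drops
    if policy == "poly":
        # Polynomial decay that reaches zero at max_iter.
        return base_lr * (1.0 - it / max_iter) ** power
    raise ValueError("unsupported lr_policy: %r" % (policy,))


if __name__ == "__main__":
    # Print a few points of each schedule to compare their shapes.
    for policy in ("fixed", "inv", "step", "multistep", "poly"):
        rates = [learning_rate(policy, it, base_lr=0.01,
                               stepvalues=(1000, 5000))
                 for it in (0, 2500, 5000, 7500)]
        print(policy, rates)
```

Plotting the returned values over the iteration number gives roughly the kind of curves DIGITS displays for each choice.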


Recently, I came across an interesting and unconventional approach to learning-rate tuning: Leslie N. Smith's report "No More Pesky Learning Rate Guessing Games". In it, Smith suggests using an lr_policy that alternates between decreasing and increasing the learning rate. The report also describes how to implement this policy in Caffe.
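As a rough illustration of such an alternating policy, the sketch below implements a triangular schedule in Python: the rate climbs linearly from a lower bound to an upper bound over `stepsize` iterations, then falls back over the next `stepsize`, and repeats. The function name `triangular_lr` and the parameter `max_lr` are my own choices for this example, and this is not the Caffe implementation from the report.

```python
import math


def triangular_lr(it, base_lr, max_lr, stepsize):
    """Triangular cyclical learning rate (illustrative sketch).

    One full cycle lasts 2 * stepsize iterations: a linear rise from
    base_lr to max_lr followed by a linear fall back to base_lr.
    """
    # Index of the current cycle, starting at 1.
    cycle = math.floor(1 + it / (2 * stepsize))
    # Distance from the peak of the current cycle, scaled to [0, 1].
    x = abs(it / stepsize - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


if __name__ == "__main__":
    # Sample the schedule at quarter-cycle points.
    for it in (0, 1000, 2000, 3000, 4000):
        print(it, triangular_lr(it, base_lr=0.001, max_lr=0.006, stepsize=2000))
```

With these example values the rate starts at the lower bound, peaks at iteration 2000, and returns to the lower bound at iteration 4000.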

Shai answered Nov 10 '22 00:11