In Caffe, the SGD solver has a momentum parameter (link). In TensorFlow, I see that tf.train.GradientDescentOptimizer does not have an explicit momentum parameter. However, there is a tf.train.MomentumOptimizer optimizer. Is it equivalent to Caffe's SGD solver with momentum?
Momentum [1], or SGD with momentum, is a method that helps accelerate gradient vectors in the right directions, thus leading to faster convergence. It is one of the most popular optimization algorithms, and many state-of-the-art models are trained using it.
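For intuition, here is a minimal NumPy sketch of the classical momentum update; the learning rate eta, momentum coefficient gamma, and the toy quadratic loss are illustrative choices, not from the original post:

```python
import numpy as np

def sgd_momentum_step(theta, velocity, grad, eta=0.01, gamma=0.9):
    """One classical-momentum update: accumulate a velocity, then step."""
    velocity = gamma * velocity + eta * grad   # exponentially weighted gradient history
    theta = theta - velocity                   # move against the accumulated direction
    return theta, velocity

# Toy usage: minimize f(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
theta = np.array([5.0, -3.0])
velocity = np.zeros_like(theta)
for _ in range(100):
    theta, velocity = sgd_momentum_step(theta, velocity, grad=theta)
print(theta)  # moves toward [0., 0.]
```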
Adam uses Momentum and Adaptive Learning Rates to converge faster.
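For reference, Adam is available directly in TensorFlow 1.x; the values below are just the documented defaults, shown for illustration:

```python
import tensorflow as tf

# beta1 plays the role of the momentum coefficient; beta2 controls the
# adaptive (per-parameter) scaling of the learning rate.
optimizer = tf.train.AdamOptimizer(learning_rate=0.001, beta1=0.9, beta2=0.999)
```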
Nesterov Accelerated Gradient is a momentum-based SGD optimizer that "looks ahead" to where the parameters will be in order to compute the gradient ex post rather than ex ante:

v_t = γ·v_{t−1} + η·∇_θ J(θ − γ·v_{t−1}),   θ_t = θ_{t−1} − v_t
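In TensorFlow 1.x, Nesterov momentum is exposed through the same optimizer via the use_nesterov flag; the hyperparameter values here are illustrative:

```python
import tensorflow as tf

# Same optimizer as classical momentum, but with the Nesterov "look-ahead" correction.
optimizer = tf.train.MomentumOptimizer(learning_rate=0.01,
                                       momentum=0.9,
                                       use_nesterov=True)
```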
Yes, it is: tf.train.MomentumOptimizer is SGD with momentum.
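A minimal sketch of the equivalence, assuming Caffe-style solver values (base_lr: 0.01, momentum: 0.9) and a toy loss standing in for a real model; these particular numbers and variable names are illustrative, not from the original post:

```python
import tensorflow as tf

# Caffe solver.prototxt values used for illustration:
#   base_lr: 0.01   ->  learning_rate=0.01
#   momentum: 0.9   ->  momentum=0.9
w = tf.Variable([1.0, 2.0])
loss = tf.reduce_sum(tf.square(w))  # toy quadratic loss

optimizer = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)
train_op = optimizer.minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(10):
        sess.run(train_op)
```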