
Is there a momentum option for Adam optimizer in Keras? [closed]

The question says it all. Since Adam performs well on most of my datasets, I want to try tuning the momentum for the Adam optimizer. So far I have only found a momentum option for SGD in Keras.
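For reference, this is the only momentum argument I can see (a minimal sketch; the values are just placeholders):

    from tensorflow import keras

    # SGD exposes an explicit momentum argument...
    sgd = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)

    # ...but Adam only exposes its beta_1/beta_2 decay rates.
    adam = keras.optimizers.Adam(learning_rate=0.001)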

asked Nov 07 '17 by Tuan Do

1 Answer

Short answer: no, neither in Keras nor in TensorFlow [EDIT: see UPDATE at the end].

Long answer: as mentioned in the comments, Adam already incorporates something like momentum. Here is some relevant corroboration:

From the highly recommended An overview of gradient descent optimization algorithms (available also as a paper):

In addition to storing an exponentially decaying average of past squared gradients v[t] like Adadelta and RMSprop, Adam also keeps an exponentially decaying average of past gradients m[t], similar to momentum.
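For concreteness, these are the standard Adam update rules from the Kingma & Ba paper (g_t is the gradient, eta the learning rate; a plain restatement, nothing Keras-specific):

    \[
    \begin{aligned}
    m_t &= \beta_1\, m_{t-1} + (1-\beta_1)\, g_t        && \text{momentum-like first moment}\\
    v_t &= \beta_2\, v_{t-1} + (1-\beta_2)\, g_t^2      && \text{RMSprop-like second moment}\\
    \hat{m}_t &= \frac{m_t}{1-\beta_1^t}, \qquad
    \hat{v}_t = \frac{v_t}{1-\beta_2^t}                 && \text{bias correction}\\
    \theta_{t+1} &= \theta_t - \frac{\eta\,\hat{m}_t}{\sqrt{\hat{v}_t}+\epsilon} && \text{parameter update}
    \end{aligned}
    \]

The first line is exactly an exponential moving average of the gradient, i.e. the momentum-style term, and it is controlled by beta_1.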

From Stanford CS231n: CNNs for Visual Recognition:

Adam is a recently proposed update that looks a bit like RMSProp with momentum

Notice that some frameworks do include a momentum parameter for Adam, but it is really just the beta1 parameter; here is CNTK:

momentum (float, list, output of momentum_schedule()) – momentum schedule. Note that this is the beta1 parameter in the Adam paper. For additional information, please refer to this CNTK Wiki article.
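The same holds for Keras: the momentum-like behaviour is controlled through the beta_1 argument of the Adam constructor, so that is the knob to sweep if you want to "tune momentum" for Adam. A minimal sketch (the learning rate and the candidate beta_1 values are arbitrary, purely for illustration):

    from tensorflow import keras

    # beta_1 controls the decay of the first-moment (momentum-like) average;
    # the Keras default is 0.9.
    for beta_1 in (0.8, 0.9, 0.99):  # illustrative sweep values
        optimizer = keras.optimizers.Adam(learning_rate=0.001, beta_1=beta_1)
        # model.compile(optimizer=optimizer, loss='categorical_crossentropy')
        # model.fit(x_train, y_train, ...)  # train and compare runs as usual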

That said, there is an ICLR 2016 paper titled Incorporating Nesterov Momentum into Adam, along with a TensorFlow implementation skeleton by the author; I cannot offer any opinion on it, though.

UPDATE: Keras does now include an optimizer called Nadam, based on the ICLR 2016 paper mentioned above; from the docs:

Much like Adam is essentially RMSprop with momentum, Nadam is Adam with Nesterov momentum.

It is also included in TensorFlow as a contributed module, NadamOptimizer.
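So, if what you are after is Nesterov-style momentum on top of Adam, you can simply swap in Nadam; a minimal sketch (the hyperparameter values shown are only illustrative):

    from tensorflow import keras

    # Nadam = Adam with Nesterov momentum; beta_1 again plays the momentum role.
    optimizer = keras.optimizers.Nadam(learning_rate=0.002, beta_1=0.9, beta_2=0.999)
    # model.compile(optimizer=optimizer, loss='categorical_crossentropy')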

answered Sep 18 '22 by desertnaut