
How to initialise only optimizer variables in Tensorflow?

I want to use MomentumOptimizer in Tensorflow. However, this optimizer uses internal variables, and attempting to use it without initializing them yields an error:

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value Variable_2/Momentum

This can be easily solved by initializing all variables, using for example

tf.global_variables_initializer().run() 

However, I do not want to initialize all the variables - only those of the optimizer. Is there any way to do this?

asked Jan 08 '17 by Kao



1 Answer

Both current answers kind of work by filtering the variable names on the 'Momentum' string. But that is very brittle, in several ways:

  1. It could silently (re-)initialize some other variables you don't actually want to reset! Either simply because of a name clash, or because you have a more complex graph and optimize different parts separately, for example.
  2. It will only work for one specific optimizer, and how do you know the names to look out for with others?
  3. Bonus: an update to tensorflow might silently break your code.

Fortunately, tensorflow's abstract Optimizer class has a mechanism for this: these extra optimizer variables are called "slots", and you can get all slot names of an optimizer using the get_slot_names() method:

opt = tf.train.MomentumOptimizer(...)
print(opt.get_slot_names())  # prints ['momentum']

And you can get the variable corresponding to the slot for a specific (trainable) variable v using the get_slot(var, slot_name) method:

opt.get_slot(some_var, 'momentum') 

Putting all this together, you can create an op that initializes the optimizer's state as follows:

var_list = # list of vars to optimize, e.g.
           # tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
opt = tf.train.MomentumOptimizer(0.1, 0.95)
step_op = opt.minimize(loss, var_list=var_list)
reset_opt_op = tf.variables_initializer([
    opt.get_slot(var, name)
    for name in opt.get_slot_names()
    for var in var_list
])

This will really only reset the correct variables, and be robust across optimizers.

Except for one unfortunate caveat: AdamOptimizer. That one also keeps a counter of how often it has been called. That means you should really think hard about what you're doing here anyway, but for completeness' sake, you can get its extra state via opt._get_beta_accumulators(). The returned list should be added to the list in the reset_opt_op line above.

answered Sep 30 '22 by LucasB