
How to initialise only optimizer variables in Tensorflow?

I want to use MomentumOptimizer in Tensorflow. However, this optimizer uses internal variables, and attempting to use it without initializing them yields an error:

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value Variable_2/Momentum

This can be easily solved by initializing all variables, using for example

tf.global_variables_initializer().run() 

However, I do not want to initialize all the variables - only those of the optimizer. Is there any way to do this?

asked Jan 08 '17 by Kao



1 Answer

Both current answers kind of work by filtering the variable names on the 'Momentum' string. But that is very brittle, in several ways:

  1. It could silently (re-)initialize some other variables you don't actually want to reset! Either simply because of a name clash, or because you have a more complex graph and optimize different parts separately, for example.
  2. It will only work for one specific optimizer, and how do you know the names to look out for with others?
  3. Bonus: an update to tensorflow might silently break your code.

Fortunately, tensorflow's abstract Optimizer class has a mechanism for this: these extra optimizer variables are called "slots", and you can get all slot names of an optimizer using the get_slot_names() method:

opt = tf.train.MomentumOptimizer(...)
print(opt.get_slot_names())  # prints ['momentum']

And you can get the variable corresponding to the slot for a specific (trainable) variable v using the get_slot(var, slot_name) method:

opt.get_slot(some_var, 'momentum') 

Putting all this together, you can create an op that initializes the optimizer's state as follows:

var_list = # list of vars to optimize, e.g.
           # tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
opt = tf.train.MomentumOptimizer(0.1, 0.95)
step_op = opt.minimize(loss, var_list=var_list)
reset_opt_op = tf.variables_initializer([
    opt.get_slot(var, name)
    for name in opt.get_slot_names()
    for var in var_list
])

This will really only reset the correct variables, and be robust across optimizers.

Except for one unfortunate caveat: AdamOptimizer. That one also keeps a counter of how often it has been called. That means you should really think hard about what you're doing here anyway, but for completeness' sake, you can get its extra state via opt._get_beta_accumulators(). The returned list should be added to the list in the reset_opt_op line above.

answered Sep 30 '22 by LucasB