I had trained a network N first and saved it with the saver into a checkpoint Checkpoint_N. There were some variable scopes defined within N.
Now I want to build a Siamese network using this trained network N, as below:
with tf.variable_scope('siameseN', reuse=False) as scope:
    networkN = N()
    embedding_1 = networkN.buildN()
    # this defines the network graph and all the variables.
    tf.train.Saver().restore(session_variable, Checkpoint_N)
    scope.reuse_variables()
    embedding_2 = networkN.buildN()
    # define the 2nd branch of the Siamese network, reusing the previously restored variables.
When I do the above, the restore statement throws a KeyError saying that siameseN/conv1 was not found in the checkpoint file, and it does the same for every variable in N's graph.
Is there a way to do this without changing the code of N? Essentially I have just added a parent scope to every variable and operation in N. Can I restore the weights to the right variables by telling TensorFlow to ignore the parent scope, or something similar?
This is related to: How to restore weights with different names but same shapes Tensorflow?
tf.train.Saver(var_list=...) can take either a list of variables to restore or a dictionary that maps names in the checkpoint to the variables they should be restored into, e.g.:
tf.train.Saver(var_list={'variable_name_in_checkpoint': var_to_be_restored_to, ...})
You can prepare that dictionary by iterating over the variables in the current graph, using each variable as the value and its name with the 'siameseN/' prefix stripped off as the key. It should theoretically work.
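A minimal sketch of building that mapping, assuming a TF1-style graph and using your scope name; the variable names in the comments (e.g. 'siameseN/conv1/weights') are only illustrative:
import tensorflow as tf

# Map each variable created under the 'siameseN' scope back to the name it had
# in the original checkpoint (i.e. the same name without the 'siameseN/' prefix).
var_list = {}
for var in tf.global_variables():
    name = var.op.name                             # e.g. 'siameseN/conv1/weights'
    if name.startswith('siameseN/'):
        var_list[name[len('siameseN/'):]] = var    # key becomes e.g. 'conv1/weights'

saver = tf.train.Saver(var_list=var_list)
saver.restore(session_variable, Checkpoint_N)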
I had to change the code a bit and write my own restore function. I decided to load the checkpoint file as a dictionary, with variable names as keys and the corresponding numpy arrays as values, as below:
from tensorflow.python import pywrap_tensorflow

checkpoint_path = '/path/to/checkpoint'
reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
var_to_shape_map = reader.get_variable_to_shape_map()

key_to_numpy = {}
for key in var_to_shape_map:
    key_to_numpy[key] = reader.get_tensor(key)
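As a quick sanity check (purely illustrative, not part of my original code), you can print the keys to confirm that the checkpoint names do not carry the added 'siameseN/' prefix:
# Checkpoint keys are the original variable names, e.g. 'conv1/weights',
# not 'siameseN/conv1/weights'; the names shown here are only examples.
for key in sorted(key_to_numpy):
    print(key, key_to_numpy[key].shape)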
I already had a single function where all variables are created, and it is called from N's graph-building code with the required name. I modified it to initialize each variable from the numpy array obtained by looking it up in the dictionary above. For the lookup to succeed, I just stripped off the parent name scope I had added, as below:
init = tf.constant(key_to_numpy[name.split('siameseN/')[1]])
var = tf.get_variable(name, initializer=init)
# var = tf.get_variable(name, shape, initializer=initializer)
return var
This is a much hackier way to do it. I didn't use the answer by @edit because I had already written the code above. Additionally, all my weights are created in one function that assigns each weight to a variable var and returns it; because this is akin to functional programming, var keeps getting overwritten and is never exposed to higher-level functions. To use @edit's answer, I would have had to use a different Python variable name for every initialization and expose those to higher-level functions somehow, so that the saver could use them as var_to_be_restored_to in that answer.
But @edit's solution is the less hacky one, as it adheres to the documented usage, so I'll accept that answer. What I did can serve as an alternative solution.