Tensorflow restore while ignoring scope name or into new scope name

I trained a network N first and saved it with the saver into a checkpoint Checkpoint_N. There are some variable scopes defined within N.

Now, I want to build a siamese network using this trained network N as below:

with tf.variable_scope('siameseN', reuse=False) as scope:
  networkN = N()
  embedding_1 = networkN.buildN()
  # this defines the network graph and all the variables
  tf.train.Saver().restore(session_variable, Checkpoint_N)
  scope.reuse_variables()
  embedding_2 = networkN.buildN()
  # define the 2nd branch of the Siamese network by reusing the previously restored variables

When I do the above, the restore statement throws a KeyError saying that siameseN/conv1 was not found in the checkpoint file, and it does so for every variable in N's graph.

Is there a way to do this without changing the code of N? Essentially I have just added a parent scope to every variable and operation in N. Can I restore the weights into the right variables by telling TensorFlow to ignore the parent scope, or something along those lines?

asked Feb 25 '18 by sanjeev mk


2 Answers

This is related to: How to restore weights with different names but same shapes Tensorflow?

tf.train.Saver(var_list={'variable_name_in_checkpoint': var_to_be_restored_to, ...})

The var_list argument can take either a list of variables to restore or a dictionary mapping each variable's name in the checkpoint to the variable it should be restored into (e.g. {'variable_name_in_checkpoint': var_to_be_restored_to, ...}).

You can prepare this dictionary by iterating over all variables in the current session, using each variable as the value, taking its name, stripping 'siameseN/' from that name, and using the result as the key. It should theoretically work.
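
A minimal sketch of how that dictionary could be built, assuming TensorFlow 1.x and that every variable of N now lives under the 'siameseN' scope (the helper name build_restore_map is illustrative, not part of the answer):

import tensorflow as tf

def build_restore_map(scope_prefix='siameseN/'):
  restore_map = {}
  for var in tf.global_variables():
    name = var.op.name  # e.g. 'siameseN/conv1/weights'
    if name.startswith(scope_prefix):
      # key: the name as stored in the checkpoint, without the new parent scope
      restore_map[name[len(scope_prefix):]] = var
  return restore_map

saver = tf.train.Saver(var_list=build_restore_map())
saver.restore(session_variable, Checkpoint_N)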

answered Oct 15 '22 by Dinesh

I ended up changing the code a bit and writing my own restore function. I load the checkpoint file into a dictionary with variable names as keys and the corresponding numpy arrays as values, as below:

from tensorflow.python import pywrap_tensorflow

checkpoint_path = '/path/to/checkpoint'

# read every variable stored in the checkpoint into a name -> numpy array map
reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
var_to_shape_map = reader.get_variable_to_shape_map()

key_to_numpy = {}
for key in var_to_shape_map:
  key_to_numpy[key] = reader.get_tensor(key)

I already had a single function where all variables are created; it is called from the graph code of N with the required variable name. I modified it to initialize each variable with the numpy array obtained from the dictionary lookup. For the lookup to succeed, I just strip off the parent name scope I added, as below:

# strip the added parent scope so the name matches the checkpoint key
init = tf.constant(key_to_numpy[name.split('siameseN/')[1]])
var = tf.get_variable(name, initializer=init)
# var = tf.get_variable(name, shape, initializer=initializer)  # the old version
return var
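
For context, here is a minimal sketch of what such a variable-creation helper might look like as a whole; the function name get_checkpoint_variable and its signature are assumptions for illustration, not the actual code of N:

def get_checkpoint_variable(name, shape=None, initializer=None):
  # 'name' arrives with the added parent scope, e.g. 'siameseN/conv1/weights';
  # strip it so the lookup matches the keys stored in the checkpoint
  checkpoint_key = name.split('siameseN/')[1]
  init = tf.constant(key_to_numpy[checkpoint_key])
  # shape is not passed: a constant-tensor initializer already fixes the shape
  var = tf.get_variable(name, initializer=init)
  return var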

This is a much hackier way to do it. I didn't use Dinesh's answer because I had already written the code above. Additionally, all my weights are created in one function that assigns each weight to a variable var and returns it. Because this is akin to functional programming, var keeps getting overwritten and is never exposed to higher-level functions. To use Dinesh's answer, I'd have to use a different tensor variable name for every initialization and expose those names to higher-level functions somehow, so that the saver could use them as var_to_be_restored_to.

But Dinesh's solution is the less hacky one, as it adheres to the documented usage, so I'll accept that answer. What I did can serve as an alternative solution.

answered Oct 15 '22 by sanjeev mk