I had trained a network N first and saved it with the saver into a checkpoint Checkpoint_N. There were some variable scopes defined within N.
Now I want to build a Siamese network using this trained network N, as below:
with tf.variable_scope('siameseN', reuse=False) as scope:
    networkN = N()
    embedding_1 = networkN.buildN()
    # this defines the network graph and all the variables.
    tf.train.Saver().restore(session_variable, Checkpoint_N)
    scope.reuse_variables()
    embedding_2 = networkN.buildN()
    # define the 2nd branch of the Siamese network, reusing the previously restored variables.
When I do the above, the restore statement throws a KeyError saying that siameseN/conv1 was not found in the checkpoint file, and it does the same for every variable in N's graph.
Is there a way to do this without changing the code of N? Essentially I have just added a parent scope to every variable and operation in N. Can I restore the weights to the right variables by telling TensorFlow to ignore the parent scope, or something similar?
This is related to: How to restore weights with different names but same shapes Tensorflow?
tf.train.Saver(var_list=...) can take either a list of variables to restore or a dictionary that maps names in the checkpoint to the variables they should be restored into, e.g.:
tf.train.Saver(var_list={'variable_name_in_checkpoint': var_to_be_restored_to, ...})
You can prepare that dictionary by iterating over the variables in the current graph, using each variable as the value and its name with the 'siameseN/' prefix stripped off as the key. It should theoretically work.
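A minimal sketch of building that mapping, assuming a TF1-style graph and using your scope name; the variable names in the comments (e.g. 'siameseN/conv1/weights') are only illustrative:
import tensorflow as tf

# Map each variable created under the 'siameseN' scope back to the name it had
# in the original checkpoint (i.e. the same name without the 'siameseN/' prefix).
var_list = {}
for var in tf.global_variables():
    name = var.op.name                             # e.g. 'siameseN/conv1/weights'
    if name.startswith('siameseN/'):
        var_list[name[len('siameseN/'):]] = var    # key becomes e.g. 'conv1/weights'

saver = tf.train.Saver(var_list=var_list)
saver.restore(session_variable, Checkpoint_N)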
I had to change the code a bit and write my own restore function. I decided to load the checkpoint file as a dictionary, with variable names as keys and the corresponding numpy arrays as values, as below:
from tensorflow.python import pywrap_tensorflow

checkpoint_path = '/path/to/checkpoint'
reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
var_to_shape_map = reader.get_variable_to_shape_map()

key_to_numpy = {}
for key in var_to_shape_map:
    key_to_numpy[key] = reader.get_tensor(key)
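As a quick sanity check (purely illustrative, not part of my original code), you can print the keys to confirm that the checkpoint names do not carry the added 'siameseN/' prefix:
# Checkpoint keys are the original variable names, e.g. 'conv1/weights',
# not 'siameseN/conv1/weights'; the names shown here are only examples.
for key in sorted(key_to_numpy):
    print(key, key_to_numpy[key].shape)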
I already had a single function where all variables are created, and it is called from N's graph-building code with the required name. I modified it to initialize each variable from the numpy array obtained by looking it up in the dictionary above. For the lookup to succeed, I just stripped off the parent name scope I had added, as below:
init = tf.constant(key_to_numpy[name.split('siameseN/')[1]])
var = tf.get_variable(name, initializer=init)
# var = tf.get_variable(name, shape, initializer=initializer)
return var
This is a much hackier way to do it. I didn't use the answer by @edit because I had already written the code above. Additionally, all my weights are created in one function that assigns each weight to a variable var and returns it; because this is akin to functional programming, var keeps getting overwritten and is never exposed to higher-level functions. To use @edit's answer, I would have had to use a different Python variable name for every initialization and expose those to higher-level functions somehow, so that the saver could use them as var_to_be_restored_to in that answer.
But @edit's solution is the less hacky one, as it adheres to the documented usage, so I'll accept that answer. What I did can serve as an alternative solution.