 

How to reuse tensorflow variables in eager execution mode?

When calling the get_variable() function in TensorFlow, the API documentation defines the behavior of the "reuse" flag to be AUTO_REUSE under eager execution:

reuse: True, None, or tf.AUTO_REUSE; ... When eager execution is enabled, this argument is always forced to be tf.AUTO_REUSE.

However, when I actually run the demo code suggested on that page:

import tensorflow as tf

tf.enable_eager_execution()
def foo():
  with tf.variable_scope("foo", reuse=tf.AUTO_REUSE):
    v = tf.get_variable("v", [1])
  return v
v1 = foo()  # Creates v.
v2 = foo()  # Gets the same, existing v.
assert v1 == v2

It fails. (It passes if the tf.enable_eager_execution() line is removed, as expected.)

So how do I reuse a variable in eager mode? Is this a bug, or am I missing something?

Asked by David M on Jun 14 '18.




2 Answers

In eager mode, things are simpler... except for people whose brains (like mine) have been damaged by working with graph models for too long.

Eager works in a standard fashion, where variables last only while they are referenced. If you stop referencing them, they are gone.

To share variables, you do the same thing you would naturally do if you were using numpy (or really anything else) for the computation: you store the variables in an object, and you reuse that object.

This is why eager execution has such an affinity with the Keras API: Keras deals mostly with objects.

So look at your function again in numpy terms (a useful exercise for those of us recovering from graphs): would you expect two calls to foo to return the same array object? Of course not.
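
Concretely, here is a minimal sketch of that idea in eager mode (the class name VariableHolder is just illustrative, not from the original question):

import tensorflow as tf
tf.enable_eager_execution()

class VariableHolder(object):
    def __init__(self):
        # The variable is created once and lives as long as this object is referenced.
        self.v = tf.get_variable("v", [1])

    def foo(self):
        # Every call returns the very same variable, i.e. it is "reused".
        return self.v

holder = VariableHolder()
v1 = holder.foo()
v2 = holder.foo()
assert v1 is v2  # same object, shared by construction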

Answered by P-Gn on Oct 22 '22.


I found it easiest to reuse variables in Eager Execution by simply passing a reference to the same variable around:

import tensorflow as tf
tf.enable_eager_execution()
import numpy as np

class MyLayer(tf.keras.layers.Layer):
    def __init__(self):
        super(MyLayer, self).__init__()

    def build(self, input_shape):
        # bias specific for each layer
        self.B = self.add_variable('B', [1])

    def call(self, input, A):
        # some function involving input, common weights, and layer-specific bias
        return tf.matmul(input, A) + self.B

class MyModel(tf.keras.Model):    
    def __init__(self):
        super(MyModel, self).__init__()

    def build(self, input_shape):
        # common vector of weights
        self.A = self.add_variable('A', [int(input_shape[-1]), 1])

        # layers which will share A
        self.layer1 = MyLayer()
        self.layer2 = MyLayer()

    def call(self, input):
        result1 = self.layer1(input, self.A)
        result2 = self.layer2(input, self.A)
        return result1 + result2

if __name__ == "__main__":
    data = np.random.normal(size=(1000, 3))
    model = MyModel()
    predictions = model(data)
    print('\n\n')
    model.summary()
    print('\n\n')
    print([v.name for v in model.trainable_variables])

The output is:

[screenshot of the model.summary() output]

Thus, we have a shared weight parameter my_model/A of dimension 3 and two bias parameters my_model/my_layer/B and my_model/my_layer_1/B of dimension 1 each, for a total of 5 trainable parameters. The code runs on its own so feel free to play around with it.
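
As a quick sanity check (a small addition of mine, reusing the model built above), you can confirm programmatically that A is a single shared variable while each layer keeps its own bias:

# Exactly one shared 'A' variable, and one 'B' per layer.
a_vars = [v for v in model.trainable_variables if v.name.endswith('A:0')]
b_vars = [v for v in model.trainable_variables if '/B' in v.name]
assert len(a_vars) == 1
assert len(b_vars) == 2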

Answered by Vivek Subramanian on Oct 22 '22.