
@tf.function ValueError: Creating variables on a non-first call to a function decorated with tf.function, unable to understand behaviour

I would like to know why this function:

@tf.function
def train(self,TargetNet,epsilon):
    if len(self.experience['s']) < self.min_experiences:
        return 0
    ids=np.random.randint(low=0,high=len(self.experience['s']),size=self.batch_size)
    states=np.asarray([self.experience['s'][i] for i in ids])
    actions=np.asarray([self.experience['a'][i] for i in ids])
    rewards=np.asarray([self.experience['r'][i] for i in ids])
    next_states=np.asarray([self.experience['s1'][i] for i in ids])
    dones = np.asarray([self.experience['done'][i] for i in ids])
    q_next_actions=self.get_action(next_states,epsilon)
    q_value_next=TargetNet.predict(next_states)
    q_value_next=tf.gather_nd(q_value_next,tf.stack((tf.range(self.batch_size),q_next_actions),axis=1))
    targets=tf.where(dones, rewards, rewards+self.gamma*q_value_next)

    with tf.GradientTape() as tape:
        estimates=tf.math.reduce_sum(self.predict(states)*tf.one_hot(actions,self.num_actions),axis=1)
        loss=tf.math.reduce_sum(tf.square(estimates - targets))
    variables=self.model.trainable_variables
    gradients=tape.gradient(loss,variables)
    self.optimizer.apply_gradients(zip(gradients,variables))

raises ValueError: Creating variables on a non-first call to a function decorated with tf.function, whereas this very similar code:

@tf.function
def train(self, TargetNet):
    if len(self.experience['s']) < self.min_experiences:
        return 0
    ids = np.random.randint(low=0, high=len(self.experience['s']), size=self.batch_size)
    states = np.asarray([self.experience['s'][i] for i in ids])
    actions = np.asarray([self.experience['a'][i] for i in ids])
    rewards = np.asarray([self.experience['r'][i] for i in ids])
    states_next = np.asarray([self.experience['s2'][i] for i in ids])
    dones = np.asarray([self.experience['done'][i] for i in ids])
    value_next = np.max(TargetNet.predict(states_next), axis=1)
    actual_values = np.where(dones, rewards, rewards+self.gamma*value_next)

    with tf.GradientTape() as tape:
        selected_action_values = tf.math.reduce_sum(
            self.predict(states) * tf.one_hot(actions, self.num_actions), axis=1)
        loss = tf.math.reduce_sum(tf.square(actual_values - selected_action_values))
    variables = self.model.trainable_variables
    gradients = tape.gradient(loss, variables)
    self.optimizer.apply_gradients(zip(gradients, variables))

does not throw an error. Please help me understand why.

EDIT: I removed the parameter epsilon from the function and it works. Is it because the @tf.function decorator is only valid for single-argument functions?

asked Jan 19 '20 by drongo

1 Answer

With tf.function you are converting the body of the decorated function: TensorFlow tries to compile your eager code into its graph representation.

Variables, however, are special objects. In TensorFlow 1.x (graph mode), you defined each variable only once and then used and updated it.

In TensorFlow 2.0, under pure eager execution, you can declare and re-create the same variable more than once, because a tf.Variable in eager mode is just a plain Python object: it is destroyed as soon as the function ends and the variable goes out of scope.
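This difference is easy to reproduce in isolation. A minimal sketch (assuming TensorFlow 2.x is installed): the same function that freely re-creates a tf.Variable in eager mode is rejected once wrapped in tf.function, because every trace would create a brand-new variable:

```python
import tensorflow as tf

def make_var():
    # Fine in eager mode: a fresh tf.Variable is built and then
    # garbage-collected on every call.
    v = tf.Variable(1.0)
    return v + 1.0

make_var()
make_var()  # eager execution: no complaints

traced = tf.function(make_var)
raised = False
try:
    traced()
    traced()
except ValueError:
    # tf.function refuses unconditional variable creation inside
    # the traced function and raises ValueError.
    raised = True
print(raised)
```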

To let TensorFlow correctly convert a function that creates state (that is, one that uses variables), you have to break the function scope and declare the variables outside the function.

In short, if you have a function that works correctly in eager mode, like:

def f():
    a = tf.constant([[10,10],[11.,1.]])
    x = tf.constant([[1.,0.],[0.,1.]])
    b = tf.Variable(12.)
    y = tf.matmul(a, x) + b
    return y

You have to change its structure to something like:

b = None

@tf.function
def f():
    a = tf.constant([[10, 10], [11., 1.]])
    x = tf.constant([[1., 0.], [0., 1.]])
    global b
    if b is None:
        b = tf.Variable(12.)
    y = tf.matmul(a, x) + b
    print("PRINT: ", y)
    tf.print("TF-PRINT: ", y)
    return y

f()

in order to make it work correctly with the tf.function decorator.
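As a quick sanity check (a sketch, again assuming TensorFlow 2.x), the guarded version can be called repeatedly: the variable is created only during the first trace and reused on every later call, so no ValueError is raised:

```python
import tensorflow as tf

b = None

@tf.function
def f():
    a = tf.constant([[10., 10.], [11., 1.]])
    x = tf.constant([[1., 0.], [0., 1.]])
    global b
    if b is None:
        # Created only once, on the first trace; later calls reuse it.
        b = tf.Variable(12.)
    return tf.matmul(a, x) + b

y1 = f()
y2 = f()  # no ValueError: the guard prevents a second creation
```

Since x is the identity matrix, both calls return a + 12, i.e. [[22, 22], [23, 13]].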

I covered this (and other) scenarios in several blog posts: the first part analyzes this behavior in the section "Handling states breaking the function scope" (though I suggest reading it from the beginning, along with parts 2 and 3).

answered Oct 26 '22 by nessuno