
Understanding Tensorflow control dependencies

I am trying to gain a stronger grasp of TensorFlow. I came across the concept of control dependencies. I understand that the order of ops as specified by us is not really relevant to TensorFlow during execution: to optimise execution speed, TensorFlow decides its own order of calculating nodes. But we can customise the order of execution by using tf.control_dependencies. I am not able to understand the use cases of this function. Can anyone direct me to some resource (other than the documentation), or explain how this function works? An example:

tf.reset_default_graph()
x = tf.Variable(5)
y = tf.Variable(3)
assign = tf.assign(x, x + y)
z = x + assign
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    with tf.control_dependencies([assign]):
        z_out = sess.run(z)

print(z_out)

The output of the code is 8, so I infer that, since z = x + y, the assign node has not been evaluated (right?). But doesn't this mean that the result of TensorFlow may be erroneous? That would mean we need to create new nodes during every operation to force TensorFlow to calculate all the nodes leading up to the result. But in, say, training a neural network with 10,000 steps, if each step creates a new set of 1,000 weights/parameters, won't the space complexity explode?

pranav asked Mar 11 '19
1 Answer

In the snippet you have posted, tf.control_dependencies has no effect. The function creates a context in which new operations are created with a control dependency on the given operations, but in your code there are no new operations created within the context, only the evaluation of previously existing ones. For the dependency to be attached, the ops have to be created inside the with block.
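To make that rule concrete without needing a TF 1.x runtime, here is a plain-Python stand-in for a dataflow graph (illustrative only; the Op class and run helper are not TensorFlow API): control inputs are recorded on an op at creation time, and the executor honors them when the op is evaluated.

```python
# Plain-Python stand-in for a dataflow graph (illustrative, not TF API).
class Op:
    def __init__(self, fn, inputs=(), control_inputs=()):
        self.fn = fn
        self.inputs = inputs                  # data dependencies
        self.control_inputs = control_inputs  # must run first; value unused
        self.value = None
        self.done = False

def run(op):
    """Evaluate an op, running its control inputs first (mimics sess.run)."""
    if op.done:
        return op.value
    for dep in op.control_inputs:
        run(dep)
    op.value = op.fn(*[run(i) for i in op.inputs])
    op.done = True
    return op.value

state = {"x": 5, "y": 3}

def do_assign():
    state["x"] = state["x"] + state["y"]
    return state["x"]

assign = Op(do_assign)

# Like the snippet in the question: this op was created WITHOUT the
# control input, so evaluating it does not trigger the assignment.
z_before = Op(lambda: state["x"])
print(run(z_before))  # 5: assign never ran

# Created "inside the context": the control input is attached, so the
# assignment is forced to run before the value is read.
z_inside = Op(lambda: state["x"], control_inputs=(assign,))
print(run(z_inside))  # 8: assign ran first
```

Wrapping sess.run in the context, as in the question, is the equivalent of z_before here: the op already exists, so no control edge is added.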

In most cases, control flow in TensorFlow is "obvious", in the sense that there is only one order in which the computation can be carried out correctly. However, when stateful objects (i.e. variables) are involved, situations can arise that are ambiguous. Consider the following example:

import tensorflow as tf

v1 = tf.Variable(0)
v2 = tf.Variable(0)
upd1 = tf.assign(v1, v2 + 1)
upd2 = tf.assign(v2, v1 + 1)
init = tf.global_variables_initializer()

v1 and v2 are both variables initialized to 0 and then updated. However, each uses the value of the other variable in its update. In a regular Python program things would run sequentially, so upd1 would run first (making v1 equal to 1) and upd2 after (making v2 equal to 2, because v1 was already 1). But TensorFlow does not record the order in which operations are created, only their dependencies. So it may also happen that upd2 runs before upd1 (so v1 would be 2 and v2 would be 1), or that both update values (v2 + 1 and v1 + 1) are computed before either assignment (so both v1 and v2 would end up as 1). Indeed, if I run it several times:

for i in range(10):
    with tf.Session() as sess:
        sess.run(init)
        sess.run([upd1, upd2])
        print(*sess.run([v1, v2]))

I do not always get the same result (personally I get 1 1 and 2 1, although technically 1 2 would also be possible). If, for example, you wanted to compute the new value for v2 after v1 has been updated, you could just do the following:

import tensorflow as tf

v1 = tf.Variable(0)
v2 = tf.Variable(0)
upd1 = tf.assign(v1, v2 + 1)
upd2 = tf.assign(v2, upd1 + 1)
init = tf.global_variables_initializer()

Here the new value of v2 is computed using upd1, which is guaranteed to be the value of the variable after the update. So upd2 has an implicit data dependency on the assignment, and things work as expected.
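The implicit dependency can be mimicked in plain Python (a sketch, not TF API): because upd2 consumes upd1's output value rather than reading the variable directly, there is only one order in which the computation can happen.

```python
# Sketch of the implicit data dependency: upd2 uses upd1's OUTPUT, so
# upd1 must complete before upd2 can even be computed.
state = {"v1": 0, "v2": 0}

def upd1():
    state["v1"] = state["v2"] + 1
    return state["v1"]           # tf.assign yields the post-update value

def upd2():
    state["v2"] = upd1() + 1     # must wait for upd1's result
    return state["v2"]

upd2()
print(state)  # {'v1': 1, 'v2': 2}
```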

But what if you wanted to always compute the new values for v1 and v2 using the non-updated variable values (that is, consistently end up with both v1 and v2 being 1)? In that case you can use tf.control_dependencies:

import tensorflow as tf

v1 = tf.Variable(0)
v2 = tf.Variable(0)
new_v1 = v2 + 1
new_v2 = v1 + 1
with tf.control_dependencies([new_v1, new_v2]):
    upd1 = tf.assign(v1, new_v1)
    upd2 = tf.assign(v2, new_v2)
init = tf.global_variables_initializer()

Here, the assignment operations cannot happen until the new values for v1 and v2 have been computed, so their final values will always be 1 in both cases.
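To see concretely why the control dependencies pin down the result, the executor's freedom can be enumerated in plain Python (a sketch; the step names and run helper are illustrative, not TF API). Each update is split into its read and its write; without control dependencies, any schedule that puts each read before its own write is legal, while the control dependencies additionally force both reads before both writes.

```python
# Enumerate the schedules a graph executor could legally choose for the
# upd1/upd2 example, with each update split into a read and a write step.
from itertools import permutations

def run(schedule):
    state = {"v1": 0, "v2": 0}
    new = {}
    for step in schedule:
        if step == "read1":       # compute v2 + 1 for upd1
            new["v1"] = state["v2"] + 1
        elif step == "write1":    # assign v1
            state["v1"] = new["v1"]
        elif step == "read2":     # compute v1 + 1 for upd2
            new["v2"] = state["v1"] + 1
        elif step == "write2":    # assign v2
            state["v2"] = new["v2"]
    return state["v1"], state["v2"]

steps = ["read1", "write1", "read2", "write2"]
valid = [s for s in permutations(steps)
         if s.index("read1") < s.index("write1")
         and s.index("read2") < s.index("write2")]

# Without control dependencies, any of these outcomes is legal:
print(sorted({run(s) for s in valid}))  # [(1, 1), (1, 2), (2, 1)]

# With the control dependencies above, both reads precede both writes:
constrained = [s for s in valid
               if max(s.index("read1"), s.index("read2"))
               < min(s.index("write1"), s.index("write2"))]
print(sorted({run(s) for s in constrained}))  # [(1, 1)]
```

The unconstrained set reproduces exactly the three outcomes discussed above, and the constrained set collapses to the single deterministic result.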

jdehesa answered Sep 19 '22