Tensorflow: When are variable assignments done in sess.run with a list?

I had assumed that variable assignments are performed only after all operations in the list passed to sess.run have been evaluated, but the following code returns different results on different runs. It seems that the operations in the list are run in a random order, and that each variable is assigned as soon as its operation has run.

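import tensorflow as tf

a = tf.Variable(0)
b = tf.Variable(1)
c = tf.Variable(1)
update_a = tf.assign(a, b + c)
update_b = tf.assign(b, c + a)
update_c = tf.assign(c, a + b)

with tf.Session() as sess:
  # Initialize a, b, and c before running any update.
  sess.run(tf.initialize_all_variables())
  for i in range(5):
    # All three updates are fetched in a single run call.
    a_, b_, c_ = sess.run([update_a, update_b, update_c])
    print(a_, b_, c_)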

I'd like to know the timing of the variable assignments. Which is correct: "update_x -> assign x -> ... -> update_z -> assign z" or "update_x -> update_y -> update_z -> assign a, b, c"? (Here (x, y, z) is a permutation of (a, b, c).) In addition, if there is a way to get the latter behavior (assignments performed only after all operations in the list have been evaluated), please let me know how to do it.

Sarah asked Dec 22 '16 17:12


2 Answers

The three operations update_a, update_b, and update_c have no interdependencies in the dataflow graph, so TensorFlow may choose to execute them in any order. (In the current implementation, it is possible that all three of them will be executed in parallel on different threads.) A second nit is that reads of variables are cached by default, so in your program the value assigned in update_b (i.e. c + a) may use the original or the updated value of a, depending on when the variable is first read.
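To see why the results can vary, here is a minimal plain-Python model (not TensorFlow). It assumes each update reads the current values and writes its variable immediately, and it tries every sequential order for a single step. The names UPDATES and run_in_order are made up for this illustration, and the model ignores read caching and parallel interleavings, which can add further outcomes in practice.

```python
from itertools import permutations

# Each update reads the current state and produces the new value of one variable.
UPDATES = {
    "a": lambda s: s["b"] + s["c"],
    "b": lambda s: s["c"] + s["a"],
    "c": lambda s: s["a"] + s["b"],
}

def run_in_order(order, state):
    """Apply the updates one after another in the given order."""
    state = dict(state)  # work on a copy so every order starts from the same values
    for name in order:
        state[name] = UPDATES[name](state)
    return (state["a"], state["b"], state["c"])

initial = {"a": 0, "b": 1, "c": 1}
results = {order: run_in_order(order, initial) for order in permutations("abc")}

for order, values in results.items():
    print("".join(order), values)
```

In this model the six possible orders yield five different (a, b, c) triples after a single step, so any scheduling freedom is enough to make the output differ from run to run.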

If you want to ensure that the operations happen in a particular order, you can use with tf.control_dependencies([...]): blocks to enforce that operations created within the block happen after operations named in the list. You can use tf.Variable.read_value() inside a with tf.control_dependencies([...]): block to make the point at which the variable is read explicit.

Therefore, if you want to ensure that update_a happens before update_b and update_b happens before update_c, you could do:

update_a = tf.assign(a, b + c)

with tf.control_dependencies([update_a]):
  update_b = tf.assign(b, c + a.read_value())

with tf.control_dependencies([update_b]):
  update_c = tf.assign(c, a.read_value() + b.read_value())
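As a rough check on what the two orderings from the question would produce, the following plain-Python sketch (again not TensorFlow) iterates both for five steps. The first function models the a, then b, then c order enforced above; the second models the "compute everything, then assign" behavior the question asks about. The function names are hypothetical, and the sketch only does the arithmetic; it does not show how to build the second variant as a graph.

```python
def sequential_step(a, b, c):
    # Order a -> b -> c: each update sees the values written before it.
    a = b + c
    b = c + a  # uses the new a
    c = a + b  # uses the new a and the new b
    return a, b, c

def simultaneous_step(a, b, c):
    # All right-hand sides are computed from the old values, then assigned together.
    return b + c, c + a, a + b

def iterate(step, state, n):
    """Apply step n times and collect the state after each application."""
    history = []
    for _ in range(n):
        state = step(*state)
        history.append(state)
    return history

sequential = iterate(sequential_step, (0, 1, 1), 5)
simultaneous = iterate(simultaneous_step, (0, 1, 1), 5)

print(sequential)
print(simultaneous)
```

Both orderings are deterministic, but they diverge after the first step, so it matters which one you actually want before adding control dependencies.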
mrry answered Sep 30 '22 23:09


Based on this example of yours:

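import tensorflow as tf

v = tf.Variable(0)
c = tf.constant(3)
add = tf.add(v, c)
update = tf.assign(v, add)
# Note: tf.mul was renamed to tf.multiply in TensorFlow 1.0.
mul = tf.mul(add, update)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    # The same tensor is fetched twice in one run call.
    res = sess.run([mul, mul])
    print(res)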

Output: [9, 9]

You get [9, 9] and this is in fact what we've asked it to do. Think of it like this:

During the run, once mul is taken from the list, TensorFlow looks up its definition and finds tf.mul(add, update). It therefore needs the value of add, which leads to tf.add(v, c). Plugging in the values of v and c (0 and 3) gives add the value 3.

Ok, now we need the value of update which is defined as tf.assign(v, add). We have values of both add (which it computed just now as 3) & v. So, it updates the value of v to be 3 which is also the value for update.

Now, it has values for both add and update which are 3. Thus, the multiplication yields 9 in mul.

Based on the result we get, I think that for the next item (operation) in the list it simply returns the value of mul it has just computed. I'm not sure whether it performs the steps again, returns the same (cached?) value because it realizes the result is already available, or runs the operations for each element of the list in parallel. Maybe @mrry or @YaroslavBulatov can comment on this part?


Quoting @mrry's comment:

When you call sess.run([x, y, z]) once, TensorFlow executes each op that those tensors depend on one time only (unless there's a tf.while_loop() in your graph). If a tensor appears twice in the list (like mul in your example), TensorFlow will execute it once and return two copies of the result. To run the assignment more than once, you must either call sess.run() multiple times, or use tf.while_loop() to put a loop in your graph.
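The "each op runs once per call" behavior can be pictured with a toy memoized evaluator. This is a plain-Python sketch, not how TensorFlow is implemented. GRAPH, do_assign, and run are names invented for the illustration, and the node layout mirrors the v / add / update / mul example above.

```python
state = {"v": 0}    # storage for the variable v
assign_calls = []   # records every time the assignment actually executes

def do_assign(value):
    assign_calls.append(value)
    state["v"] = value
    return value

# Each node maps to (function, names of its input nodes).
GRAPH = {
    "v": (lambda: state["v"], []),
    "c": (lambda: 3, []),
    "add": (lambda v, c: v + c, ["v", "c"]),
    "update": (do_assign, ["add"]),
    "mul": (lambda add, update: add * update, ["add", "update"]),
}

def run(fetches):
    """Evaluate the requested nodes; each node executes at most once per call."""
    cache = {}

    def evaluate(name):
        if name not in cache:
            fn, inputs = GRAPH[name]
            cache[name] = fn(*[evaluate(i) for i in inputs])
        return cache[name]

    return [evaluate(name) for name in fetches]

res = run(["mul", "mul"])   # the assignment executes once; the result is returned twice
res2 = run(["mul"])         # a second call executes the assignment again
print(res, res2, assign_calls)
```

In this model, fetching mul twice in one call triggers a single assignment, while the second call starts from an empty cache and runs the assignment again. That matches the advice to call run repeatedly (or put a loop in the graph) when the update should happen more than once.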

kmario23 answered Sep 30 '22 21:09