I thought that variable assignments were performed after all operations in the list given to sess.run had finished, but the following code returns different results on different executions. It seems to run the operations in the list in a random order and to assign each variable as soon as its operation has run.
import tensorflow as tf

a = tf.Variable(0)
b = tf.Variable(1)
c = tf.Variable(1)
update_a = tf.assign(a, b + c)
update_b = tf.assign(b, c + a)
update_c = tf.assign(c, a + b)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    for i in range(5):
        a_, b_, c_ = sess.run([update_a, update_b, update_c])
I'd like to know the timing of variable assignments. Which is correct: "update_x -> assign x -> ... -> update_z -> assign z" or "update_x -> update_y -> update_z -> assign a, b, c"? (where (x, y, z) is a permutation of (a, b, c)) In addition, if there is a way to realize the latter behavior (all assignments are deferred until every operation in the list has run), please let me know how to achieve it.
The three operations update_a, update_b, and update_c have no interdependencies in the dataflow graph, so TensorFlow may choose to execute them in any order. (In the current implementation, it is possible that all three of them will be executed in parallel on different threads.) A second nit is that reads of variables are cached by default, so in your program the value assigned in update_b (i.e. c + a) may use the original or the updated value of a, depending on when the variable is first read.
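As a minimal sketch of that nondeterminism (this just reuses the graph from the question; which values you actually see depends on how the runtime happens to schedule the three assigns):

import tensorflow as tf

a = tf.Variable(0)
b = tf.Variable(1)
c = tf.Variable(1)
update_a = tf.assign(a, b + c)
update_b = tf.assign(b, c + a)
update_c = tf.assign(c, a + b)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    # The three assigns race with each other's variable reads.
    print(sess.run([update_a, update_b, update_c]))
    # If every read sees the initial values, this prints [2, 1, 1];
    # if update_a completes before the other two read a, it can
    # print [2, 3, 5]; other interleavings are possible as well.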
If you want to ensure that the operations happen in a particular order, you can use with tf.control_dependencies([...]): blocks to enforce that operations created within the block happen after the operations named in the list. You can use tf.Variable.read_value() inside a with tf.control_dependencies([...]): block to make the point at which the variable is read explicit.
Therefore, if you want to ensure that update_a happens before update_b and update_b happens before update_c, you could do:
update_a = tf.assign(a, b + c)
with tf.control_dependencies([update_a]):
    update_b = tf.assign(b, c + a.read_value())
with tf.control_dependencies([update_b]):
    update_c = tf.assign(c, a.read_value() + b.read_value())
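The question also asks for the opposite behavior, where every assignment is deferred until all of the right-hand sides have been computed from the old values. One way to sketch that with the same mechanism (new_a, new_b, and new_c are helper tensors introduced here for illustration) is to compute the new values first and make every assignment wait on all of them:

new_a = b + c
new_b = c + a
new_c = a + b
# No assign may start until all three sums have been computed from
# the old variable values, so no update leaks into another's read.
with tf.control_dependencies([new_a, new_b, new_c]):
    update_a = tf.assign(a, new_a)
    update_b = tf.assign(b, new_b)
    update_c = tf.assign(c, new_c)

With the initial values a = 0, b = 1, c = 1, the first sess.run([update_a, update_b, update_c]) on this graph should then deterministically return [2, 1, 1].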
Based on this example of yours,
import tensorflow as tf

v = tf.Variable(0)
c = tf.constant(3)
add = tf.add(v, c)
update = tf.assign(v, add)
mul = tf.mul(add, update)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    # mul appears twice in the fetch list.
    res = sess.run([mul, mul])
    print(res)
Output: [9, 9]

You get [9, 9], and this is in fact what we've asked it to do. Think of it like this:
During the run, once mul is taken from the list, TensorFlow looks up its definition and finds tf.mul(add, update). It then needs the value of add, which leads to tf.add(v, c). So it plugs in the values of v and c and gets 3 as the value of add.

Next, it needs the value of update, which is defined as tf.assign(v, add). It has the values of both add (which it just computed as 3) and v, so it updates the value of v to 3, which is also the value of update.

Now it has the values of both add and update, which are both 3, so the multiplication yields 9 in mul.
Based on the result we get, I think that for the next item (operation) in the list, it just returns the value of mul it has just computed. I'm not sure whether it performs the steps again, or returns the same (cached?) value it just computed for mul on realizing that we already have the result, or whether these operations happen in parallel (one per element in the list). Maybe @mrry or @YaroslavBulatov can comment on this part, please?
Quoting @mrry's comment:

When you call sess.run([x, y, z]) once, TensorFlow executes each op that those tensors depend on one time only (unless there's a tf.while_loop() in your graph). If a tensor appears twice in the list (like mul in your example), TensorFlow will execute it once and return two copies of the result. To run the assignment more than once, you must either call sess.run() multiple times, or use tf.while_loop() to put a loop in your graph.
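A quick way to convince yourself of this is to check the variable between calls, sketched here against the same example (the printed values assume the initial state above):

import tensorflow as tf

v = tf.Variable(0)
c = tf.constant(3)
add = tf.add(v, c)
update = tf.assign(v, add)
mul = tf.mul(add, update)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    print(sess.run([mul, mul]))  # [9, 9]: update executed only once
    print(sess.run(v))           # 3, not 6, confirming a single assignment
    # A second sess.run call re-executes the whole subgraph:
    print(sess.run([mul, mul]))  # [36, 36]: add is now 3 + 3 = 6
    print(sess.run(v))           # 6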