
Overriding device scope in Tensorflow

Tags:

tensorflow

How is device scope handled in the following situation, where an outer device scope is overridden by an inner device scope:

with tf.device("/cpu:0"):
    a = function1()

    with tf.device("/gpu:0"):
        b = function2()

    with tf.device("/gpu:1"):
        c = function3()

    d = a+b+c

My intuition is as follows:

1) "a" is computed first on "cpu:0"

2) "b" and "c" are computed in parallel on "gpu:0" and "gpu:1", respectively.

3) "d" waits for "b" and "c" because it depends on them, and when their values become available, "d" gets computed on "cpu:0"

Is my intuition correct?

asked Dec 19 '15 by jstaker7

2 Answers

Mostly, with a few subtle points:

(a) "b" and "c" could be computed in parallel, provided there are no control flow or data dependencies between them. But whether they actually execute at the same time is not something you can predict from this example. (I assume that was already obvious, but I wanted to be sure it was to others who might read this later.)

Note also that as specified, b and c don't explicitly depend on a, so it's possible that all three of them could be executed concurrently. It is not the case that a must be executed first.
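To make the dependency structure concrete, here is a plain-Python sketch (no TensorFlow involved; the values 1, 2, 3 and the thread pool are just stand-ins for the runtime's executor). a, b, and c have no dependencies on each other, so they may all run concurrently; d blocks until all three inputs are ready:

```python
from concurrent.futures import ThreadPoolExecutor

def run_graph():
    with ThreadPoolExecutor(max_workers=3) as pool:
        fa = pool.submit(lambda: 1)  # "a", pinned to /cpu:0 in the question
        fb = pool.submit(lambda: 2)  # "b", pinned to /gpu:0
        fc = pool.submit(lambda: 3)  # "c", pinned to /gpu:1
        # d = a + b + c: .result() blocks until each input is available
        return fa.result() + fb.result() + fc.result()

print(run_graph())  # 6
```

The devices here are only comments on the stand-in lambdas; the point is the scheduling shape, not the arithmetic.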

(b) By default, if you don't supply any configuration options, device placement is "soft" -- the runtime can override things if the op can't be executed on the specific device. For example, a CPU-only op could be moved from a GPU back to /cpu:0; or an op pinned to /gpu:1 could be moved to /gpu:0 if the graph was run on a machine that had only a single GPU.

You can control the hard-vs-soft placement by supplying a configuration to the tf.Session:

with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    sess.run(d)
answered by dga


Yes.

PS, to check your intuition, you could do something like this:

with tf.device("/cpu:0"):
  a = tf.placeholder(dtype=tf.int32, name="a")
  with tf.device("/gpu:0"):
    b = tf.placeholder(dtype=tf.int32, name="b")
    with tf.device("/gpu:1"):
      c = tf.placeholder(dtype=tf.int32, name="c")
      d = a+b+c

print(d.graph.as_graph_def())

This prints the underlying graph definition that the TensorFlow system will run; each node's `device` field shows where that op was placed.

answered by Yaroslav Bulatov