
In distributed TensorFlow, is it possible to share the same queue across different workers?

Tags: tensorflow

In TensorFlow, I want to share a filename queue across workers running on different machines, so that each machine gets a subset of the files to train on. I have searched a lot, and it seems that only variables can be placed on a PS task to be shared. Does anyone have an example? Thanks.

kopopt asked Sep 19 '16 17:09




1 Answer

It is possible to share the same queue across workers by setting the optional shared_name argument when creating the queue. Every worker that creates a queue with the same shared_name on the same device refers to the same underlying queue. Just as with tf.Variable objects, you can place the queue on any device that can be accessed from different workers. For example:

with tf.device("/job:ps/task:0"):  # Place queue on parameter server.
  q = tf.FIFOQueue(..., shared_name="shared_queue")

A few notes:

  • The value for shared_name must be unique to the particular queue that you are sharing. Unfortunately, the Python API does not currently use scoping or automatic name uniqification to make this easier, so you will have to ensure this manually.

  • You do not need to place the queue on a parameter server. One possible configuration would be to set up an additional "input job" (e.g. "/job:input") containing a set of tasks that perform pre-processing, and export a shared queue for the workers to use.
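The two notes above can be sketched together as follows. This is only a minimal sketch, assuming the TensorFlow 1.x graph-mode API (reached through tensorflow.compat.v1 on newer installs) and a cluster that already defines an "input" job. The helper names shared_queue_name and build_filename_queue, the "train"/"filenames" labels, and the capacity of 100 are hypothetical and not part of the original answer.

```python
# Names already handed out while building this process's graph.
_used_shared_names = set()


def shared_queue_name(prefix, purpose):
    """Build a shared_name by hand and refuse to hand out the same one twice.

    Hypothetical helper: TensorFlow does not uniquify shared_name for you,
    so this only guards against accidental reuse within one program.
    """
    name = "{}_{}".format(prefix, purpose)
    if name in _used_shared_names:
        raise ValueError("shared_name already in use: " + name)
    _used_shared_names.add(name)
    return name


def build_filename_queue(shared_name, device="/job:input/task:0", capacity=100):
    """Create a string FIFOQueue pinned to `device` (assumed to exist in the cluster).

    Call this while building a graph, TF 1.x style.
    """
    import tensorflow.compat.v1 as tf  # lazy import; assumes TF 1.x-style API

    with tf.device(device):
        return tf.FIFOQueue(capacity, dtypes=[tf.string], shapes=[[]],
                            shared_name=shared_name)
```

Each worker would call build_filename_queue(shared_queue_name("train", "filenames")) with the same arguments, so that the names match across processes, and then run the queue's dequeue() op to pull its next filename.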

mrry answered Sep 27 '22 23:09
