In TensorFlow, I want to have a filename queue shared across different workers on different machines, so that each machine can get a subset of the files to train on. I have searched a lot, and it seems that only variables can be placed on a PS task to be shared. Does anyone have an example? Thanks.
TensorFlow supports distributed computing, allowing portions of the graph to be computed on different processes, which may be running on completely different servers. This can also be used to distribute computation to servers with powerful GPUs while other computations are done on servers with more memory, and so on.
The TensorFlow runtime also parallelizes graph execution across many different dimensions: the individual ops have parallel implementations, using multiple cores in a CPU or many threads on a GPU.
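To make that concrete, here is a minimal sketch (not from the question) of how a cluster is described and how an op can be pinned to a particular task; the job layout, host names, and ports are placeholders.

import tensorflow as tf

# Describe the cluster once; every process uses the same spec.
# Host:port values below are placeholders.
cluster = tf.train.ClusterSpec({
    "ps":     ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
})

# Each process starts a server for its own job/task, e.g. worker 0:
server = tf.train.Server(cluster, job_name="worker", task_index=0)

# Ops can be pinned to any task in the cluster; this variable lives in the
# parameter-server process, not in the worker that defines it.
with tf.device("/job:ps/task:0"):
    global_step = tf.Variable(0, name="global_step", trainable=False)

# A session connected to any server in the cluster can run the whole graph:
# sess = tf.Session(server.target)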
It is possible to share the same queue across workers by setting the optional shared_name argument when creating the queue. Just as with tf.Variable objects, you can place the queue on any device that can be accessed from the different workers. For example:
with tf.device("/job:ps/task:0"):  # Place the queue on the parameter server.
    q = tf.FIFOQueue(..., shared_name="shared_queue")
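To show how each worker would then use it, here is a hedged sketch; the cluster layout, host address, capacity, and file names are illustrative, not part of the answer. Every worker builds the identical queue node (same device, same shared_name), one task enqueues the filenames, and each dequeue hands a filename to exactly one worker.

import tensorflow as tf

with tf.device("/job:ps/task:0"):  # queue lives in the PS process
    filename_queue = tf.FIFOQueue(
        capacity=100, dtypes=[tf.string], shared_name="shared_queue")

# Built by every worker; because the device and shared_name match, they all
# refer to the same underlying queue.
enqueue_filenames = filename_queue.enqueue_many(
    [["train-0.tfrecords", "train-1.tfrecords", "train-2.tfrecords"]])
next_filename = filename_queue.dequeue()

# Target any server in the cluster (placeholder address).
with tf.Session("grpc://worker0.example.com:2222") as sess:
    sess.run(enqueue_filenames)     # typically only the chief does this
    print(sess.run(next_filename))  # each worker gets a distinct filename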
A few notes:
The value for shared_name must be unique to the particular queue that you are sharing. Unfortunately, the Python API does not currently use scoping or automatic name uniquification to make this easier, so you will have to ensure uniqueness manually.
You do not need to place the queue on a parameter server. One possible configuration would be to set up an additional "input job" (e.g. "/job:input") containing a set of tasks that perform pre-processing and export a shared queue for the workers to use, as in the sketch after this list.
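A rough illustration of that second note follows; the job name, hosts, and capacity are assumptions, not part of the answer.

import tensorflow as tf

cluster = tf.train.ClusterSpec({
    "input":  ["input0.example.com:2222"],  # pre-processing tasks
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
})

# The queue is exported by the input job rather than by a parameter server.
with tf.device("/job:input/task:0"):
    work_queue = tf.FIFOQueue(
        capacity=1000, dtypes=[tf.string], shared_name="shared_queue")

# Input tasks enqueue pre-processed work items; workers only dequeue.
next_item = work_queue.dequeue()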