Recently, I try to learn how to use Tensorflow on multiple GPU to accelerate training speed. I found an official tutorial about training classification model based on Cifar10 dataset. However, I found that this tutorial reads image by using the queue. Out of curiosity, how can I use multiple GPU by feeding value into Session? It seems that it is hard for me to solve the problem that feeds different value from the same dataset to different GPU. Thank you, everybody! The following code is about part of the official tutorial.
images, labels = cifar10.distorted_inputs()
batch_queue = tf.contrib.slim.prefetch_queue.prefetch_queue(
[images, labels], capacity=2 * FLAGS.num_gpus)
# Calculate the gradients for each model tower.
tower_grads = []
with tf.variable_scope(tf.get_variable_scope()):
for i in xrange(FLAGS.num_gpus):
with tf.device('/gpu:%d' % i):
with tf.name_scope('%s_%d' % (cifar10.TOWER_NAME, i)) as scope:
# Dequeues one batch for the GPU
image_batch, label_batch = batch_queue.dequeue()
# Calculate the loss for one tower of the CIFAR model. This function
# constructs the entire CIFAR model but shares the variables across
# all towers.
loss = tower_loss(scope, image_batch, label_batch)
# Reuse variables for the next tower.
tf.get_variable_scope().reuse_variables()
# Retain the summaries from the final tower.
summaries = tf.get_collection(tf.GraphKeys.SUMMARIES, scope)
# Calculate the gradients for the batch of data on this CIFAR tower.
grads = opt.compute_gradients(loss)
# Keep track of the gradients across all towers.
tower_grads.append(grads)
The core idea of the multi-GPU example is that you explicitly assign operations to a tf.device
. The example loops over FLAGS.num_gpus
devices and creates a replica for each of the GPUs.
If you create placeholder ops inside the for loop, they will get assigned to their respective devices. All you need to do is keep handles to the created placeholders and then feed them all independently in a single session.run
call.
placeholders = []
for i in range(FLAGS.num_gpus):
with tf.device('/gpu:%d' % i):
plc = tf.placeholder(tf.int32)
placeholders.append(plc)
with tf.Session() as sess:
fd = {plc: i for i, plc in enumerate(placeholders)}
sess.run(sum(placeholders), feed_dict=fd) # this should give you the sum of all
# numbers from 0 to FLAGS.num_gpus - 1
To address your specific example, it should suffice to replace the batch_queue.dequeue()
call with the construction of two placeholders (for image_batch
and label_batch
tensors), store these placeholders somewhere, and then feed the values you need to those.
Another (somewhat hacky) way is to override the image_batch
and label_batch
tensors directly in the session.run
call, because you can feed_dict any tensor (not just a placeholder). You will still need to store the tensors somewhere to be able to reference them from the run
call.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With