Is there a way to parallelize stacked RNNs over multiple GPUs in TensorFlow?

Is it possible to take the output of a tf.scan operation and stream it directly to a different GPU, effectively running two stacked RNNs on two GPUs in parallel? Something like this:

cell1 = tf.nn.rnn_cell.MultiRNNCell(...)
cell2 = tf.nn.rnn_cell.MultiRNNCell(...)

with tf.device("/gpu:0"):
  # The accumulator a is an (output, state) pair; a[1] threads the
  # previous state back into the cell at each step.
  ys1 = tf.scan(lambda a, x: cell1(x, a[1]), inputs,
          initializer=(tf.zeros([batch_size, state_size]), init_state))

with tf.device("/gpu:1"):
  # ys1[0] holds the per-step outputs of the first stack, which
  # become the inputs to the second stack.
  ys2 = tf.scan(lambda a, x: cell2(x, a[1]), ys1[0],
          initializer=(tf.zeros([batch_size, state_size]), init_state))

Will TensorFlow automatically take care of that optimization, or will the second scan block until the tensor ys1 is fully computed?

asked Aug 29 '16 by Lenar Hoyt



1 Answer

Unfortunately, tf.scan has a "boundary" at its output: all iterations have to complete before the output tensor can be read by downstream operations. However, you can run the different layers of your LSTM stack on different GPUs and still get frame-level parallelism within a single scan. Write your own version of MultiRNNCell that places each LSTM layer on its own device.
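
For illustration, here is a minimal sketch of that idea against the TF 1.x tf.nn.rnn_cell API. The class name MultiDeviceRNNCell and its constructor arguments are made up for this example, not part of TensorFlow:

import tensorflow as tf

class MultiDeviceRNNCell(tf.nn.rnn_cell.RNNCell):
  """Hypothetical MultiRNNCell variant that pins each layer to a device.

  While layer i computes frame t on devices[i], layer i-1 can already
  work on frame t+1, giving pipeline parallelism across the stack.
  """

  def __init__(self, cells, devices):
    self._cells = cells
    self._devices = devices

  @property
  def state_size(self):
    return tuple(cell.state_size for cell in self._cells)

  @property
  def output_size(self):
    return self._cells[-1].output_size

  def __call__(self, inputs, state, scope=None):
    cur_inp = inputs
    new_states = []
    for i, (cell, device) in enumerate(zip(self._cells, self._devices)):
      with tf.device(device), tf.variable_scope("cell_%d" % i):
        cur_inp, new_state = cell(cur_inp, state[i])
        new_states.append(new_state)
    return cur_inp, tuple(new_states)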

Also, you probably want to use tf.nn.dynamic_rnn instead of tf.scan; it threads the state through the time steps for you.
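
A rough usage sketch combining the two suggestions (the layer sizes, input shape, and device strings are placeholders, not from the answer):

cells = [tf.nn.rnn_cell.LSTMCell(256) for _ in range(2)]
stack = MultiDeviceRNNCell(cells, devices=["/gpu:0", "/gpu:1"])

# inputs: [batch_size, max_time, input_depth]
inputs = tf.placeholder(tf.float32, [None, None, 128])
outputs, final_state = tf.nn.dynamic_rnn(stack, inputs, dtype=tf.float32)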

answered Oct 03 '22 by Eugene Brevdo