In TensorFlow, what is the difference between tf.nn.static_rnn and tf.nn.dynamic_rnn, and when should I use them?
Both take a sequence_length argument that adapts the computation to the actual length of the input; it is not as if static_rnn is limited to fixed-size inputs, right?
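For concreteness, here is a minimal sketch of what I mean (TF 1.x API, sizes and scope names made up by me); both calls accept sequence_length, the difference is only that static_rnn wants a Python list of per-time-step tensors instead of one [batch, time, depth] tensor:

```python
import tensorflow as tf

max_time, input_size, num_units = 200, 50, 128

inputs = tf.placeholder(tf.float32, [None, max_time, input_size])  # [batch, time, depth]
seq_len = tf.placeholder(tf.int32, [None])                         # true length of each example

# static_rnn: needs a Python list with one [batch, depth] tensor per time step
static_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
static_inputs = tf.unstack(inputs, num=max_time, axis=1)
static_out, static_state = tf.nn.static_rnn(
    static_cell, static_inputs, sequence_length=seq_len, dtype=tf.float32,
    scope="static")

# dynamic_rnn: takes the batched [batch, time, depth] tensor directly
# (separate cell here just to keep the two variable scopes independent)
dynamic_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
dynamic_out, dynamic_state = tf.nn.dynamic_rnn(
    dynamic_cell, inputs, sequence_length=seq_len, dtype=tf.float32,
    scope="dynamic")
```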
dynamic_rnn has the following extra arguments:
parallel_iterations
swap_memory
time_major
But I suppose these are only minor differences.
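As I understand them (rough sketch below, TF 1.x, shapes made up by me): time_major=True expects the input laid out as [time, batch, depth] rather than [batch, time, depth], swap_memory=True lets the underlying loop swap activations from GPU to host RAM on long sequences, and parallel_iterations caps how many loop iterations may run in parallel:

```python
import tensorflow as tf

cell = tf.nn.rnn_cell.BasicLSTMCell(128)
inputs_tm = tf.placeholder(tf.float32, [200, None, 50])  # [time, batch, depth]
seq_len = tf.placeholder(tf.int32, [None])

outputs, state = tf.nn.dynamic_rnn(
    cell, inputs_tm,
    sequence_length=seq_len,
    dtype=tf.float32,
    time_major=True,          # avoids an extra transpose on the way in and out
    swap_memory=True,         # trades speed for GPU memory during backprop
    parallel_iterations=64)   # bounds parallelism of the internal loop
```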
So what is the main difference between tf.nn.static_rnn and tf.nn.dynamic_rnn, and when should we use one over the other?
This is still a useful resource (despite being written a couple years ago): http://www.wildml.com/2016/08/rnns-in-tensorflow-a-practical-guide-and-undocumented-features/
In it, Denny Britz has the following comment on the static/dynamic issue:
Static
Internally, tf.nn.rnn creates an unrolled graph for a fixed RNN length. That means, if you call tf.nn.rnn with inputs having 200 time steps you are creating a static graph with 200 RNN steps. First, graph creation is slow. Second, you're unable to pass in longer sequences (> 200) than you've originally specified.
Dynamic
tf.nn.dynamic_rnn solves this. It uses a tf.While loop to dynamically construct the graph when it is executed. That means graph creation is faster and you can feed batches of variable size.
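To make that last point concrete, here is a small sketch (TF 1.x, shapes made up by me) of the practical consequence: a single dynamic_rnn graph can be fed batches whose time dimension differs from run to run, which a statically unrolled graph cannot do because tf.unstack needs a fixed number of steps at graph-construction time:

```python
import numpy as np
import tensorflow as tf

cell = tf.nn.rnn_cell.GRUCell(64)
inputs = tf.placeholder(tf.float32, [None, None, 50])  # time dimension left unknown
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Two batches with different numbers of time steps, same graph:
    sess.run(state, {inputs: np.random.randn(8, 150, 50).astype(np.float32)})
    sess.run(state, {inputs: np.random.randn(8, 300, 50).astype(np.float32)})
```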
In general he concludes that there is no real benefit in using tf.nn.static_rnn and that for most cases you'll want to resort to tf.nn.dynamic_rnn.
For what it's worth, I've had the same experience myself.