What's state_size of a MultiRNNCell in TensorFlow?

Tags:

tensorflow

With the config as below for a BasicLSTM cell:

...
num_layers = 2 
num_steps = 10 
hidden_size = 200    
...

I use a model with 2 hidden layers:

lstm_cell = rnn_cell.BasicLSTMCell(hidden_size, forget_bias=0.0) 
cell = rnn_cell.MultiRNNCell([lstm_cell] * 2)

What is the cell.state_size?

I got it as 30 x 800, but I can't understand where that comes from.

PS: see the source code at https://github.com/tensorflow/tensorflow/blob/97f585d506cccc57dc98f234f4d5fcd824dd3c03/tensorflow/python/ops/rnn_cell.py#L353

It seems to return state_size as 2 * the unit size. But why should the state size be twice the unit size?

zshtom asked Apr 20 '16 02:04


1 Answer

For a single BasicLSTMCell, the state in your case is a tuple (c=200, h=200): c is the cell state with 200 units (neurons) and h is the hidden state with 200 units.

To understand this, consider a vanilla RNN cell. It has only one hidden state, which is passed from one time step to the next. This is the case for BasicRNNCell as implemented in TensorFlow: its state_size is a single integer, 200 (just h), if you construct it as tf.nn.rnn_cell.BasicRNNCell(200).

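LSTM adds an extra cell state for long-term memory, the same size as the hidden state, so the overall state for one LSTM cell is 2 x 200 = 400. MultiRNNCell stacks two such cells, each with its own state, so cell.state_size is 2 x 400 = 800. The 30 in your 30 x 800 is presumably the batch size, which is not shown in your config.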
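
To make the arithmetic concrete, here is a minimal plain-Python sketch. It is not TensorFlow code: the three helper functions are hypothetical and only mirror the sizes described above. It assumes the states of stacked layers are concatenated into one vector, as in the rnn_cell.py version linked in the question, and that batch_size = 30 explains the first dimension you observed.

```python
# Plain-Python sketch of the state_size arithmetic; none of these helpers
# are TensorFlow functions, they only mirror the sizes described above.

def basic_rnn_state_size(num_units):
    # A vanilla RNN carries only the hidden state h.
    return num_units


def basic_lstm_state_size(num_units):
    # An LSTM carries the cell state c and the hidden state h,
    # each num_units wide, so the concatenated state is twice as wide.
    return 2 * num_units


def multi_rnn_state_size(per_layer_state_sizes):
    # Each stacked layer keeps its own state; concatenated, the sizes add up.
    return sum(per_layer_state_sizes)


hidden_size = 200
num_layers = 2
batch_size = 30  # assumption: not shown in the question's config

rnn_size = basic_rnn_state_size(hidden_size)
lstm_size = basic_lstm_state_size(hidden_size)
stacked_size = multi_rnn_state_size([lstm_size] * num_layers)
state_shape = (batch_size, stacked_size)

print(rnn_size, lstm_size, stacked_size, state_shape)
```

With the values from the question this gives 200 for the vanilla RNN, 400 for one LSTM cell, and 800 for the two-layer stack, so a state tensor for a batch of 30 would have shape 30 x 800.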

The introduction of this paper might be helpful.


I have to say the TensorFlow docs are a bit too concise for beginners.

Luke Guye answered Sep 23 '22 07:09