With the config below for a BasicLSTMCell:
...
num_layers = 2
num_steps = 10
hidden_size = 200
...
I use a 2-hidden-layer model:
lstm_cell = rnn_cell.BasicLSTMCell(hidden_size, forget_bias=0.0)
cell = rnn_cell.MultiRNNCell([lstm_cell] * 2)
What is cell.state_size?
I got it as 30 x 800, but I can't understand how it comes to be that.
PS: referring to the source code at https://github.com/tensorflow/tensorflow/blob/97f585d506cccc57dc98f234f4d5fcd824dd3c03/tensorflow/python/ops/rnn_cell.py#L353, it seems to return state_size as 2 * num_units. But why should the state size be twice the unit size?
For a single BasicLSTMCell, the state is a tuple of (c=200, h=200) in your case. c is the cell state of 200 units (neurons) and h is the hidden state of 200 units.
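A quick way to see this is to ask the cell directly. Here is a minimal sketch, assuming the TensorFlow 1.x tf.nn.rnn_cell API (the rnn_cell module in your snippet is an older alias for the same classes):

import tensorflow as tf

hidden_size = 200

# Default (state_is_tuple=True): the state is an LSTMStateTuple of c and h.
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size, forget_bias=0.0)
print(lstm_cell.state_size)  # LSTMStateTuple(c=200, h=200)

# Legacy representation: c and h are concatenated into one vector,
# which is where the "2 * num_units" in the source you linked comes from.
legacy_cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size, forget_bias=0.0,
                                           state_is_tuple=False)
print(legacy_cell.state_size)  # 400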
To understand this, consider a vanilla RNN cell. It has only one hidden state being passed from one time step to the next. This is the case for BasicRNNCell as implemented in TensorFlow. Its state_size is a single integer, 200, if you do tf.nn.rnn_cell.BasicRNNCell(200).
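For comparison, a sketch of the vanilla case (same TF 1.x API assumption as above):

import tensorflow as tf

# A vanilla RNN cell carries only the hidden vector h between time steps,
# so its state_size is simply the number of units.
rnn_cell = tf.nn.rnn_cell.BasicRNNCell(200)
print(rnn_cell.state_size)  # 200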
LSTM adds an extra cell state for long-term memory, with the same size as the hidden state, so the overall state for one LSTM layer is 2 x 200 = 400. Since you stack two such layers with MultiRNNCell, the combined state size is 2 x 400 = 800, which is the 800 you saw (the 30 is presumably your batch size, coming from the shape of the state tensor you printed rather than from state_size itself).
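For the stacked case, here is a sketch of how the 800 arises with MultiRNNCell (again assuming the TF 1.x API and your hidden_size of 200):

import tensorflow as tf

hidden_size = 200

# Two stacked LSTM layers, each carrying (c=200, h=200).
cells = [tf.nn.rnn_cell.BasicLSTMCell(hidden_size, forget_bias=0.0)
         for _ in range(2)]
multi_cell = tf.nn.rnn_cell.MultiRNNCell(cells)
print(multi_cell.state_size)
# (LSTMStateTuple(c=200, h=200), LSTMStateTuple(c=200, h=200))

# Legacy concatenated representation: 2 layers * (200 + 200) = 800.
legacy_cells = [tf.nn.rnn_cell.BasicLSTMCell(hidden_size, forget_bias=0.0,
                                             state_is_tuple=False)
                for _ in range(2)]
legacy_multi = tf.nn.rnn_cell.MultiRNNCell(legacy_cells, state_is_tuple=False)
print(legacy_multi.state_size)  # 800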
The introduction of this paper might be helpful.
I have to say the TensorFlow documentation is a bit too concise for beginners.