
Cannot stack LSTM with MultiRNNCell and dynamic_rnn

I am trying to build a multivariate time series prediction model. I followed this tutorial on temperature prediction: http://nbviewer.jupyter.org/github/addfor/tutorials/blob/master/machine_learning/ml16v04_forecasting_with_LSTM.ipynb

I want to extend that model to a multilayer LSTM model using the following code:

cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers, state_is_tuple=True)
output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)

but I get an error saying:

ValueError: Dimensions must be equal, but are 256 and 142 for 'rnn/while/rnn/multi_rnn_cell/cell_0/cell_0/lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [?,256], [142,512].

When I tried this:

cell = []
for i in range(num_layers):
    cell.append(tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True))
cell = tf.contrib.rnn.MultiRNNCell(cell, state_is_tuple=True)
output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)

I do not get that error, but the prediction is really bad.

I define hidden=128.

features = tf.reshape(features, [-1, n_steps, n_input]) has shape (?, 1, 14) in the single-layer case.

My data look like this: x.shape=(594, 14), y.shape=(591, 1).

I am confused about how to stack LSTM cells in TensorFlow. My TensorFlow version is 0.14.

zdarktknight asked Nov 18 '17




1 Answer

This is a very interesting question. Initially, I thought the two snippets would produce the same result (i.e., stacking two LSTM cells).

code 1

cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
cell = [cell] * num_layers
print(cell)  # the same LSTMCell object repeated num_layers times
cell = tf.contrib.rnn.MultiRNNCell(cell, state_is_tuple=True)

code 2

cell = []
for i in range(num_layers):
    cell.append(tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True))
print(cell)  # a list of distinct LSTMCell objects
cell = tf.contrib.rnn.MultiRNNCell(cell, state_is_tuple=True)

However, printing the cell lists in the two cases produces something like the following.

code 1

[<tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>, <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>]

code 2

[<tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>, <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D708B00>]

If you observe the results closely:

  • Code 1 prints a list of two LSTM cell objects in which one is a copy of the other (the two objects share the same memory address).
  • Code 2 prints a list of two distinct LSTM cell objects (the two addresses differ).
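This aliasing can be checked in plain Python, independent of TensorFlow (a sketch using a hypothetical stub class in place of a real LSTM cell):

```python
class CellStub:
    """Hypothetical stand-in for an LSTM cell (not a real TensorFlow class)."""
    def __init__(self, hidden):
        self.hidden = hidden

num_layers = 2

# Code 1 style: the same object repeated num_layers times.
shared = [CellStub(128)] * num_layers
# Code 2 style: a fresh object per layer.
separate = [CellStub(128) for _ in range(num_layers)]

print(shared[0] is shared[1])      # True: one cell, one set of weights
print(separate[0] is separate[1])  # False: independent cells
```

List multiplication copies references, not objects, which is why code 1 ends up asking one set of weights to serve both layers.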

Stacking two LSTM cells works as sketched in the diagram below.

(diagram: the output sequence of LSTM cell 1 feeds into LSTM cell 2 as its input sequence)

Therefore, looking at the big picture (the actual TensorFlow operations may differ), stacking does the following:

  1. First, map the inputs to the hidden units of LSTM cell 1 (in your case, 14 to 128).
  2. Second, map the hidden units of LSTM cell 1 to the hidden units of LSTM cell 2 (in your case, 128 to 128).
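The shapes in the error message line up with these two steps, assuming the common LSTM kernel layout of [input_dim + hidden, 4 * hidden] (input and previous state concatenated, projected onto the four gates):

```python
n_input, hidden = 14, 128

# Layer 1 kernel: built for 14-dimensional inputs.
layer1_kernel = (n_input + hidden, 4 * hidden)  # (142, 512), as in the error message
# Layer 2 would need a kernel built for 128-dimensional inputs.
layer2_kernel = (hidden + hidden, 4 * hidden)   # (256, 512)

# Reusing layer 1's kernel for layer 2 multiplies a [?, 256] input against a
# [142, 512] matrix: "Dimensions must be equal, but are 256 and 142".
print(layer1_kernel, layer2_kernel)
```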

Therefore, when you try to perform both operations with the same copy of the LSTM cell, you get an error, because the two steps require weight matrices of different dimensions.

However, if you set the number of hidden units equal to the number of input units (in your case, input 14 and hidden 14), there is no error even though you reuse the same LSTM cell, because the weight matrices then have identical dimensions.
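Assuming the same [input_dim + hidden, 4 * hidden] kernel layout, a quick check shows why reusing the cell happens to fit dimensionally when hidden equals the input size (a sketch of the shape arithmetic, not the actual TensorFlow internals):

```python
n_input = hidden = 14

# With hidden == n_input, both layers need a kernel of the same shape.
layer1_kernel = (n_input + hidden, 4 * hidden)  # (28, 56)
layer2_kernel = (hidden + hidden, 4 * hidden)   # (28, 56): identical, so reuse fits
print(layer1_kernel == layer2_kernel)  # True
```

The shapes match, so no error is raised, but the two layers would still share one set of weights, which is rarely what you want.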

Therefore, your second approach is the correct way to stack two LSTM cells.

Nipun Wijerathne answered Nov 28 '22