I am using the deep learning library Keras and trying to stack multiple LSTM layers, with no luck. Below is my code:
model = Sequential()
model.add(LSTM(100, input_shape=(time_steps, vector_size)))
model.add(LSTM(100))
The above code raises an error at the third line:

Exception: Input 0 is incompatible with layer lstm_28: expected ndim=3, found ndim=2
The input X is a tensor of shape (100, 250, 50). I am running Keras on the TensorFlow backend.
The Solution. Add return_sequences=True to every LSTM layer except the last one, so that each layer's output tensor has ndim=3 (i.e. batch size, timesteps, hidden state). Setting this flag to True tells Keras that the LSTM should return its hidden state for every time step rather than only the final one, which produces the 3D output the next LSTM layer expects.
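A minimal sketch of the fix applied to the code from the question; the time_steps and vector_size values are assumed from the stated input shape (100, 250, 50):

from keras.models import Sequential
from keras.layers import LSTM

time_steps, vector_size = 250, 50  # assumed from the input shape (100, 250, 50)

model = Sequential()
# return_sequences=True: this layer outputs (batch, time_steps, 100), i.e. ndim=3
model.add(LSTM(100, return_sequences=True, input_shape=(time_steps, vector_size)))
# the last LSTM keeps the default return_sequences=False and outputs (batch, 100)
model.add(LSTM(100))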
"Stacking LSTM hidden layers makes the model deeper, more accurately earning the description as a deep learning technique ... The additional hidden layers are understood to recombine the learned representation from prior layers and create new representations at high levels of abstraction.
The original LSTM model comprises a single hidden LSTM layer followed by a standard feedforward output layer. The Stacked LSTM extends this model with multiple hidden LSTM layers, where each layer contains multiple memory cells.
Implementing Stacked LSTMs in Keras. Each LSTM memory cell requires a 3D input. When an LSTM processes one input sequence of time steps, each memory cell will output a single value for the whole sequence as a 2D array. We can demonstrate this below with a model whose single hidden LSTM layer is also the output layer.
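Here is a minimal sketch of that demonstration; the three-step input sequence and its values are made up for illustration:

from numpy import array
from keras.models import Sequential
from keras.layers import LSTM

# a model whose single hidden LSTM layer is also the output layer
model = Sequential()
model.add(LSTM(1, input_shape=(3, 1)))
model.compile(optimizer='adam', loss='mse')

# one sequence of three time steps, one feature per step
data = array([0.1, 0.2, 0.3]).reshape((1, 3, 1))

# prediction has shape (1, 1): a single value for the whole sequence (2D)
print(model.predict(data).shape)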
To stack LSTM layers, we need to change the configuration of the prior LSTM layer to output a 3D array as input for the subsequent layer. We can do this by setting the return_sequences argument on the layer to True (defaults to False). This will return one output for each input time step and provide a 3D array.
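Changing the toy model above to set return_sequences=True is a quick sketch of the difference in output shape:

from numpy import array
from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(1, return_sequences=True, input_shape=(3, 1)))
model.compile(optimizer='adam', loss='mse')

data = array([0.1, 0.2, 0.3]).reshape((1, 3, 1))

# prediction now has shape (1, 3, 1): one output per time step (3D)
print(model.predict(data).shape)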
We can continue to add hidden LSTM layers as long as the prior LSTM layer provides a 3D output as input for the subsequent layer; for example, below is a Stacked LSTM with 4 hidden layers.
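Here is a sketch of such a 4-layer Stacked LSTM; the layer width of 50 units and the (3, 1) input shape are arbitrary choices for illustration:

from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
# every layer except the last returns the full sequence (3D output)
model.add(LSTM(50, return_sequences=True, input_shape=(3, 1)))
model.add(LSTM(50, return_sequences=True))
model.add(LSTM(50, return_sequences=True))
# the final LSTM layer returns a single 2D output
model.add(LSTM(50))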
You need to add return_sequences=True to the first layer so that its output tensor has ndim=3 (i.e. batch size, timesteps, hidden state).
Please see the following example:
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
               input_shape=(timesteps, data_dim)))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32))  # returns a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
From: https://keras.io/getting-started/sequential-model-guide/ (search for "stacked lstm")