I'm using Keras with the TensorFlow backend.
from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense, Activation

model = Sequential()
model.add(Masking(mask_value = 0., input_shape = (MAX_LENGTH, 1)))
model.add(LSTM(16, return_sequences = False))  # input shape is inferred from the Masking layer
model.add(Dense(units = 2))
model.add(Activation("sigmoid"))
model.compile(loss = "binary_crossentropy", optimizer = "adam", metrics = ["accuracy"])
This Python code works, but I wonder whether there are 16 LSTM blocks with 1 cell each, or 1 LSTM block with 16 cells.
Thanks in advance!
The weight matrix W contains separate weights for the current input vector and the previous hidden state, for each gate. Just like a plain recurrent neural network, an LSTM also generates an output at each time step, and this output is used to train the network via gradient descent.
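To make this concrete, here is a minimal NumPy sketch of a single LSTM time step. The names lstm_step, W, U, and b are mine, not from any library; the slicing assumes Keras' convention of concatenating the input (i), forget (f), cell-candidate (c), and output (o) gate weights along the last axis:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W: (input_dim, 4*units) input weights, U: (units, 4*units) recurrent
    # weights, b: (4*units,) biases -- one block of `units` columns per gate.
    units = h_prev.shape[-1]
    z = x_t @ W + h_prev @ U + b           # all four gates in one matmul
    i = sigmoid(z[..., 0*units:1*units])   # input gate
    f = sigmoid(z[..., 1*units:2*units])   # forget gate
    g = np.tanh(z[..., 2*units:3*units])   # candidate cell state
    o = sigmoid(z[..., 3*units:4*units])   # output gate
    c_t = f * c_prev + i * g               # new cell state
    h_t = o * np.tanh(c_t)                 # new hidden state / output
    return h_t, c_t

Running this step over the time axis of a sequence, carrying h_t and c_t forward, is exactly the unrolling that gradient descent backpropagates through.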
OK, so your question got me thinking and I think I overdid it, but here goes nothing. Here's a snippet of code I wrote to get some insight into the LSTM implementation.
from keras.layers import LSTM
from keras.models import Sequential

model = Sequential()
model.add(LSTM(10, input_shape=(20, 30), return_sequences=True))  # 10 units, 20 time steps, 30 features per step
model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])

weights = model.get_weights()
Now, by inspecting the weight shapes we can get some intuition about what's happening.
In [12]: weights[0].shape
Out[12]: (30, 40)
In [14]: weights[1].shape
Out[14]: (10, 40)
In [15]: weights[2].shape
Out[15]: (40,)
And here is a description of them:
In [26]: model.weights
Out[26]:
[<tf.Variable 'lstm_4/kernel:0' shape=(30, 40) dtype=float32_ref>,
<tf.Variable 'lstm_4/recurrent_kernel:0' shape=(10, 40) dtype=float32_ref>,
<tf.Variable 'lstm_4/bias:0' shape=(40,) dtype=float32_ref>]
Those are the only weights available. I also looked at the Keras implementation at https://github.com/keras-team/keras/blob/master/keras/layers/recurrent.py#L1765
So you can see that @gorjan was right: it implements one cell, meaning the four gates (for the recurrent input as well as the sequence input), along with their biases.
The "layer" thinking here should be applied to the number of times the LSTM will be unrolled, which equals the number of time steps: in this case 20.
Hope this helps.
It's 1 block with 16 cells, AFAIK.
When you are using cells (LSTM, GRU), you don't have the notion of layers per se. What you actually have is a cell that implements a few gates. Each of the gates is a separate weight matrix that the model will learn during training. For example, in your case, what you will have is 1 cell, where each of the gate matrices will have dimension (feature_size_of_your_input, 16). I suggest that you read http://colah.github.io/posts/2015-08-Understanding-LSTMs/ really carefully before you start implementing this kind of stuff. Otherwise, you are just using it as a black-box model without understanding what is happening under the hood.
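As a quick sanity check on the model from the question (MAX_LENGTH below is a hypothetical placeholder), the weight shapes confirm one cell with 16 units, with the four gate matrices concatenated along the last axis:

from keras.models import Sequential
from keras.layers import Masking, LSTM

MAX_LENGTH = 100  # hypothetical value, only needed to build the model

model = Sequential()
model.add(Masking(mask_value=0., input_shape=(MAX_LENGTH, 1)))
model.add(LSTM(16))

kernel, recurrent_kernel, bias = model.layers[1].get_weights()
print(kernel.shape)            # (1, 64):  input_dim=1, 4 gates * 16 units
print(recurrent_kernel.shape)  # (16, 64): 16 units,   4 gates * 16 units
print(bias.shape)              # (64,)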