What is the meaning of multiple kernels in a Keras LSTM layer?

On https://keras.io/layers/recurrent/ I see that LSTM layers have a kernel and a recurrent_kernel. What is their meaning? In my understanding, we need weights for the 4 gates of an LSTM cell. However, in the Keras implementation, kernel has a shape of (input_dim, 4*units) and recurrent_kernel has a shape of (units, 4*units). So, are both of them somehow implementing the gates?
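For reference, these shapes can be inspected directly (a minimal sketch using TensorFlow's bundled Keras; the dimensions are arbitrary):

from tensorflow.keras.layers import LSTM

# arbitrary illustrative dimensions
units, input_dim, timesteps = 3, 5, 7

layer = LSTM(units)
layer.build(input_shape=(None, timesteps, input_dim))

kernel, recurrent_kernel, bias = layer.get_weights()
print(kernel.shape)            # (5, 12)  = (input_dim, 4 * units)
print(recurrent_kernel.shape)  # (3, 12)  = (units, 4 * units)
print(bias.shape)              # (12,)    = (4 * units,)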

Asked by KrawallKurt on Apr 17 '19

People also ask

Why are there multiple LSTM layers?

The main reason for stacking LSTM layers is to allow for greater model complexity. In a simple feedforward net, we stack layers to create a hierarchical feature representation of the input data, which is then used for some machine learning task. The same applies to stacked LSTMs.
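A minimal sketch of such a stack (layer sizes and the input shape are arbitrary); every LSTM layer except the last must pass its full output sequence on to the next:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    # return_sequences=True so the next LSTM receives a 3D
    # (batch, timesteps, features) input rather than just the last state
    LSTM(64, return_sequences=True, input_shape=(10, 8)),
    LSTM(32),
    Dense(1),
])
model.summary()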

What does number of units in LSTM mean?

The number of units is the number of neurons connected to the layer that holds the concatenated vector of the hidden state and the input. For example, an LSTM with 2 units has 2 neurons connected to that layer.
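The unit count also fixes the layer's parameter count. A quick sanity check (a sketch with arbitrary dimensions):

from tensorflow.keras.layers import LSTM

units, input_dim = 2, 4
layer = LSTM(units)
layer.build((None, None, input_dim))

# 4 gates, each with an input kernel, a recurrent kernel, and a bias
expected = 4 * units * (input_dim + units + 1)
print(layer.count_params(), expected)  # 56 56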

What does Keras LSTM layer do?

A Long Short-Term Memory network, or LSTM, is a variation of a recurrent neural network (RNN) that is quite effective at predicting long sequences of data, such as sentences or stock prices over a period of time. It differs from a normal feedforward network because there is a feedback loop in its architecture.

What is the output of LSTM layer in Keras?

The output is a 2D array of real numbers. The first dimension is the batch size; the second is the dimensionality of the output space, defined by the units parameter of the Keras LSTM layer.
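To illustrate (a minimal sketch; all shapes are arbitrary):

import numpy as np
from tensorflow.keras.layers import LSTM

x = np.random.rand(16, 10, 8).astype("float32")  # (batch, timesteps, features)
out = LSTM(32)(x)
print(out.shape)  # (16, 32) = (batch_size, units)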

How do you use LSTM in keras?

Building the LSTM in Keras: first, we add the Keras LSTM layer, and following this, we add dropout layers to guard against overfitting. For the LSTM layer, we use 50 units, which is the dimensionality of the output space. The return_sequences parameter is set to True so that the layer returns the full sequence of outputs rather than only the last one.
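A minimal sketch of such a model (the input shape of 60 timesteps with 1 feature is an arbitrary assumption):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

model = Sequential([
    LSTM(50, return_sequences=True, input_shape=(60, 1)),
    Dropout(0.2),  # guard against overfitting
    LSTM(50),
    Dropout(0.2),
    Dense(1),
])
model.compile(optimizer="adam", loss="mean_squared_error")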

What are layers in keras?

Keras layers API. Layers are the basic building blocks of neural networks in Keras. A layer consists of a tensor-in tensor-out computation function (the layer's call method) and some state, held in TensorFlow variables (the layer's weights ). A Layer instance is callable, much like a function:
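For instance (a small sketch using a Dense layer; sizes are arbitrary):

import tensorflow as tf
from tensorflow.keras.layers import Dense

layer = Dense(4, activation="relu")
x = tf.ones((2, 3))        # a batch of 2 vectors with 3 features each
y = layer(x)               # calling the layer like a function
print(y.shape)             # (2, 4)
print(len(layer.weights))  # 2: the kernel and bias, created on first call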

What is return_sequences=true in keras?

In Keras, you can instruct the layer to return the full sequence, instead of only the last timestep's output, by setting return_sequences=True. This is required when you are stacking multiple LSTM layers, because an LSTM layer expects a three-dimensional input tensor.
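A quick comparison of the two output shapes (a sketch with arbitrary dimensions):

import numpy as np
from tensorflow.keras.layers import LSTM

x = np.random.rand(4, 10, 8).astype("float32")   # (batch, timesteps, features)

print(LSTM(16)(x).shape)                         # (4, 16): last timestep only
print(LSTM(16, return_sequences=True)(x).shape)  # (4, 10, 16): one output per timestep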

Why is Keras the preferred deep learning framework?

This ease of creating neural networks is what makes Keras the deep learning framework preferred by many. Different types of Keras layers are available for different purposes when designing your neural network architecture.


1 Answer

Correct me if I'm wrong, but if you take a look at the LSTM equations:

i_t = sigmoid(W_i x_t + U_i h_{t-1} + b_i)
f_t = sigmoid(W_f x_t + U_f h_{t-1} + b_f)
c~_t = tanh(W_c x_t + U_c h_{t-1} + b_c)
o_t = sigmoid(W_o x_t + U_o h_{t-1} + b_o)
c_t = f_t * c_{t-1} + i_t * c~_t
h_t = o_t * tanh(c_t)

(* denotes element-wise multiplication)

You have 4 W matrices that transform the input and 4 U matrices that transform the hidden state.

Keras saves these sets of 4 matrices into the kernel and recurrent_kernel weight arrays. From the code that uses them:

# slice the input kernel into the four per-gate weight matrices
# (the W matrices), in input/forget/cell/output order
self.kernel_i = self.kernel[:, :self.units]
self.kernel_f = self.kernel[:, self.units: self.units * 2]
self.kernel_c = self.kernel[:, self.units * 2: self.units * 3]
self.kernel_o = self.kernel[:, self.units * 3:]

# same slicing for the recurrent kernel (the U matrices)
self.recurrent_kernel_i = self.recurrent_kernel[:, :self.units]
self.recurrent_kernel_f = self.recurrent_kernel[:, self.units: self.units * 2]
self.recurrent_kernel_c = self.recurrent_kernel[:, self.units * 2: self.units * 3]
self.recurrent_kernel_o = self.recurrent_kernel[:, self.units * 3:]

Apparently the 4 matrices are stored inside the weight arrays concatenated along the second dimension, which explains the weight array shapes.
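To make the layout concrete, here is a rough numpy sketch of a single LSTM step using that concatenated layout (random stand-in weights, not the actual Keras implementation):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

units, input_dim = 3, 5
rng = np.random.default_rng(0)

# stand-ins for trained weights, in Keras' concatenated layout
kernel = rng.standard_normal((input_dim, 4 * units))
recurrent_kernel = rng.standard_normal((units, 4 * units))
bias = np.zeros(4 * units)

x_t = rng.standard_normal(input_dim)  # input at this timestep
h_prev = np.zeros(units)              # previous hidden state
c_prev = np.zeros(units)              # previous cell state

# one matrix product per weight array covers all four gates at once
z = x_t @ kernel + h_prev @ recurrent_kernel + bias
z_i, z_f, z_c, z_o = np.split(z, 4)   # same i, f, c, o order as the slices above

i_t = sigmoid(z_i)
f_t = sigmoid(z_f)
c_t = f_t * c_prev + i_t * np.tanh(z_c)
o_t = sigmoid(z_o)
h_t = o_t * np.tanh(c_t)
print(h_t)  # the new hidden state, shape (units,)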

Answered by Stefan Dragnev on Nov 10 '22