On https://keras.io/layers/recurrent/ I see that LSTM layers have a kernel and a recurrent_kernel. What is their meaning? In my understanding, we need weights for the 4 gates of an LSTM cell. However, in the Keras implementation, kernel has a shape of (input_dim, 4*units) and recurrent_kernel has a shape of (units, 4*units). So, are both of them somehow implementing the gates?
The main reason for stacking LSTM layers is to allow for greater model complexity. In a simple feedforward net, we stack layers to create a hierarchical feature representation of the input data, which is then used for some machine learning task. The same applies to stacked LSTMs.
The number of units is the number of neurons connected to the layer that holds the concatenated vector of the hidden state and the input. If, for example, 2 neurons are connected to that concatenated layer, the LSTM has 2 units.
A Long Short-Term Memory network, or LSTM, is a variant of a recurrent neural network (RNN) that is quite effective at modeling long sequences of data, such as sentences or stock prices over a period of time. It differs from a normal feedforward network in that there is a feedback loop in its architecture.
The output (with the default return_sequences=False) is a 2D array of real numbers. The second dimension is the dimensionality of the output space, defined by the units parameter of the Keras LSTM implementation.
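A quick sketch of that shape rule, assuming an arbitrary batch of 4 sequences with 10 timesteps and 3 features, and units=2 to match the example above:

import tensorflow as tf

x = tf.random.uniform((4, 10, 3))      # (batch, timesteps, features)
y = tf.keras.layers.LSTM(units=2)(x)   # by default only the last timestep's output is returned
print(y.shape)                         # (4, 2): 2D, and the second dimension equals units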
Building the LSTM in Keras: first, we add the Keras LSTM layer, and following this, we add dropout layers to prevent overfitting. For the LSTM layer, we use 50 units, which is the dimensionality of the output space. The return_sequences parameter is set to True so that the layer returns the full sequence of outputs rather than only the last one.
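A minimal sketch of such a model, assuming tf.keras and an illustrative input of 60 timesteps with 1 feature per step (the dropout rate and the final Dense layer are also assumptions, not part of the original description):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(60, 1)),                     # 60 timesteps, 1 feature per step
    tf.keras.layers.LSTM(50, return_sequences=True),   # 50 units = dimensionality of the output space
    tf.keras.layers.Dropout(0.2),                      # dropout to guard against overfitting
    tf.keras.layers.LSTM(50),                          # final LSTM returns only the last timestep
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1),
])
model.summary()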
Keras layers API. Layers are the basic building blocks of neural networks in Keras. A layer consists of a tensor-in tensor-out computation function (the layer's call method) and some state, held in TensorFlow variables (the layer's weights). A Layer instance is callable, much like a function.
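For example, a small sketch with the standard tf.keras API (the Dense layer and the random input are just placeholders):

import tensorflow as tf

layer = tf.keras.layers.Dense(32, activation="relu")   # a Layer instance

x = tf.random.uniform((10, 20))            # batch of 10 vectors with 20 features
y = layer(x)                               # calling the layer runs its computation...
print(y.shape)                             # (10, 32)
print([w.shape for w in layer.weights])    # ...and creates its state: kernel (20, 32), bias (32,)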
In Keras, you can make a layer return the full sequence instead of only the last timestep's output by setting return_sequences=True. This is required when you are stacking multiple LSTM layers together, because an LSTM expects a three-dimensional input tensor.
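To illustrate, assuming a toy input of shape (batch=32, timesteps=10, features=8):

import tensorflow as tf

x = tf.random.uniform((32, 10, 8))                         # the 3D tensor an LSTM expects

seq = tf.keras.layers.LSTM(16, return_sequences=True)(x)   # full sequence: shape (32, 10, 16), still 3D
last = tf.keras.layers.LSTM(16)(seq)                       # can be stacked on seq because it is 3D; shape (32, 16)

print(seq.shape, last.shape)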
This ease of creating neural networks is what makes Keras the preferred deep learning framework for many. There are different types of Keras layers available for different purposes when designing your neural network architecture.
Correct me if I'm wrong, but if you take a look at the LSTM equations (written out below), you have 4 W matrices that transform the input and 4 U matrices that transform the hidden state.
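In a standard formulation (the notation this answer assumes, with σ the sigmoid and ∘ elementwise multiplication):

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \circ c_{t-1} + i_t \circ \tilde{c}_t \\
h_t &= o_t \circ \tanh(c_t)
\end{aligned}
$$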
Keras saves these sets of 4 matrices in the kernel and recurrent_kernel weight arrays. From the code that uses them:
# each gate's matrix occupies a block of `units` columns in the concatenated arrays
self.kernel_i = self.kernel[:, :self.units]                        # input gate
self.kernel_f = self.kernel[:, self.units: self.units * 2]         # forget gate
self.kernel_c = self.kernel[:, self.units * 2: self.units * 3]     # cell candidate
self.kernel_o = self.kernel[:, self.units * 3:]                    # output gate

self.recurrent_kernel_i = self.recurrent_kernel[:, :self.units]
self.recurrent_kernel_f = self.recurrent_kernel[:, self.units: self.units * 2]
self.recurrent_kernel_c = self.recurrent_kernel[:, self.units * 2: self.units * 3]
self.recurrent_kernel_o = self.recurrent_kernel[:, self.units * 3:]
Apparently the 4 matrices are stored inside the weight arrays concatenated along the second dimension, which explains the weight array shapes.
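You can check this on a freshly built layer (a small sketch; units=2 and input_dim=3 are arbitrary choices):

import tensorflow as tf

layer = tf.keras.layers.LSTM(units=2)
_ = layer(tf.random.uniform((1, 10, 3)))   # calling the layer builds it: input_dim = 3, units = 2

kernel, recurrent_kernel, bias = layer.get_weights()
print(kernel.shape)                        # (3, 8)  == (input_dim, 4 * units)
print(recurrent_kernel.shape)              # (2, 8)  == (units, 4 * units)
print(bias.shape)                          # (8,)    == (4 * units,)

kernel_i = kernel[:, :2]                   # the input-gate block, matching the slicing above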