 

How to write an LSTM in Keras without an Embedding layer?

Tags: keras, theano

How do you write a simple sequence copy task in Keras using the LSTM architecture without an Embedding layer? I already have the word vectors.

asked May 11 '16 by user2879934

People also ask

Why do we need to embed LSTM?

An LSTM network is a type of recurrent neural network (RNN) that can learn long-term dependencies between time steps of sequence data. A word embedding layer maps a sequence of word indices to embedding vectors and learns the word embedding during training.

Why do we need embedding layer?

An Embedding layer converts each word into a fixed-length vector of a defined size. The resulting vector is dense, with real values rather than just 0s and 1s, so words are represented in fewer dimensions than a one-hot encoding would need.
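For illustration, a tiny sketch of what an Embedding layer does in Keras (the vocabulary size and dimensions are made up for the example):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

model = Sequential()
# Map a 1000-word vocabulary to 64-dimensional dense, trainable vectors.
model.add(Embedding(input_dim=1000, output_dim=64, input_length=10))

word_indices = np.random.randint(0, 1000, size=(1, 10))  # one sequence of 10 word ids
vectors = model.predict(word_indices)
print(vectors.shape)  # (1, 10, 64): each index became a 64-d real-valued vector
```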


2 Answers

Since you say you have word vectors, I assume you have a dictionary that maps each word to its vector representation (computed with word2vec, GloVe, ...).

Using this dictionary, you replace every word in your sequences with its corresponding vector. You will also need to make all your sequences the same length, since the LSTM expects all input sequences in a batch to have the same length. So determine a max_length value, trim all sequences that are longer, and pad all sequences that are shorter with zeros (see the Keras pad_sequences function); a sketch follows below.
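A minimal sketch of that preprocessing, assuming `word2vec` is your word-to-vector dictionary and `sentences` is your list of tokenized sequences (both names are placeholders); the padding is done manually here so the shapes stay explicit:

```python
import numpy as np

embedding_dim = 100   # dimensionality of your precomputed word vectors
max_length = 20       # chosen maximum sequence length

def vectorize(sentence, word2vec):
    # Replace each word by its precomputed vector; skip unknown words.
    return [word2vec[w] for w in sentence if w in word2vec]

vectorized = [vectorize(s, word2vec) for s in sentences]

# Trim sequences longer than max_length and zero-pad shorter ones,
# giving an array of shape (num_samples, max_length, embedding_dim).
X = np.zeros((len(vectorized), max_length, embedding_dim), dtype='float32')
for i, seq in enumerate(vectorized):
    trunc = seq[:max_length]
    X[i, :len(trunc)] = trunc
```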

Then you can pass the vectorized sequences directly to the LSTM layer of your neural network. Since the LSTM layer is the first layer of the network, you will need to define the input shape, which in your case is (max_length, embedding_dim).

This way you skip the Embedding layer and use your own precomputed word vectors instead, as the model sketch below shows.
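Here is a minimal model sketch for the copy task mentioned in the question, reusing max_length, embedding_dim, and X from above (the layer size and loss are illustrative choices, not prescribed by the answer):

```python
from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense

model = Sequential()
# No Embedding layer: the LSTM is the first layer and consumes the
# precomputed word vectors directly, so it needs an explicit input shape.
model.add(LSTM(128, return_sequences=True,
               input_shape=(max_length, embedding_dim)))
# Emit one vector per timestep so the network can learn to reproduce its input.
model.add(TimeDistributed(Dense(embedding_dim)))
model.compile(loss='mse', optimizer='adam')
model.fit(X, X, epochs=10, batch_size=32)  # copy task: target equals input
```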

answered Oct 05 '22 by Lorrit


I had the same problem. After searching the Keras documentation, I found the "Stacked LSTM for sequence classification" example, from which the following code might be useful:

```python
model = Sequential()
model.add(LSTM(NumberOfLSTMUnits, return_sequences=True,
               input_shape=(YourSequenceLength, YourWord2VecLength)))
```
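For context, a fuller runnable sketch built around that line; the placeholder names and sizes are illustrative, following the stacked-LSTM pattern from the Keras docs:

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense

NumberOfLSTMUnits = 32
YourSequenceLength = 20    # your max_length
YourWord2VecLength = 100   # dimensionality of your word vectors

model = Sequential()
model.add(LSTM(NumberOfLSTMUnits, return_sequences=True,
               input_shape=(YourSequenceLength, YourWord2VecLength)))
model.add(LSTM(NumberOfLSTMUnits))          # second, stacked LSTM layer
model.add(Dense(1, activation='sigmoid'))   # e.g. binary sequence classification
model.compile(loss='binary_crossentropy', optimizer='adam')
```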

answered Oct 05 '22 by hmfarimani