 

How to use the Embedding Layer for Recurrent Neural Network (RNN) in Keras

I'm rather new to neural networks and the Keras library, and I'm wondering how I can use the Embedding layer as described here to transform my input data from a 2D tensor into the 3D tensor an RNN expects.

Say my time series data looks as follows (with time increasing):

X_train = [
   [1.0,2.0,3.0,4.0],
   [2.0,5.0,6.0,7.0],
   [3.0,8.0,9.0,10.0],
   [4.0,11.0,12.0,13.0],
   ...
] # with a length of 1000

Now, say I want to give the RNN the last 2 feature vectors in order to predict the feature vector for time t+1.

Currently (without the Embedding Layer), I am creating the required 3D tensor with shape (nb_samples, timesteps, input_dim) myself (as in this example here).

For my example, the final 3D tensor would then look as follows:

X_train_2 = [
  [[1.0,2.0,3.0,4.0],
   [2.0,5.0,6.0,7.0]],
  [[2.0,5.0,6.0,7.0],
   [3.0,8.0,9.0,10.0]],
  [[3.0,8.0,9.0,10.0],
   [4.0,11.0,12.0,13.0]],
  etc...
]

and Y_train:

Y_train = [
   [3.0,8.0,9.0,10.0],
   [4.0,11.0,12.0,13.0],
   etc...
]
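
For reference, here is a minimal sketch of how this sliding-window reshaping can be done by hand with NumPy (assuming X_train is converted to a NumPy array; the window size of 2 matches the example above):

import numpy as np

timesteps = 2  # number of past feature vectors fed to the RNN

X_train = np.array(X_train, dtype=np.float32)

# Stack overlapping windows of `timesteps` consecutive rows...
X_train_2 = np.array([X_train[i:i + timesteps]
                      for i in range(len(X_train) - timesteps)])
# ...and take the row following each window as the target.
Y_train = X_train[timesteps:]

print(X_train_2.shape)  # (998, 2, 4) for 1000 rows of 4 features
print(Y_train.shape)    # (998, 4)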

My model looks as follows (adapted to the simplified example above):

from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.layers.recurrent import SimpleRNN

num_of_vectors = 2
vect_dimension = 4
hidden_neurons = 50  # for example

model = Sequential()
model.add(SimpleRNN(hidden_neurons, return_sequences=False, input_shape=(num_of_vectors, vect_dimension)))
model.add(Dense(vect_dimension))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.fit(X_train_2, Y_train, batch_size=50, nb_epoch=10, validation_split=0.15)

And finally, my question: how can I avoid doing that 2D-to-3D tensor reshaping myself and use the Embedding layer instead? I guess after model = Sequential() I would have to add something like:

model.add(Embedding(?????))

Probably the answer is rather simple; I'm just confused by the documentation of the Embedding layer.

Kito asked Jan 29 '16

2 Answers

You can use it as follows:

Note:

  1. I generated some X and y as zeros just to give you an idea of the input structure.

  2. If you have a multi-class y_train, you will need to binarize it (one-hot encode the class labels; see the sketch after this list).

  3. You might need to add padding if you have sequences of varying length (also shown in the sketch below).

  4. If I understood correctly about predicting at time t+1, you might want to look at sequence-to-sequence learning.
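
For notes 2 and 3, here is a minimal sketch of the usual preprocessing, using the standard Keras helpers to_categorical and pad_sequences (the labels and sequences below are made-up examples):

import numpy as np
from keras.preprocessing.sequence import pad_sequences
from keras.utils.np_utils import to_categorical

# Binarize integer class labels into one-hot vectors (note 2).
labels = np.array([0, 2, 1])            # hypothetical class ids
y = to_categorical(labels, 3)           # shape (3, 3)

# Pad variable-length integer sequences to a common length (note 3).
sequences = [[1, 2, 3], [4, 5], [6]]    # hypothetical word-id sequences
X = pad_sequences(sequences, maxlen=3)  # shape (3, 3), zero-padded on the left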

Try something like:

import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import SimpleRNN

hidden_neurons = 4
nb_classes = 3
embedding_size = 10
vocab_size = 4  # number of distinct integer ids that can appear in X

# 128 samples, each a sequence of 4 integer ids; the Embedding layer
# expects integer indices, not floats.
X = np.zeros((128, 4), dtype=np.int32)
y = np.zeros((128, nb_classes), dtype=np.int8)

model = Sequential()
# Embedding maps the (nb_samples, sequence_length) integer input to a
# (nb_samples, sequence_length, embedding_size) 3D tensor for the RNN.
model.add(Embedding(vocab_size, embedding_size))
model.add(SimpleRNN(hidden_neurons, return_sequences=False))
model.add(Dense(nb_classes))
model.add(Activation("softmax"))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
model.fit(X, y, batch_size=1, nb_epoch=1)
nog answered Oct 23 '22

From what I know so far, the Embedding layer seems to be more or less for dimensionality reduction, as in word embeddings. So in that sense it does not seem applicable as a general reshaping tool.

Basically, if you have a mapping of words to integers like {car: 1, mouse: 2 ... zebra: 9999}, your input text would be a vector of words represented by their integer ids, like [1, 2, 9999 ...], which would mean [car, mouse, zebra ...]. But it seems to be efficient to map words to vectors of real numbers with the length of the vocabulary, so if your text has 1000 unique words, you would map each word to a vector of real numbers of length 1000. I am not sure, but I think it mostly represents a weight of how similar the meaning of a word is to every other word, though I am not sure that is right or whether there are other ways to embed words.
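
To make that concrete, here is a minimal sketch of the word-to-integer-to-vector pipeline; the vocabulary, ids, and dimensions are made up for illustration, and the lookup table stands in for what an Embedding layer learns:

import numpy as np

# Hypothetical word-to-integer mapping.
vocab = {"car": 1, "mouse": 2, "zebra": 3}

# A sentence becomes a vector of integer ids...
sentence = ["car", "mouse", "zebra"]
ids = [vocab[w] for w in sentence]  # [1, 2, 3]

# ...and an embedding matrix maps each id to a dense real-valued vector.
embedding_dim = 5
embedding_matrix = np.random.rand(len(vocab) + 1, embedding_dim)
vectors = embedding_matrix[ids]     # shape (3, 5)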

aocenas answered Oct 23 '22