I am trying to do some vanilla pattern recognition with an LSTM using Keras to predict the next element in a sequence.
My data are stored in a Sequence column; the label of each training sequence is the last element in the list, X_train['Sequence'][n][-1].
Because the Sequence column can hold a variable number of elements per sequence, I believe an RNN to be the best model to use. Below is my attempt to build an LSTM in Keras:
# Build the model
# A few arbitrary constants...
max_features = 20000
out_size = 128

# The max length should be the length of the longest sequence
# (minus one to account for the label)
max_length = X_train['Sequence'].apply(len).max() - 1

# Normal LSTM model construction with sigmoid activation
model = Sequential()
model.add(Embedding(max_features, out_size, input_length=max_length, dropout=0.2))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))

# try using different optimizers and different optimizer configs
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
And here's how I attempt to train my model:
# Train the model
for seq in X_train['Sequence']:
    print("Length of training is {0}".format(len(seq[:-1])))
    print("Training set is {0}".format(seq[:-1]))
    model.fit(np.array([seq[:-1]]), [seq[-1]])
My output is this:
Length of training is 13
Training set is [1, 3, 13, 87, 1053, 28576, 2141733, 508147108, 402135275365, 1073376057490373, 9700385489355970183, 298434346895322960005291, 31479360095907908092817694945]
However, I get the following error:
Exception: Error when checking model input: expected embedding_input_1 to have shape (None, 347) but got array with shape (1, 13)
I believe my training step is correctly set up, so my model construction must be wrong. Note that 347 is max_length.
How can I correctly build a variable-length input LSTM in Keras? I'd prefer not to pad the data. Not sure if it's relevant, but I'm using the Theano backend.
The first and simplest way of handling variable-length input is to choose a special mask value, pad every input out to a common length, and fill the extra entries with that mask value. Then add a Masking layer to the model, placed ahead of all downstream layers.
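A minimal sketch of that approach, assuming each sequence is a plain list of numbers and using 0 as the mask value (so 0 must not occur in the real data); the toy data and layer sizes here are arbitrary:

import numpy as np
from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense
from keras.preprocessing.sequence import pad_sequences

# Hypothetical toy data: variable-length sequences, one numeric target each
sequences = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
targets = np.array([4.0, 6.0, 10.0])

# Pad every sequence to the same length; padded positions get the mask value 0
max_len = max(len(s) for s in sequences)
X = pad_sequences(sequences, maxlen=max_len, padding='post', value=0, dtype='float32')
X = X.reshape(len(sequences), max_len, 1)  # (samples, timesteps, features)

model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(max_len, 1)))  # skip padded timesteps
model.add(LSTM(32))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(X, targets)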
The input to an LSTM layer has shape (num_timesteps, num_features). If each input sample has 69 timesteps and each timestep consists of a single feature value, the input shape would be (69, 1); if each timestep were instead a 768-dimensional vector, the input_size of the LSTM would need to be 768. The hidden_size does not depend on the input; it is the number of features the LSTM produces, which is used for the hidden state as well as the output, since the output is the last hidden state.
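As a concrete illustration of that shape convention (a sketch with arbitrary layer sizes, not tied to the question's data):

from keras.models import Sequential
from keras.layers import LSTM, Dense

# 69 timesteps, 1 feature per timestep -> input_shape=(69, 1)
model = Sequential()
model.add(LSTM(32, input_shape=(69, 1)))  # 32 is the hidden size, chosen freely
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')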
I am not clear about the embedding procedure, but here is a way to implement a variable-length input LSTM: simply do not specify the timespan dimension when building the model.
import numpy as np
import keras.backend as K
from keras.layers import LSTM, Input

# Unknown timespan, fixed feature size of 200
I = Input(shape=(None, 200))
lstm = LSTM(20)
f = K.function(inputs=[I], outputs=[lstm(I)])

data1 = np.random.random(size=(1, 100, 200))  # batch_size = 1, timespan = 100
print(f([data1])[0].shape)  # (1, 20)

data2 = np.random.random(size=(1, 314, 200))  # batch_size = 1, timespan = 314
print(f([data2])[0].shape)  # (1, 20)
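Applied to the question's setup, the same idea works for training as well. This is only a sketch: it assumes each timestep is a single number, uses mean squared error as a stand-in loss, and, because all samples within one batch must still share a timespan, feeds one sequence per batch (X_train['Sequence'] as in the question):

import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense

# Unknown number of timesteps, one feature per timestep
inp = Input(shape=(None, 1))
out = Dense(1)(LSTM(128)(inp))
model = Model(inp, out)
model.compile(loss='mse', optimizer='adam')

# Samples in a batch must share a timespan, so train one sequence at a time
for seq in X_train['Sequence']:
    x = np.array(seq[:-1], dtype='float32').reshape(1, -1, 1)  # (1, timesteps, 1)
    y = np.array([seq[-1]], dtype='float32')
    model.train_on_batch(x, y)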