I am trying to do some vanilla pattern recognition with an LSTM using Keras to predict the next element in a sequence.
My data are stored in a Sequence column; the label of each training sequence is the last element in the list, X_train['Sequence'][n][-1].
Because the Sequence column can hold a variable number of elements per sequence, I believe an RNN to be the best model to use. Below is my attempt to build an LSTM in Keras:
# Build the model
# A few arbitrary constants...
max_features = 20000
out_size = 128

# The max length should be the length of the longest sequence
# (minus one to account for the label)
max_length = X_train['Sequence'].apply(len).max() - 1

# Normal LSTM model construction with sigmoid activation
model = Sequential()
model.add(Embedding(max_features, out_size, input_length=max_length, dropout=0.2))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))

# try using different optimizers and different optimizer configs
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
And here's how I attempt to train my model:
# Train the model
for seq in X_train['Sequence']:
    print("Length of training is {0}".format(len(seq[:-1])))
    print("Training set is {0}".format(seq[:-1]))
    model.fit(np.array([seq[:-1]]), [seq[-1]])
My output is this:
Length of training is 13
Training set is [1, 3, 13, 87, 1053, 28576, 2141733, 508147108, 402135275365, 1073376057490373, 9700385489355970183, 298434346895322960005291, 31479360095907908092817694945]
However, I get the following error:
Exception: Error when checking model input: expected embedding_input_1 to have shape (None, 347) but got array with shape (1, 13)
I believe my training step is correctly set up, so my model construction must be wrong. Note that 347 is max_length.
How can I correctly build a variable-length input LSTM in Keras? I'd prefer not to pad the data. Not sure if it's relevant, but I'm using the Theano backend.
The first and simplest way of handling variable-length input is to choose a special mask value, pad every input out to a common length, and fill the extra entries with that mask value. Then add a Masking layer to the model, placed ahead of all downstream layers.
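A minimal sketch of that approach, assuming each sequence is a plain list of numbers and using 0 as the mask value (so 0 must not occur in the real data); the toy data and layer sizes here are arbitrary:

import numpy as np
from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense
from keras.preprocessing.sequence import pad_sequences

# Hypothetical toy data: variable-length sequences, one numeric target each
sequences = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
targets = np.array([4.0, 6.0, 10.0])

# Pad every sequence to the same length; padded positions get the mask value 0
max_len = max(len(s) for s in sequences)
X = pad_sequences(sequences, maxlen=max_len, padding='post', value=0, dtype='float32')
X = X.reshape(len(sequences), max_len, 1)  # (samples, timesteps, features)

model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(max_len, 1)))  # skip padded timesteps
model.add(LSTM(32))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(X, targets)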
The input to an LSTM layer has shape (num_timesteps, num_features). If each input sample has 69 timesteps and each timestep consists of a single feature value, the input shape would be (69, 1); if each timestep were instead a 768-dimensional vector, the input_size of the LSTM would need to be 768. The hidden_size does not depend on the input; it is the number of features the LSTM produces, which is used for the hidden state as well as the output, since the output is the last hidden state.
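As a concrete illustration of that shape convention (a sketch with arbitrary layer sizes, not tied to the question's data):

from keras.models import Sequential
from keras.layers import LSTM, Dense

# 69 timesteps, 1 feature per timestep -> input_shape=(69, 1)
model = Sequential()
model.add(LSTM(32, input_shape=(69, 1)))  # 32 is the hidden size, chosen freely
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')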
I am not clear about the embedding procedure, but here is a way to implement a variable-length input LSTM: simply do not specify the timespan dimension when building the model.
import numpy as np
import keras.backend as K
from keras.layers import LSTM, Input

# Unknown timespan, fixed feature size of 200
I = Input(shape=(None, 200))
lstm = LSTM(20)
f = K.function(inputs=[I], outputs=[lstm(I)])

data1 = np.random.random(size=(1, 100, 200))  # batch_size = 1, timespan = 100
print(f([data1])[0].shape)  # (1, 20)

data2 = np.random.random(size=(1, 314, 200))  # batch_size = 1, timespan = 314
print(f([data2])[0].shape)  # (1, 20)
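Applied to the question's setup, the same idea works for training as well. This is only a sketch: it assumes each timestep is a single number, uses mean squared error as a stand-in loss, and, because all samples within one batch must still share a timespan, feeds one sequence per batch (X_train['Sequence'] as in the question):

import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense

# Unknown number of timesteps, one feature per timestep
inp = Input(shape=(None, 1))
out = Dense(1)(LSTM(128)(inp))
model = Model(inp, out)
model.compile(loss='mse', optimizer='adam')

# Samples in a batch must share a timespan, so train one sequence at a time
for seq in X_train['Sequence']:
    x = np.array(seq[:-1], dtype='float32').reshape(1, -1, 1)  # (1, timesteps, 1)
    y = np.array([seq[-1]], dtype='float32')
    model.train_on_batch(x, y)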