I am trying to implement an LSTM with Keras. I know that LSTMs in Keras require a 3D tensor with shape (nb_samples, timesteps, input_dim) as input. However, I am not entirely sure what the input should look like in my case, as I have just one sample of T observations for each input, not multiple samples, i.e. (nb_samples=1, timesteps=T, input_dim=N). Is it better to split each of my inputs into samples of length T/M? T is around a few million observations for me, so how long should each sample be, i.e., how would I choose M?
Also, am I right that this tensor should look something like:

[[[a_11, a_12, ..., a_1M], [a_21, a_22, ..., a_2M], ..., [a_N1, a_N2, ..., a_NM]],
 [[b_11, b_12, ..., b_1M], [b_21, b_22, ..., b_2M], ..., [b_N1, b_N2, ..., b_NM]],
 ...,
 [[x_11, x_12, ..., x_1M], [x_21, x_22, ..., x_2M], ..., [x_N1, x_N2, ..., x_NM]]]

where M and N are defined as before and x corresponds to the last sample that I would have obtained from the splitting discussed above?
Finally, given a pandas DataFrame with T observations in each column and N columns, one for each input, how can I create such an input to feed to Keras?
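For concreteness, the split described above amounts to reshaping a (T, N) array of observations into (T/M, M, N). A minimal numpy sketch (the sizes are made up, and it assumes T is divisible by M):

import numpy as np

# Toy sizes: T observations of N inputs, split into samples of length M
T, N, M = 12, 3, 4
data = np.arange(T * N, dtype=float).reshape(T, N)  # one row per timestep, one column per input
samples = data.reshape(T // M, M, N)                # (nb_samples, timesteps, input_dim)
print(samples.shape)  # (3, 4, 3)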
Below is an example that sets up time series data to train an LSTM. The model output is nonsense as I only set it up to demonstrate how to build the model.
import pandas as pd
import numpy as np

# Get some time series data
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/timeseries.csv")
df.head()
Time series dataframe:
         Date      A       B       C      D      E      F      G
0  2008-03-18  24.68  164.93  114.73  26.27  19.21  28.87  63.44
1  2008-03-19  24.18  164.89  114.75  26.22  19.07  27.76  59.98
2  2008-03-20  23.99  164.63  115.04  25.78  19.01  27.04  59.61
3  2008-03-25  24.14  163.92  114.85  27.41  19.61  27.84  59.41
4  2008-03-26  24.44  163.45  114.84  26.86  19.53  28.02  60.09
You can put the inputs into a single vector per row and then use the pandas .cumsum() function to build the sequence for the time series:
# Define which columns are inputs and which is the output
# (one possible choice: six input columns and one output column,
# consistent with the shapes printed further below)
input_cols = ['A', 'B', 'C', 'D', 'E', 'F']
output_cols = ['G']

# Put your inputs into a single list
df['single_input_vector'] = df[input_cols].apply(tuple, axis=1).apply(list)
# Double-encapsulate the list so that you can sum it in the next step
# and keep the time steps as separate elements
df['single_input_vector'] = df.single_input_vector.apply(lambda x: [list(x)])
# Use .cumsum() to include previous row vectors in the current row's list of vectors
df['cumulative_input_vectors'] = df.single_input_vector.cumsum()
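To see what the cumulative step produces, you can inspect the lengths of the first few rows; row i of cumulative_input_vectors holds the i + 1 input vectors seen so far:

print(df.cumulative_input_vectors.apply(len).head())  # lengths 1, 2, 3, 4, 5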
The output can be set up in a similar way, but it will be a single vector instead of a sequence:
# If your output is multi-dimensional, you need to capture those dimensions in one object
# If your output is a single dimension, this step may be unnecessary
df['output_vector'] = df[output_cols].apply(tuple, axis=1).apply(list)
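With output_cols = ['G'] as chosen above, each row's output vector is just that row's G value wrapped in a list:

print(df.output_vector.head())  # [63.44], [59.98], [59.61], [59.41], [60.09]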
The input sequences have to be the same length to run them through the model, so you need to pad them to the max length of your cumulative vectors:
# Pad your sequences so they are the same length
from keras.preprocessing.sequence import pad_sequences

max_sequence_length = df.cumulative_input_vectors.apply(len).max()
# Save it as a list
padded_sequences = pad_sequences(df.cumulative_input_vectors.tolist(), max_sequence_length).tolist()
df['padded_input_vectors'] = pd.Series(padded_sequences).apply(np.asarray)
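By default, pad_sequences pre-pads shorter sequences with zeros, so every row ends up with exactly max_sequence_length timesteps. A tiny standalone illustration (made-up vectors, not the dataframe above):

from keras.preprocessing.sequence import pad_sequences

# Two sequences of 2-dimensional vectors: one of length 1, one of length 2
print(pad_sequences([[[1, 2]], [[3, 4], [5, 6]]], maxlen=2))
# [[[0 0]
#   [1 2]]
#
#  [[3 4]
#   [5 6]]]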
Training data can be pulled from the dataframe and put into numpy arrays. Note that the input data that comes out of the dataframe is not a 3D array; it is an array of arrays, which is not the same thing. You can use hstack and reshape to build a 3D input array.
# Extract your training data
X_train_init = np.asarray(df.padded_input_vectors)
# Use hstack and reshape to make the inputs a 3D array
X_train = np.hstack(X_train_init).reshape(len(df), max_sequence_length, len(input_cols))
y_train = np.hstack(np.asarray(df.output_vector)).reshape(len(df), len(output_cols))
To prove it:
>>> print(X_train_init.shape)
(11,)
>>> print(X_train.shape)
(11, 11, 6)
>>> print(X_train == X_train_init)
False
Once you have the training data, you can define the dimensions of your input and output layers.
# Get your input dimensions
# Input length is the length of one input sequence (i.e. the number of rows in one sample)
# Input dim is the number of dimensions in one input vector (i.e. the number of input columns)
input_length = X_train.shape[1]
input_dim = X_train.shape[2]

# Output dim is the shape of a single output vector
# In this case it's just 1, but it could be more
output_dim = len(y_train[0])
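With the toy data above these come out as:

print(input_length, input_dim, output_dim)  # 11 6 1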
Build the model:
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Build the model
model = Sequential()
# I arbitrarily picked the output dimension of the LSTM layer as 4
model.add(LSTM(4, input_shape=(input_length, input_dim)))
# The max output value is > 1, so relu is used as the final activation
model.add(Dense(output_dim, activation='relu'))
model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['accuracy'])
Finally, you can train the model and save the training log as history:
# Set batch_size to 7 to show that it doesn't have to be a factor or multiple of your sample size
history = model.fit(X_train, y_train, batch_size=7, epochs=3, verbose=1)
Output:
Epoch 1/3
11/11 [==============================] - 0s - loss: 3498.5756 - acc: 0.0000e+00
Epoch 2/3
11/11 [==============================] - 0s - loss: 3498.5755 - acc: 0.0000e+00
Epoch 3/3
11/11 [==============================] - 0s - loss: 3498.5757 - acc: 0.0000e+00
That's it. Use model.predict(X), where X has the same format as X_train (other than the number of samples), to make predictions from the model.
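For example, predicting on the training inputs themselves (just to demonstrate the call, not a meaningful evaluation):

predictions = model.predict(X_train)
print(predictions.shape)  # (11, 1): one output value per sample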