Adding exogenous variables to my univariate LSTM model

My data frame is on an hourly basis (index of my df) and I want to predict y.

> df.head()

          Date           y             
    2019-10-03 00:00:00 343   
    2019-10-03 01:00:00 101  
    2019-10-03 02:00:00 70  
    2019-10-03 03:00:00 67  
    2019-10-03 04:00:00 122  

I will now import the libraries and train the model:

  import numpy as np
  from keras.models import Sequential
  from keras.layers import Dense
  from keras.layers import LSTM
  from sklearn.preprocessing import MinMaxScaler
  min_max_scaler = MinMaxScaler()
  prediction_hours = 24
  df_train= df[:len(df)-prediction_hours]
  df_test= df[len(df)-prediction_hours:]
  print(df_train.head())
  print('/////////////////////////////////////////')
  print (df_test.head())
  training_set = df_train.values
  training_set = min_max_scaler.fit_transform(training_set)

  x_train = training_set[0:len(training_set)-1]
  y_train = training_set[1:len(training_set)]
  x_train = np.reshape(x_train, (len(x_train), 1, 1))
  num_units = 2
  activation_function = 'sigmoid'
  optimizer = 'adam'
  loss_function = 'mean_squared_error'
  batch_size = 10
  num_epochs = 100
  regressor = Sequential()
  regressor.add(LSTM(units = num_units, activation = activation_function, input_shape=(None, 1)))
  regressor.add(Dense(units = 1))
  regressor.compile(optimizer = optimizer, loss = loss_function)
  regressor.fit(x_train, y_train, batch_size = batch_size, epochs = num_epochs)

And after training, I can actually use it on my test data:

 test_set = df_test.values
 inputs = np.reshape(test_set, (len(test_set), 1))
 inputs = min_max_scaler.transform(inputs)
 inputs = np.reshape(inputs, (len(inputs), 1, 1))
 predicted_y = regressor.predict(inputs)
 predicted_y = min_max_scaler.inverse_transform(predicted_y)

This is the prediction I got:

[Plot of the predicted values against the actual test data]
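
(For reference, a comparison plot like the one above can be produced with a short sketch along these lines; matplotlib and the label names are assumptions, not part of the original code.)

  import matplotlib.pyplot as plt

  actual_y = df_test.values                      # the untouched 24 test values of y
  plt.plot(df_test.index, actual_y, label='actual y')
  plt.plot(df_test.index, predicted_y, label='predicted y')
  plt.xlabel('Date')
  plt.ylabel('y')
  plt.legend()
  plt.show()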

The forecast is actually pretty good: is it too good to be true? Am I doing anything wrong? I followed a GitHub implementation step by step.

I want to add some exogenous variables, namely v1, v2, v3. If my dataset now looks like this with new variables,

df.head()

          Date           y   v1   v2   v3          
    2019-10-03 00:00:00 343  4     6    10  
    2019-10-03 01:00:00 101  3     2    24
    2019-10-03 02:00:00 70   0     0    50  
    2019-10-03 03:00:00 67   0     4    54
    2019-10-03 04:00:00 122  3     3    23

How can I include these variables v1, v2 and v3 in my LSTM model? The implementation of a multivariate LSTM is very confusing to me.

Edit, in response to Yoan's suggestion:

For a dataframe with the date as index and with the columns y, v1, v2 and v3, I've done the following as suggested:

  import numpy as np
  from keras.models import Sequential
  from keras.layers import Dense
  from keras.layers import LSTM
  from sklearn.preprocessing import MinMaxScaler
  min_max_scaler = MinMaxScaler()
  prediction_hours = 24
  df_train= df[:len(df)-prediction_hours]
  df_test= df[len(df)-prediction_hours:]
  print(df_train.head())
  print('/////////////////////////////////////////')
  print (df_test.head())
  training_set = df_train.values
  training_set = min_max_scaler.fit_transform(training_set)

  x_train = np.reshape(x_train, (len(x_train), 1, 4))
  y_train = training_set[0:len(training_set), 1]  # I've tried both 0:len(...) and 1:len(...)
  
  num_units = 2
  activation_function = 'sigmoid'
  optimizer = 'adam'
  loss_function = 'mean_squared_error'
  batch_size = 10
  num_epochs = 100
  regressor = Sequential()
  regressor.add(LSTM(units = num_units, activation = activation_function, input_shape=(None, 1, 4)))
  regressor.add(Dense(units = 1))
  regressor.compile(optimizer = optimizer, loss = loss_function)
  regressor.fit(x_train, y_train, batch_size = batch_size, epochs = num_epochs)

But I get the following error:

 only integer scalar arrays can be converted to a scalar index

Asked by Numbermind

2 Answers

Combining auxiliary features with sequences

There are multiple ways of handling auxiliary features with LSTMs, and the right choice depends on what your data contains and how you want to model these features. I'll discuss 4 different scenarios and strategies below, each with some dummy code for your reference.

  1. Scenario 1: If you have simple continuous features, simply pass them into an LSTM!
  2. Scenario 2: If you have multiple label-encoded sequences, embed and encode them separately in LSTMs, then concatenate the results for your downstream predictions
  3. If you have a label-encoded sequence and some auxiliary features, you can -
    • Scenario 3: Append the auxiliary features after embedding the sequence, then pass everything into the LSTMs
    • Scenario 4: Append them to the output of the LSTM and optionally pass the result to another set of LSTMs

Scenario 1:

Let's say you have 4 sequential features, all of which are continuous (not label-encoded, as text or categorical features would be). In this case, LSTMs are well equipped to handle these features directly. An LSTM layer expects a shape of (batch, sequence, features), so this scenario fits nicely without any modifications.

Features --> LSTM --> Process --> Predict

Code

import numpy as np
from tensorflow.keras import layers, Model, utils

#Four continuous features
X = np.random.random((100,10,4))
Y = np.random.random((100,))

###Define model###
inp = layers.Input((10,4))

#LSTMs
x = layers.LSTM(8, return_sequences=True)(inp)
x = layers.LSTM(8)(x)
out = layers.Dense(1)(x)

model = Model(inp, out)
utils.plot_model(model, show_layer_names=False, show_shapes=True)

[Model architecture diagram generated by plot_model]
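
For completeness, a minimal sketch of compiling and fitting this model on the dummy data above (the optimizer, loss and training settings here are assumptions, not prescribed by the answer):

#Train the Scenario 1 model on the dummy data (settings are illustrative)
model.compile(optimizer='adam', loss='mse')
model.fit(X, Y, epochs=5, batch_size=16, validation_split=0.1)

#Predict on new data with the same (sequence, features) layout
preds = model.predict(np.random.random((3,10,4)))
print(preds.shape)   # (3, 1)

The same compile/fit pattern applies to the models in the scenarios below.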

Scenario 2:

Next, let's assume another simple case. You have 2 label-encoded sequences (say, text). As one would expect, all you need to do is create sequential features separately by building LSTMs for each of them, and then concatenate the results before your downstream prediction task.

Sequence --> Embed --> LSTM -->|
                               * --> Append --> Process --> Predict
Sequence --> Embed --> LSTM -->|

Code

import numpy as np
from tensorflow.keras import layers, Model, utils

#Two sequential, label-encoded features
X = np.random.random((100,10,2))
Y = np.random.random((100,))

###Define model###
inp = layers.Input((10,2))
feature1 = layers.Lambda(lambda x: x[...,0])(inp)
feature2 = layers.Lambda(lambda x: x[...,1])(inp)

#Append embeddings features
x1 = layers.Embedding(1000, 5)(feature1)
x2 = layers.Embedding(1200, 7)(feature2)

#LSTMs
x1 = layers.LSTM(8, return_sequences=True)(x1)
x1 = layers.LSTM(8)(x1)

x2 = layers.LSTM(8, return_sequences=True)(x2)
x2 = layers.LSTM(8)(x2)

#Combine LSTM final states
x = layers.concatenate([x1,x2])
out = layers.Dense(1)(x)

model = Model(inp, out)
utils.plot_model(model, show_layer_names=False, show_shapes=True)

[Model architecture diagram generated by plot_model]

Scenario 3:

Next scenario: let's assume you are working with one feature which is a label-encoded sequence (say, text). Before you pass this feature to the LSTMs you will have to encode it into an n-dimensional vector using an embedding layer. This results in a (batch, sequence, embedding_dim) shaped input for the LSTMs, which is no problem at all. Let's say, however, you also have 3 auxiliary features which are continuous (and properly normalized). One simple thing you could do is just append these to the output of the Embedding layer to get a (batch, sequence, embedding_dim + auxiliary) input, which the LSTM can handle as well!

Sequence --> Embed ----->|
                         *--> Append --> LSTM -> Process --> Predict
Auxiliary --> Process -->|

Code

import numpy as np
from tensorflow.keras import layers, Model, utils

#One sequential, label-encoded feature & 3 auxiliary features for each timestep
X = np.random.random((100,10,4))
Y = np.random.random((100,))

###Define model###
inp = layers.Input((10,4))
feature1 = layers.Lambda(lambda x: x[...,0])(inp)
feature2 = layers.Lambda(lambda x: x[...,1:4])(inp)

#Append embeddings features
x = layers.Embedding(1000, 5)(feature1)
x = layers.concatenate([x, feature2])

#LSTMs
x = layers.LSTM(8, return_sequences=True)(x)
x = layers.LSTM(8)(x)
out = layers.Dense(1)(x)

model = Model(inp, out)
utils.plot_model(model, show_layer_names=False, show_shapes=True)

[Model architecture diagram generated by plot_model]

In the above example, after the label-encoded input is embedded into a 5-dimensional vector, the 3 auxiliary inputs are appended and the resulting (10, 8)-shaped sequence is passed to the LSTMs to do their magic.

Scenario 4:

Let's say you have the same scenario as above, but you want a richer representation of the sequential feature before you append the auxiliary inputs. Here you could pass the sequential feature through an LSTM first, append the auxiliary inputs to the OUTPUT of that LSTM, and then pass the result into another LSTM if needed. This requires return_sequences=True on the first LSTM so that it outputs a sequence of the same length, to which the auxiliary features for those time steps can be appended.

Sequence --> Embed --> LSTM(seq) -->|
                                    *--> Append --> Process --> Predict
Auxiliary --> Process ------------->|

Code

import numpy as np
from tensorflow.keras import layers, Model, utils

#One sequential, label-encoded feature and 3 auxiliary continuous features
X = np.random.random((100,10,4))
Y = np.random.random((100,))

###Define model###
inp = layers.Input((10,4))
feature1 = layers.Lambda(lambda x: x[...,0])(inp)
feature2 = layers.Lambda(lambda x: x[...,1:4])(inp)
#feature2 = layers.Reshape((-1,1))(feature2)

#Append embeddings features
x = layers.Embedding(1000, 5)(feature1)

#LSTMs
x = layers.LSTM(8, return_sequences=True)(x)
x = layers.concatenate([x, feature2])
x = layers.LSTM(8)(x)

#Combine LSTM final states
out = layers.Dense(1)(x)

model = Model(inp, out)
utils.plot_model(model, show_layer_names=False, show_shapes=True)

[Model architecture diagram generated by plot_model]

There are also architectures that append a single auxiliary feature to the output of an LSTM and encode the result again in another LSTM, then append the next feature, and so on, instead of appending all of them at once. That is a design choice and will have to be tested for your specific data; a rough sketch of the idea follows.
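
A minimal sketch of that iterative pattern, assuming one label-encoded sequence and two continuous auxiliary features (the aux1/aux2 split and the layer sizes are illustrative assumptions, not from the answer above):

import numpy as np
from tensorflow.keras import layers, Model

#One label-encoded sequence + 2 auxiliary features, appended one at a time
X = np.random.random((100,10,3))
Y = np.random.random((100,))

inp = layers.Input((10,3))
seq = layers.Lambda(lambda x: x[...,0])(inp)     #label-encoded sequence
aux1 = layers.Lambda(lambda x: x[...,1:2])(inp)  #first auxiliary feature
aux2 = layers.Lambda(lambda x: x[...,2:3])(inp)  #second auxiliary feature

#Embed the sequence and encode it
x = layers.Embedding(1000, 5)(seq)
x = layers.LSTM(8, return_sequences=True)(x)

#Append the first auxiliary feature and encode again
x = layers.concatenate([x, aux1])
x = layers.LSTM(8, return_sequences=True)(x)

#Append the second auxiliary feature and encode once more
x = layers.concatenate([x, aux2])
x = layers.LSTM(8)(x)

out = layers.Dense(1)(x)
model = Model(inp, out)
model.summary()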

Hope this clarifies your question.

Answered by Akshay Sehgal


Keras' default LSTM implementation expects an input shape of (batch, sequence, features).

So when reshaping x_train, instead of doing:

x_train = np.reshape(x_train, (len(x_train), 1, 1))

You simply have:

x_train = np.reshape(x_train, (len(x_train), 1, num_features))

It's not clear from your post whether you also want to predict these new features (multivariate prediction) or if you still want to predict only y.

In the first case you'll need to modify your Dense layer to account for the new dimension of the target:

regressor.add(Dense(units = num_features))

In the second case you'll need to slice y_train to take only y:

y_train = training_set[1:len(training_set),1] # (assuming Date is not the index)

Finally, your LSTM input shape must be updated to accept the newly reshaped x_train:

regressor.add(LSTM(units = num_units, activation = activation_function, input_shape=(1, num_features)))  # input_shape is (timesteps, features); the batch dimension is omitted
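
Putting these pieces together for the question's setup (predicting y only), a minimal end-to-end sketch could look like the following. It assumes Date is the DataFrame index, so that y is column 0 and num_features = 4 (y, v1, v2, v3); adjust the column index to your own frame.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM
from sklearn.preprocessing import MinMaxScaler

num_features = 4                                  # y, v1, v2, v3
min_max_scaler = MinMaxScaler()
training_set = min_max_scaler.fit_transform(df_train.values)   # shape (n, 4)

#Use all four features at time t to predict y at time t+1
x_train = training_set[:-1]                       # (n-1, 4)
y_train = training_set[1:, 0]                     # y assumed to be column 0
x_train = np.reshape(x_train, (len(x_train), 1, num_features)) # (n-1, 1, 4)

regressor = Sequential()
regressor.add(LSTM(units=2, activation='sigmoid', input_shape=(1, num_features)))
regressor.add(Dense(units=1))
regressor.compile(optimizer='adam', loss='mean_squared_error')
regressor.fit(x_train, y_train, batch_size=10, epochs=100)

Note that min_max_scaler was fitted on all four columns, so inverse_transform expects four columns; to map the predicted y back to its original scale it is easier to fit a separate scaler on the y column alone.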

Answered by Yoan B. M.Sc


