Keras LSTM: a time-series multi-step multi-features forecasting - poor results

Tags:

I have a time series dataset containing data from a whole year (date is the index). The data was measured every 15 min (during whole year) which results in 96 timesteps a day. The data is already normalized. The variables are correlated. All the variables except the VAR are weather measures.

VAR is seasonal in a day period and in a week period (as it looks a bit different on weekend, but more less the same every weekend). VAR values are stationary. I would like to predict values of VAR for next two days (192 steps ahead) and for next seven days (672 steps ahead).

Here is the sample of the dataset:

DateIdx               VAR       dewpt       hum         press       temp
2017-04-17 00:00:00   0.369397  0.155039    0.386792    0.196721    0.238889
2017-04-17 00:15:00   0.363214  0.147287    0.429245    0.196721    0.233333
2017-04-17 00:30:00   0.357032  0.139535    0.471698    0.196721    0.227778
2017-04-17 00:45:00   0.323029  0.127907    0.429245    0.204918    0.219444
2017-04-17 01:00:00   0.347759  0.116279    0.386792    0.213115    0.211111
2017-04-17 01:15:00   0.346213  0.127907    0.476415    0.204918    0.169444
2017-04-17 01:30:00   0.259660  0.139535    0.566038    0.196721    0.127778
2017-04-17 01:45:00   0.205564  0.073643    0.523585    0.172131    0.091667
2017-04-17 02:00:00   0.157650  0.007752    0.481132    0.147541    0.055556
2017-04-17 02:15:00   0.122101  0.003876    0.476415    0.122951    0.091667

Input dataset plot

I have decided to use LSTM in Keras. Having data from the whole year, I have used data from past 329 days as a training data and the rest for a validation during the training. train_X -> contains whole measures including VAR from 329 days train_Y -> contains only VAR from 329 days. The value is shifted one step ahead. The rest timesteps goes to test_X and test_Y.

Here is the code I prepare train_X and train_Y:

#X -> is the whole dataframe
#Y -> is a vector of VAR from whole dataframe, already shifted 1 step ahead

#329 * 96 = 31584
train_X = X[:31584]
train_X = train_X.reshape(train_X.shape[0],1,5)
train_Y = Y[:31584]  
train_Y = train_Y.reshape(train_Y.shape[0],1)

To predict next VAR value I would like to use past 672 timesteps (whole week measures). For this reason I have set batch_size=672, so that the ‘fit’ command look like this:

history = model.fit(train_X, train_Y, epochs=50, batch_size=672, validation_data=(test_X, test_Y), shuffle=False)

Here is the architecture of my network:

model = models.Sequential()
model.add(layers.LSTM(672, input_shape=(None, 672), return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(336, return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(168, return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(84, return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(21, return_sequences=False))
model.add(layers.Dense(1))
model.compile(loss='mae', optimizer='adam')
model.summary()

From the plot below we can see that the network has learn ‘something’ after 50 epochs:

Plot from the learning process

For the prediction purpose I have prepared a set of data containing last 672 steps with all values and 96 without VAR value – which should be predicted. I also used autoregression, so I updated VAR after each prediction and used it for next prediction.

The predX dataset (used for prediction) looks like this:

print(predX['VAR'][668:677])

DateIdx            VAR
2017-04-23 23:00:00    0.307573
2017-04-23 23:15:00    0.278207
2017-04-23 23:30:00    0.284390
2017-04-23 23:45:00    0.309118
2017-04-24 00:00:00         NaN
2017-04-24 00:15:00         NaN
2017-04-24 00:30:00         NaN
2017-04-24 00:45:00         NaN
2017-04-24 01:00:00         NaN
Name: VAR, dtype: float64

Here is the code (autoregression) I have used to predict next 96 steps:

stepsAhead = 96
historySteps = 672

for i in range(0,stepsAhead):
    j = i + historySteps
    ypred = model.predict(predX.values[i:j].reshape(1,historySteps,5))
    predX['VAR'][j] = ypred

Unfortunately the results are very poor and very far from the expectations:

A Plot with predicted data

Results combined with a previous day:

Predicted data combined with a previous day

Except from the ‘What have I done wrong‘ question, I would like to ask a few questions:

Q1. During model fifing, I have just put the whole history in batches of 672 size. Is it correct? How should I organize the dataset for the model fitting? What options do I have? Should I use the “sliding window” approach (like in the link here: https://machinelearningmastery.com/promise-recurrent-neural-networks-time-series-forecasting/ )?

Q2. Is the 50 epochs enough? What is the common practice here? Maybe the network is underfitted resulting in poor prediction? So far I tried 200 epoch with the same result.

Q3. Should I try a different architecture? Is the proposed network ‘big enough’ to handle such a data? Maybe a “stateful” network is the right approach here?

Q4. Did I implement the autoregression correctly? Is there any other approach to make a prediction for many steps ahead e.g. 192 or 672 like in my case?

622

asked Jun 07 '18 13:06

Bolesław Maliszewski

2 Answers

It looks like there is a confusion on how to organise the data to train a RNN. So let's cover the questions:

Once you have a 2D dataset (total_samples, 5) you can use the TimeseriesGenerator to create a sliding window what will generate (batch_size, past_timesteps, 5) for you. In this case, you will use .fit_generator to train the network.
If you get the same result, 50 epochs should be fine. You usually adjust based on the performance of your network. But you should keep it fixed if you are comparing two different network architectures.
Architecture is really large as you aim to predict all 672 future values at once. You can design the network so it learns to predict one measurement at a time. At prediction time you can predict one point and feed that again to predict the next until you get 672.
This ties into answer 3, you can learn to predict one at a time and chain the predictions to n number of predictions after training.

The single point prediction model could look like:

model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape=(past_timesteps, 5))
model.add(LSTM(64))
model.add(Dense(1))

152

answered Nov 16 '22 04:11

nuric

1) Batches are not the sequences. The input X is the sequence. The input should have the shape [None, sequence_length, number_of_features]. The 1st axis will be filled in by Keras with the batches. But they are not the sequences. The sequences are on the 2nd axis. The 3rd axis are the feature columns. Batch size 672 might be too large. You can try smaller values 128, 64, or 32.

2) Almost certain your network overfits. The network has too many LSTM layers. I would try just 2 layers of LSTM as @nuric suggested and see how it performs.

3) There also seems a confusion about the LSTM units (or LSTM size). It does not have to be 672. In fact, 672 is too large. A good starting point is 128.

4) The NN architecture is predicting a single value of VAR. In that case, make sure your Y have a single value for each sequence of X.

5) Alternatively you can make the last LSTM to output a sequence. In that case, each Y entry is a VAR sequence shifted one-step ahead. Going back to 4), make sure Y has the correct shape corresponding with that of X and the NN architecture.

6) You plot shows 50 epochs are enough for converging. Once you adjust X, Y, and the NN, do the same thing for watching the number of epochs.

7) Lastly an idea about the dates. If you want to include the dates in X, one idea is to one-hot encode them into week days. So your X would be [dewpt, hum, press, temp, MON, TUE, ..., SAT, SUN].

answered Nov 16 '22 03:11

neurite

Related questions
                            
                                Cookie authentication with Python requests
                            
                                Does PyCharm have autocomplete file path?
                            
                                how to convert a list into a pandas dataframe
                            
                                When reading huge HDF5 file with "pandas.read_hdf() ", why do I still get MemoryError even though I read in chunks by specifying chunksize?
                            
                                Python Invoke - Can't find any collection named 'tasks'!
                            
                                Django model subclassing approaches
                            
                                Changing time components of pandas datetime64 column
                            
                                How to create charts with Plotly on Django?
                            
                                Folium map not displaying
                            
                                Why can't pdb access a variable containing an exception?
                            
                                Running Flask with Gunicorn raises TypeError: index() takes 0 positional arguments but 2 were given
                            
                                Binary to String/Text in Python
                            
                                Heroku Scheduler With Python Script
                            
                                Weird repeated sequence printed to console when installing packages through conda
                            
                                Convert Pandas Dataframe to Float with commas and negative numbers
                            
                                How do I perform an UPDATE of existing rows of a db table using a Pandas DataFrame?
                            
                                How to mock a method return value of a class
                            
                                Python logger per function or per module
                            
                                Is 3-space indentation required in reST?
                            
                                DBSCAN with custom metric

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Keras LSTM: a time-series multi-step multi-features forecasting - poor results

Tags:

python

tensorflow

keras

lstm

time-series

Bolesław Maliszewski

People also ask

2 Answers

nuric

neurite

Recent Activity

Donate For Us