How to use deep learning models for time-series forecasting?

Question

I have signals recorded from machines (m1, m2, and so on) for 28 days. (Note: each machine's signal for each day has length 360.)

machine_num, day1, day2, ..., day28
m1, [12, 10, 5, 6, ...], [78, 85, 32, 12, ...], ..., [12, 12, 12, 12, ...]
m2, [2, 0, 5, 6, ...], [8, 5, 32, 12, ...], ..., [1, 1, 12, 12, ...]
...
m2000, [1, 1, 5, 6, ...], [79, 86, 3, 1, ...], ..., [1, 1, 12, 12, ...]

I want to predict the signal sequence of each machine for the next 3 days, i.e. day29, day30 and day31. However, I don't have values for days 29, 30 and 31. So my plan, using an LSTM model, was as follows.

The first step is to give the network the signals for day 1 and ask it to predict the signals for day 2; in the next step, give it the signals for days 1 and 2 and ask it to predict the signals for day 3, and so on. When I reach day 28, the network has all the signals up to day 28 and is asked to predict the signals for day 29, etc.

I tried a univariate LSTM model as follows.

# univariate lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
# define dataset
X = array([[10, 20, 30], [20, 30, 40], [30, 40, 50], [40, 50, 60]])
y = array([40, 50, 60, 70])
# reshape from [samples, timesteps] into [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(3, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=1000, verbose=0)
# demonstrate prediction
x_input = array([50, 60, 70])
x_input = x_input.reshape((1, 3, 1))
yhat = model.predict(x_input, verbose=0)
print(yhat)

However, this example is very simple since it does not have long sequences like mine. For example, my data for m1 would look as follows.

m1 = [[12, 10, 5, 6, ...], [78, 85, 32, 12, ...], ..., [12, 12, 12, 12, ...]]

Moreover, I need the predictions for days 29, 30 and 31. I am unsure how to change this example to suit my needs. I specifically want to know whether the direction I have chosen is correct and, if so, how to do it.

I am happy to provide more details if needed.

EDIT:

I have added the output of model.summary().

[Image: output of model.summary()]

Daniel Möller · Accepted Answer

Model and shapes

Since these are sequences in sequences, you need to use your data in a different format.

Although you could just use (machines, days, 360) and simply treat the 360 values as features (that could work up to a point), a more robust model (possibly a slower one) would need to treat both dimensions as sequences.

In that case I'd use data shaped like (machines, days, 360, 1) and two levels of recurrence.

Our model's input_shape would then be (None, 360, 1).
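As a minimal sketch of the two layouts, assuming the signals are already loaded into a NumPy array (the random data and the name `signals` are placeholders, not the asker's data):

```python
import numpy as np

# Placeholder data standing in for the real signals:
# 5 machines, 28 days, 360 readings per day.
rng = np.random.default_rng(0)
signals = rng.random((5, 28, 360))

# Layout for day recurrence only: each day is one step with 360 features.
data_case1 = signals                      # (machines, days, 360)

# Layout for two levels of recurrence: add a trailing feature axis so the
# 360 readings of a day form a sequence of 1-feature steps.
data_case2 = signals[..., np.newaxis]     # (machines, days, 360, 1)

print(data_case1.shape, data_case2.shape)
```

The `None` in the input shape stands for the number of days, so the same model can accept sequences with different numbers of days.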

Model case 1 - Only day recurrence

Data shape: (machines, days, 360)
Apply some normalization to the data.
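One possible normalization, sketched here under the assumption that min-max scaling per machine is appropriate for these signals (the function names and the per-machine choice are illustrative, not something the data dictates):

```python
import numpy as np

def minmax_per_machine(data, eps=1e-8):
    """Scale each machine's values to [0, 1] using that machine's own min and max."""
    axes = tuple(range(1, data.ndim))          # every axis except the machines axis
    lo = data.min(axis=axes, keepdims=True)
    hi = data.max(axis=axes, keepdims=True)
    scaled = (data - lo) / (hi - lo + eps)     # eps avoids division by zero
    return scaled, lo, hi

def undo_minmax(scaled, lo, hi, eps=1e-8):
    """Map scaled values (e.g. predictions) back to the original units."""
    return scaled * (hi - lo + eps) + lo

# Placeholder data: 5 machines, 28 days, 360 readings per day.
rng = np.random.default_rng(0)
signals = rng.integers(0, 100, size=(5, 28, 360)).astype(float)

scaled, lo, hi = minmax_per_machine(signals)
restored = undo_minmax(scaled, lo, hi)
```

With data scaled to [0, 1], a sigmoid output activation is one reasonable pairing; scaling to [-1, 1] would pair with tanh instead.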

Here is an example model; models are flexible, though, as you can add more layers, try convolutions, etc.:

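from keras.layers import Input, LSTM, Dense, Reshape
from keras.models import Model

# some_units, depends_on_training_approach and depends_on_your_normalization
# are placeholders to fill in
inputs = Input((None, 360)) #(m, d, 360)
outs = LSTM(some_units, return_sequences=False, 
            stateful=depends_on_training_approach)(inputs)  #(m, some_units)
outs = Dense(360, activation=depends_on_your_normalization)(outs) #(m, 360)
outs = Reshape((1,360))(outs) #(m, 1, 360) 
    #this reshape is not necessary if using the "shifted" approach - see time windows below
    #it would then be (m, d, 360)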

model = Model(inputs, outs)

Depending on the complexity of the intra-day sequences, this model may predict them well; if they evolve in a complex way, the next model should do a little better.

Always remember that you can add more layers and explore variations to increase the capability of this model; this is only a tiny example.

Model case 2 - Two level recurrence

Data shape: (machines, days, 360, 1)
Apply some normalization to the data.

There are many ways to experiment with this, but here is a simple one.

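from keras.layers import (Input, LSTM, Dense, Reshape, Lambda, TimeDistributed,
                          Bidirectional, Concatenate, Permute)
from keras.models import Model
from keras import backend as K

inputs = Input((None, 360, 1)) #(m, d, 360, 1)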

#branch 1
inner_average = TimeDistributed(
                    Bidirectional(
                        LSTM(units1, return_sequences=True, stateful=False),
                        merge_mode='ave'
                    )
                )(inputs) #(m, d, 360, units1)
inner_average = Lambda(lambda x: K.mean(x, axis=1))(inner_average) #(m, 360, units1)


#branch 2
inner_seq = TimeDistributed(
                LSTM(some_units, return_sequences=False, stateful=False)
            )(inputs) #may be Bidirectional too
            #shape (m, d, some_units)

outer_seq = LSTM(other_units, return_sequences = False, 
                 stateful=depends_on_training_approach)(inner_seq) #(m, other_units)

outer_seq = Dense(few_units * 360, activation = 'tanh')(outer_seq) #(m, few_units * 360)
    #activation = same as inner_average 


outer_seq = Reshape((360,few_units))(outer_seq) #(m, 360, few_units)


#join branches

outputs = Concatenate()([inner_average, outer_seq]) #(m, 360, units1+few_units)
outputs = LSTM(units, return_sequences=True, stateful= False)(outputs) #(m, 360,units)
outputs = Dense(1, activation=depends_on_your_normalization)(outputs) #(m, 360, 1)
outputs = Reshape((1,360))(outputs) #(m, 1, 360) for training purposes

model = Model(inputs, outputs)

This is one attempt. I averaged over the days, but instead of inner_average I could have used something like:

#branch 1
daily_minutes = Permute((2,1,3))(inputs) #(m, 360, d, 1)
daily_minutes = TimeDistributed(
                    LSTM(units1, return_sequences=False, 
                         stateful=depends_on_training_approach)
                )(daily_minutes) #(m, 360, units1)

Many other ways of exploring the data are possible; this is a highly creative field. You could, for instance, apply the daily_minutes approach right after the inner_average layers, leaving out the K.mean lambda layer... you get the idea.
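To make the axis bookkeeping in the two "branch 1" variants concrete, here is a small NumPy sketch of what the averaging and the Permute do to the shapes. The arrays are random stand-ins for layer outputs, and the sizes `m`, `d` and `units1` are arbitrary:

```python
import numpy as np

m, d, units1 = 4, 28, 8

# Stand-in for the output of the TimeDistributed(Bidirectional(LSTM)) layer.
inner = np.random.default_rng(0).random((m, d, 360, units1))   # (m, d, 360, units1)

# K.mean(x, axis=1) averages over the days axis.
inner_average = inner.mean(axis=1)                              # (m, 360, units1)

# Permute((2, 1, 3)) numbers axes from 1 and leaves the batch axis alone,
# so on the full array it corresponds to transposing with (0, 2, 1, 3).
inputs = np.zeros((m, d, 360, 1))
daily_minutes = np.transpose(inputs, (0, 2, 1, 3))              # (m, 360, d, 1)

print(inner_average.shape, daily_minutes.shape)
```

After the transpose, each of the 360 positions within a day carries its own sequence over the days, which is what the TimeDistributed LSTM in the daily_minutes variant then processes.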

Time windows approach

Your approach sounds nice: give one step to predict the next, give two steps to predict the third, give three steps to predict the fourth.

The models above are suited to this approach.

Keep in mind that very short inputs may be useless and may make your model worse. (Try to imagine how many steps would reasonably be enough for you to start predicting the next ones.)

Preprocess your data and divide it into groups:

group with length = 4 (for instance)
group with length = 5
...
group with length = 28

You will need a manual training loop where, in each epoch, you feed each of these groups (you can't feed different lengths all together).
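The grouping could be sketched as follows. This assumes each group uses the first L days as input and day L+1 as target (as in the question's plan); the helper name `make_length_groups` and the random data are made up for illustration:

```python
import numpy as np

def make_length_groups(data, min_len=4):
    """Return {length: (inputs, targets)} for every usable input length.

    data: array of shape (machines, days, 360).
    For a given length L, the inputs are days 1..L and the target is day L+1,
    kept with a step axis of size 1 to match an (m, 1, 360) model output.
    """
    days = data.shape[1]
    groups = {}
    for length in range(min_len, days):         # last usable length is days - 1
        x = data[:, :length]                    # (machines, length, 360)
        y = data[:, length:length + 1]          # (machines, 1, 360)
        groups[length] = (x, y)
    return groups

data = np.random.default_rng(0).random((5, 28, 360))
groups = make_length_groups(data, min_len=4)

# One pass over the groups; with a compiled Keras model this is where
# model.train_on_batch(x, y) would be called for each group.
for length, (x, y) in groups.items():
    print(length, x.shape, y.shape)
```

With 28 recorded days, the longest group that still has a known target has length 27; the full 28-day sequence has no target and is only useful as input when predicting day 29.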

Another approach is to give all the steps and make the model predict a shifted sequence, like:

inputs = original_inputs[:, :-1] #exclude last training day
outputs = original_inputs[:, 1:] #exclude first training day

To make the models above suited to this approach, you need return_sequences=True in every LSTM that uses the day dimension as steps (not the inner_seq). (The inner_average method will fail, and you will have to resort to the daily_minutes approach with return_sequences=True and another Permute((2,1,3)) right after.)

Shapes would be:

branch1 : (m, d, 360, units1)
branch2 : (m, d, 360, few_units) - needs to adjust the Reshape for this
- The reshapes using 1 timestep will be unnecessary, the days dimension will replace the 1.
- You may need to use Lambda layers to reshape considering the batch size and variable number of days (if details are needed, please tell me)
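For the simpler Model case 1, the shifted approach could look roughly like the sketch below. It assumes TensorFlow's bundled Keras is installed and that the data was scaled to [0, 1]; the 32 units, the single epoch and the random data are arbitrary placeholders, not tuned values:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder data: 8 machines, 28 days, 360 readings per day, in [0, 1].
rng = np.random.default_rng(0)
data = rng.random((8, 28, 360)).astype("float32")

x_train = data[:, :-1]   # days 1..27 -> (8, 27, 360)
y_train = data[:, 1:]    # days 2..28 -> (8, 27, 360)

inputs = keras.Input(shape=(None, 360))                 # (m, d, 360)
x = layers.LSTM(32, return_sequences=True)(inputs)      # (m, d, 32)
outputs = layers.Dense(360, activation="sigmoid")(x)    # (m, d, 360)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")

model.fit(x_train, y_train, epochs=1, batch_size=4, verbose=0)

# Each output step is the model's guess for the day after the matching
# input step, so the last step is the guess for the day after the last input.
pred = model.predict(data, verbose=0)    # (8, 28, 360)
day29 = pred[:, -1:]                     # (8, 1, 360)
```

Because the LSTM only looks backwards, output step t depends on input days up to t, so no Reshape to a single time step is needed here.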

Training and predicting

(Sorry for not having the time for detailing it now)

You can then follow the approaches mentioned here and here too (more complete, with a few links). (Take care with the output shapes, though: in your question we always keep the time step dimension, even though it may be 1.)

The important points are:

If you opt for stateful=False:
- this means easy training with fit (as long as you didn't use the "different lengths" approach);
- this also means you will need to build a new model with stateful=True and copy the weights of the trained model into it;
- then you do the manual step-by-step prediction.
If you opt for stateful=True from the beginning:
- this necessarily means a manual training loop (using train_on_batch, for instance);
- this necessarily means you will need model.reset_states() whenever you are going to present a batch whose sequences are not sequels of the last batch (every batch if your batches contain whole sequences);
- you don't need to build a new model to predict manually, but the manual prediction remains the same.
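The step-by-step prediction itself can be sketched independently of the stateful details. Note that this is a simpler substitute, not the stateful procedure described above: it re-feeds the whole growing sequence at every step, which works with a stateful=False model at the cost of recomputing earlier days. The names `forecast` and `repeat_last_day` are hypothetical, and the predictor here is a dummy so the sketch runs without a trained model:

```python
import numpy as np

def forecast(predict_next, history, n_steps=3):
    """Predict n_steps future days, feeding each prediction back as input.

    predict_next: callable mapping (machines, days, 360) -> (machines, 1, 360).
    history: array of shape (machines, days, 360) with the known days.
    """
    window = np.asarray(history)
    predictions = []
    for _ in range(n_steps):
        next_day = predict_next(window)                       # (machines, 1, 360)
        predictions.append(next_day)
        window = np.concatenate([window, next_day], axis=1)   # grow by one day
    return np.concatenate(predictions, axis=1)                # (machines, n_steps, 360)

# Stand-in predictor: it simply repeats the last known day.
def repeat_last_day(window):
    return window[:, -1:, :]

history = np.random.default_rng(0).random((5, 28, 360))
future = forecast(repeat_last_day, history, n_steps=3)        # days 29, 30, 31
print(future.shape)
```

With a trained model whose output has shape (m, 1, 360), `predict_next` could be something like `lambda w: model.predict(w, verbose=0)`. Keep in mind that errors accumulate, since days 30 and 31 are predicted from earlier predictions rather than from real data.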

Filipe Lauar · Answer

I think you are going in a good direction. To increase the number of time steps with each day, you will need to pad your data; this example can help you: https://github.com/keras-team/keras/blob/master/examples/imdb_lstm.py#L46.
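The linked example pads integer sequences; for day sequences of shape (days, 360), the same idea can be sketched in plain NumPy. The helper `pad_days`, the choice of left-padding and the padding value 0 are assumptions made for illustration:

```python
import numpy as np

def pad_days(sequences, max_days, value=0.0):
    """Left-pad variable-length day sequences to a common number of days.

    sequences: list of arrays, each of shape (days_i, 360) with days_i <= max_days.
    Returns an array of shape (len(sequences), max_days, 360).
    """
    n_features = sequences[0].shape[1]
    padded = np.full((len(sequences), max_days, n_features), value)
    for i, seq in enumerate(sequences):
        padded[i, max_days - len(seq):] = seq     # real days go at the end
    return padded

m1 = np.random.default_rng(0).random((28, 360))
prefixes = [m1[:k] for k in (1, 2, 3)]            # day 1, days 1-2, days 1-3
batch = pad_days(prefixes, max_days=3)
print(batch.shape)
```

Since 0 may also be a genuine signal value, it may be worth choosing a padding value that cannot occur in the data and putting a Keras Masking layer with that mask_value in front of the LSTM so the padded steps are skipped.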

However, I would also try other approaches, like fixing the number of time steps, for example 3 days, 4, 5... Then, by evaluating your training, you can choose how many time steps work best for your model.

Maybe your initial approach of increasing the number of days will be better, but in this type of problem, finding the best number of time steps for an LSTM is very important.
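A rough sketch of the fixed-window idea, assuming the data is an array of shape (machines, days, 360); the helper name `fixed_windows` and the random data are placeholders:

```python
import numpy as np

def fixed_windows(data, n_days):
    """Slide a window of n_days over each machine; the target is the following day.

    data: array of shape (machines, days, 360).
    Returns x of shape (samples, n_days, 360) and y of shape (samples, 360),
    where samples = machines * (days - n_days).
    """
    xs, ys = [], []
    for start in range(data.shape[1] - n_days):
        xs.append(data[:, start:start + n_days])   # n_days consecutive days
        ys.append(data[:, start + n_days])         # the day right after them
    return np.concatenate(xs, axis=0), np.concatenate(ys, axis=0)

data = np.random.default_rng(0).random((5, 28, 360))
for n_days in (3, 4, 5):
    x, y = fixed_windows(data, n_days)
    print(n_days, x.shape, y.shape)
```

Training one model per candidate window size and comparing the validation error is one way to pick the number of time steps. Shorter windows also yield more training samples from the same 28 days.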

Tags: python, deep-learning, lstm, time-series, forecasting
