
Many to many sequence prediction with different sequence lengths

My problem is to predict a sequence of values (t_0, t_1, ..., t_{n_post-1}) given the previous timesteps (t_{-n_pre}, t_{-n_pre+1}, ..., t_{-1}) with Keras' LSTM layer.

Keras supports the following two cases well:

  • n_post == 1 (many to one forecast)
  • n_post == n_pre (many to many forecast with equal sequence lengths)

But not the case where n_post < n_pre.

To illustrate what I need, I built a simple toy example using a sine wave.
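
For reference, here is a minimal sketch of how such a toy dataset could be built with NumPy. The sliding-window construction and the sine wave parameters are my own illustration (they are not spelled out in the question); the window lengths n_pre=50 and n_post=10 match the shapes used further below:

import numpy as np

n_pre, n_post = 50, 10  # input and output window lengths (illustrative)

# one long sine wave to slice training windows from
t = np.arange(0, 200, 0.1)
signal = np.sin(t)

dataX, dataY = [], []
for i in range(len(signal) - n_pre - n_post):
    dataX.append(signal[i : i + n_pre])                   # past n_pre steps
    dataY.append(signal[i + n_pre : i + n_pre + n_post])  # next n_post steps

# Keras LSTMs expect input of shape (nb_samples, nb_timesteps, nb_features)
dataX = np.array(dataX).reshape(-1, n_pre, 1)
dataY = np.array(dataY).reshape(-1, n_post, 1)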

Many to one model forecast

With the following model:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

hidden_neurons = 50  # number of LSTM units; illustrative, not fixed in the question

model = Sequential()
model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=False))  # keep only the last hidden state
model.add(Dense(1))  # one predicted value per sample
model.add(Activation('linear'))
model.compile(loss='mean_squared_error', optimizer='rmsprop')

the predictions look like this: [plot: sine wave, many to one LSTM forecast]

Many to many model forecast with n_pre == n_post

With n_pre == n_post, the network learns to fit the sine wave pretty well using a model like this:

from keras.layers import TimeDistributed

model = Sequential()
model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=True))  # return the state at every timestep
model.add(TimeDistributed(Dense(1)))  # apply the same Dense layer to each timestep
model.add(Activation('linear'))
model.compile(loss='mean_squared_error', optimizer='rmsprop')

[plot: sine wave, many to many LSTM forecast with n_post == n_pre]

Many to many model forecast with n_post < n_pre

But now, assume my data looks like this:

dataX (input):  (nb_samples, nb_timesteps, nb_features) -> (1000, 50, 1)
dataY (output): (nb_samples, nb_timesteps, nb_features) -> (1000, 10, 1)

After some research I found a way to handle these input sizes in Keras, using a model like this:

from keras.layers import RepeatVector

model = Sequential()
model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=False))
model.add(RepeatVector(10))  # repeat the last state 10 times -> (nb_samples, 10, hidden_neurons)
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear'))
model.compile(loss='mean_squared_error', optimizer='rmsprop')

But the predictions are really bad: [plot: many to many forecast with n_post < n_pre]

Now my questions are:

  • How can I build a model with n_post < n_pre that doesn't lose information by collapsing the sequence with return_sequences=False?
  • Training with n_post == n_pre and then cropping the output (after training) doesn't work for me, because the network would still try to fit all the timesteps, while only the first few can actually be predicted (the later ones are not well correlated with the input and would distort the result).
asked Mar 30 '17 by Isa

1 Answer

After asking this question on the Keras GitHub page, I got an answer, which I post here for completeness.

The solution is to add a second LSTM layer after a RepeatVector that stretches the encoded state to the desired number of output steps.

model = Sequential()
model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=False))  # encoder: compress the input sequence into one state
model.add(RepeatVector(10))  # repeat the encoding for each of the 10 output steps
model.add(LSTM(output_dim=hidden_neurons, return_sequences=True))  # decoder: unroll into an output sequence
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear'))
model.compile(loss='mean_squared_error', optimizer='rmsprop')
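
For completeness, fitting and predicting with this model could look like the sketch below (Keras 1.x-style arguments to match the code above; batch_size and nb_epoch are illustrative values, not from the original answer):

model.fit(dataX, dataY, batch_size=32, nb_epoch=100, validation_split=0.1)

predictions = model.predict(dataX)  # -> (nb_samples, 10, 1)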

The predictions look much better now: [plot: many to many forecast with n_post < n_pre, using the encoder-decoder model]

answered Nov 16 '22 by Isa