
Why can my model not even predict a sine wave?

I am trying to generate a learned time series with an LSTM RNN using Keras. I want to predict a data point, feed it back in as input to predict the next one, and so on, so that I can actually generate the time series (for example, given 2000 data points, predict the next 2000). I tried it like this, but the test RMSE is 1.28 and the prediction is basically a straight line.

# LSTM for international airline passengers problem with regression framing
import numpy
import matplotlib.pyplot as plt
from pandas import read_csv
import math
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)
# fix random seed for reproducibility
numpy.random.seed(7)

# load the dataset
dataset = numpy.sin(numpy.linspace(0, 35, 10000)).reshape(-1, 1)
print(type(dataset))
print(dataset.shape)
dataset = dataset.astype('float32')

# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

# split into train and test sets
train_size = int(len(dataset) * 0.5)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]

# reshape into X=t and Y=t+1
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))

# create and fit the LSTM network
model = Sequential()
model.add(LSTM(16, input_shape=(1, look_back)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=10, batch_size=1, verbose=2)

# make predictions
trainPredict = model.predict(trainX)
testPredict = list()
prediction = model.predict(testX[0].reshape(1,1,1))
for i in range(testX.shape[0]):
    prediction = model.predict(prediction.reshape(1,1,1))
    testPredict.append(prediction)
testPredict = numpy.array(testPredict).reshape(-1, 1)

# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])

# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
# shift train predictions for plotting
trainPredictPlot = numpy.empty_like(dataset)
trainPredictPlot[:, :] = numpy.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict

# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()

What am I doing wrong?

asked Oct 18 '22 by Luca Thiede

1 Answer

I see multiple issues with your code. Your value for look_back is 1, which means the LSTM sees only one sample at a time, which is obviously not sufficient to learn anything about the sequence.

You probably did this so that you can make the final prediction at the end by feeding the prediction from the previous step back in as the new input. The correct way to make this work is to train with more timesteps and then change the network to a stateful LSTM with a single timestep.

Also, when you do the final prediction you have to show the network more than one ground-truth sample first. Otherwise the position on the sine is ambiguous (is it going up or down in the next step?).
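
To make the ambiguity concrete, here is a minimal sketch (my addition, not from the original answer): the sine takes the same value at points where it is rising and at points where it is falling, so the mapping from a single value to the next value is not well defined.

import numpy as np

t_up = np.arcsin(0.5)             # ~0.52 rad, sine is rising here
t_down = np.pi - np.arcsin(0.5)   # ~2.62 rad, sine is falling here
dt = 0.1

print(np.sin(t_up), np.sin(t_down))            # both 0.5
print(np.sin(t_up + dt), np.sin(t_down + dt))  # ~0.58 vs ~0.41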

I slapped together a quick example. Here is how I generated the data:

import numpy as np

numSamples = 1000
numTimesteps = 50
width = np.pi/2.0

# a sine segment of the given width, starting at a random phase
def getRandomSine(numSamples=100, width=np.pi):
    return np.sin(np.linspace(0, width, numSamples) + (np.random.rand()*np.pi*2))

trainX = np.stack([getRandomSine(numSamples=numTimesteps+1) for _ in range(numSamples)])
valX = np.stack([getRandomSine(numSamples=numTimesteps+1) for _ in range(numSamples)])

trainX = trainX.reshape((numSamples, numTimesteps+1, 1))
valX = valX.reshape((numSamples, numTimesteps+1, 1))

# targets are the inputs shifted by one timestep (next-step prediction)
trainY = trainX[:, 1:, :]
trainX = trainX[:, :-1, :]

valY = valX[:, 1:, :]
valX = valX[:, :-1, :]
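
As a quick sanity check (my addition, assuming the generation code above), the inputs and targets have the same shape and are offset by exactly one timestep:

print(trainX.shape, trainY.shape)  # (1000, 50, 1) (1000, 50, 1)
print(np.allclose(trainX[:, 1:, :], trainY[:, :-1, :]))  # True: Y is X shifted by one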

Here I trained the model:

from keras.models import Sequential
from keras import layers


model = Sequential()
# return_sequences=True makes each LSTM emit an output at every timestep
model.add(layers.LSTM(32, return_sequences=True, input_shape=(numTimesteps, 1)))
model.add(layers.LSTM(32, return_sequences=True))
# one predicted value per timestep
model.add(layers.TimeDistributed(layers.Dense(1)))
model.compile(loss='mean_squared_error',
              optimizer='adam')
model.summary()

model.fit(trainX, trainY, epochs=50, validation_data=(valX, valY), batch_size=32)

And here I changed the trained model to allow continuous prediction. Making the layers stateful means the LSTM keeps its hidden state between predict calls, so the network can be fed a single timestep at a time while still remembering the sequence so far:

# serialize the model and get its weights, for quick re-building
config = model.get_config()
weights = model.get_weights()

# make both LSTM layers stateful and fix the input shape to a single
# timestep (this indexing assumes the older Keras Sequential config,
# which is a list of per-layer configs)
config[0]['config']['batch_input_shape'] = (1, 1, 1)
config[0]['config']['stateful'] = True
config[1]['config']['stateful'] = True

new_model = Sequential.from_config(config)
new_model.set_weights(weights)
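
# optional sanity check (my addition, a sketch): fed the same sequence one
# timestep at a time, the stateful copy should reproduce the original model
seq = trainX[0:1]            # shape (1, numTimesteps, 1)
ref = model.predict(seq)     # all timesteps at once
new_model.reset_states()
stepwise = [new_model.predict(seq[:, i:i+1, :])[0, 0, 0]
            for i in range(numTimesteps)]
# ref[0, :, 0] and stepwise should agree up to floating-point tolerance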

# create test sine
testX = getRandomSine(numSamples = numTimesteps*10, width = width*10)

new_model.reset_states()
testPredictions = []
# burn in
for i in range(numTimesteps):
    prediction = new_model.predict(np.array([[[testX[i]]]]))
    testPredictions.append(prediction[0,0,0])

# prediction
for i in range(numTimesteps, len(testX)):
    prediction = new_model.predict(prediction)
    testPredictions.append(prediction[0,0,0])

# plot result
import matplotlib.pyplot as plt
plt.plot(np.stack([testPredictions,testX]).T)
plt.show()

Here is what the result looks like (the original answer includes a plot of the predictions against the input sine). The prediction errors add up, so it quickly diverges from the input sine, but it clearly learned the general shape of a sine. You can now try to improve on this by trying different layers, activation functions, and so on.

answered Oct 21 '22 by sietschie