Forecasting time series data with PyBrain Neural Networks

Problem

I am trying to use 5 years of consecutive, historical data to forecast values for the following year.

Data Structure

My input data input_04_08 looks like this, where the first column is the day of the year (1 to 365) and the second column is the recorded input:

1,2
2,2
3,0
4,0
5,0

My output data output_04_08 looks like this: a single column with the recorded output on that day of the year.

27.6
28.9
0
0
0

I then normalise the values between 0 and 1, so the first sample given to the network looks like this:

Number of training patterns:  1825
Input and output dimensions:  2 1
First sample (input, target):
[ 0.00273973  0.04      ] [ 0.02185273]
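The normalisation step itself is just a column-wise division by the maximum. A minimal numpy sketch, using hypothetical maxima (365 for the day column, 50 for the input, 1263 for the output) chosen so the first sample reproduces the values printed above:

```python
import numpy as np

# Toy stand-ins for the real CSV data; the maxima (365, 50, 1263) are
# hypothetical values chosen to reproduce the sample shown above.
inp = np.array([[1.0, 2.0], [2.0, 2.0], [3.0, 0.0], [365.0, 50.0]])
out = np.array([27.6, 28.9, 0.0, 1263.0])

# Divide each column by its maximum so every value lies in [0, 1].
inp_norm = inp / inp.max(axis=0)
out_norm = out / out.max()

print(inp_norm[0], out_norm[0])
```

One caveat with the code further down: dividing input_09 by its *own* maximum puts the test year on a different scale than the training years; reusing the training maxima for the test data is the more common choice.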

Approaches

Feed Forward Network

I have implemented the following code in PyBrain

import numpy
from pybrain.datasets import SupervisedDataSet
from pybrain.structure import FeedForwardNetwork, LinearLayer, TanhLayer, FullConnection
from pybrain.supervised.trainers import BackpropTrainer

input_04_08 = numpy.loadtxt('./data/input_04_08.csv', delimiter=',')
input_09 = numpy.loadtxt('./data/input_09.csv', delimiter=',')
output_04_08 = numpy.loadtxt('./data/output_04_08.csv', delimiter=',')
output_09 = numpy.loadtxt('./data/output_09.csv', delimiter=',')

input_04_08 = input_04_08 / input_04_08.max(axis=0)
input_09 = input_09 / input_09.max(axis=0)
output_04_08 = output_04_08 / output_04_08.max(axis=0)
output_09 = output_09 / output_09.max(axis=0)
ds = SupervisedDataSet(2, 1)

for x in range(0, 1825):
    ds.addSample(input_04_08[x], output_04_08[x])

n = FeedForwardNetwork()
inLayer = LinearLayer(2)
hiddenLayer = TanhLayer(25)
outLayer = LinearLayer(1)
n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addOutputModule(outLayer)
in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)
n.addConnection(in_to_hidden)
n.addConnection(hidden_to_out)
n.sortModules()

trainer = BackpropTrainer(n, ds, learningrate=0.01, momentum=0.1)

for epoch in range(0, 100000000):
    if epoch % 10000000 == 0:
        # train() performs one full pass over the dataset, so backprop
        # only actually runs on every 10,000,000th loop iteration
        error = trainer.train()
        print 'Epoch: ', epoch
        print 'Error: ', error


result = numpy.array([n.activate(x) for x in input_09])

and this gave me the following result, with a final error of 0.00153840123381.

[plot of the network's output on the 2009 test data]
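Note that the error reported by train() is measured on the training data. A held-out error for 2009 can also be computed directly from result and the normalised output_09; here is a numpy-only sketch in which predict and the toy arrays are hypothetical stand-ins for n.activate and the real data:

```python
import numpy as np

def predict(x):
    # Stand-in for n.activate(x): a dummy model that echoes the second feature.
    return np.array([x[1]])

# Toy normalised 2009 inputs and targets (hypothetical values).
input_09 = np.array([[1 / 365.0, 0.04], [2 / 365.0, 0.05]])
output_09 = np.array([0.02, 0.06])

result = np.array([predict(x) for x in input_09]).ravel()
test_mse = np.mean((result - output_09) ** 2)
print('held-out MSE:', test_mse)
```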

Admittedly, this looks good. However, having read more about LSTM (Long Short-Term Memory) neural networks and their applicability to time series data, I am trying to build one.

LSTM Network

Below is my code

import numpy
from pybrain.datasets import SequentialDataSet
from pybrain.structure import LSTMLayer
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.tools.shortcuts import buildNetwork

input_04_08 = numpy.loadtxt('./data/input_04_08.csv', delimiter=',')
input_09 = numpy.loadtxt('./data/input_09.csv', delimiter=',')
output_04_08 = numpy.loadtxt('./data/output_04_08.csv', delimiter=',')
output_09 = numpy.loadtxt('./data/output_09.csv', delimiter=',')

input_04_08 = input_04_08 / input_04_08.max(axis=0)
input_09 = input_09 / input_09.max(axis=0)
output_04_08 = output_04_08 / output_04_08.max(axis=0)
output_09 = output_09 / output_09.max(axis=0)
ds = SequentialDataSet(2, 1)

for x in range(0, 1825):
    # NOTE: calling newSequence() before every sample makes each sequence
    # a single step long, so the LSTM never sees a multi-step history
    ds.newSequence()
    ds.appendLinked(input_04_08[x], output_04_08[x])


fnn = buildNetwork( ds.indim, 25, ds.outdim, hiddenclass=LSTMLayer, bias=True, recurrent=True)
trainer = BackpropTrainer(fnn, ds, learningrate=0.01, momentum=0.1)

for epoch in range(0, 10000000): 
    if epoch % 100000 == 0:
        error = trainer.train()  
        print 'Epoch: ', epoch
        print 'Error: ', error

result = numpy.array([fnn.activate(x) for x in input_09])

This results in a final error of 0.000939719502501, but this time, when I feed the test data, the output plot looks terrible.

[plot of the LSTM's output on the 2009 test data]

Possible Problems

I have looked through pretty much all the PyBrain questions here; these stood out, but haven't helped me figure things out:

  • Training an LSTM neural network to forecast time series in pybrain, python
  • Time Series Prediction via Neural Networks
  • Time series forecasting (eventually with python)

I have read a few blog posts, and these helped further my understanding a bit, but obviously not enough:

  • Understanding LSTM Networks
  • Demystifying LSTM neural networks

Naturally, I have also gone through the PyBrain docs, but couldn't find much to help with the sequential dataset, bar here.

Any ideas/tips/direction would be welcome.

asked Dec 04 '15 by Philip O'Brien



1 Answer

I think what happened here is that you assigned hyperparameter values according to a rule of thumb that worked for the first case, but didn't for the second.

1) The error estimate you're looking at is an optimistic estimate of the prediction error, because it is computed on the training set. The actual prediction error is high, but since you didn't test your model on unseen data there's no way of knowing it. The Elements of Statistical Learning gives a nice description of this phenomenon; I would highly recommend the book. You can get it online for free.
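To illustrate the point with a toy, numpy-only example (unrelated to the asker's data): a degree-9 polynomial has 10 coefficients and can interpolate 10 noisy training points, so the training MSE collapses to essentially zero while the error on held-out points stays large.

```python
import numpy as np

rng = np.random.RandomState(0)

# Toy 1-D regression problem: a noisy sine wave.
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=10)
x_test = np.linspace(0.03, 0.97, 50)
y_test = np.sin(2 * np.pi * x_test)

# A degree-9 polynomial interpolates all 10 training points,
# so the training error is wildly optimistic.
coeffs = np.polyfit(x_train, y_train, 9)
train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
print('train MSE:', train_mse, ' test MSE:', test_mse)
```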

2) To get an estimator with low prediction error you need to perform hyperparameter tuning. E.g. the number of hidden nodes, the learning rate and the momentum should be varied and tested on unseen data to find the combination that leads to the lowest prediction error. scikit-learn has GridSearchCV and RandomizedSearchCV to do just that, but they only work on sklearn's estimators. You can roll your own estimator, though, which is described in the documentation. Personally, I think model selection and model evaluation are two different tasks. For the first you can just run a single GridSearchCV or RandomizedSearchCV and get a set of best hyperparameters for your task. For model evaluation you need a more sophisticated analysis, such as nested cross-validation, or even repeated nested cross-validation if you want an even more accurate estimate.
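The same search can be hand-rolled when sklearn's estimator interface isn't available, as with PyBrain. A minimal numpy sketch that grid-searches one hypothetical hyperparameter (a polynomial degree) against a validation split:

```python
import numpy as np

rng = np.random.RandomState(1)

# Toy data: quadratic trend plus noise, split into train / validation halves.
x = np.sort(rng.uniform(0, 1, 60))
y = 1.0 + 2.0 * x - 3.0 * x ** 2 + rng.normal(scale=0.05, size=60)
x_tr, y_tr = x[::2], y[::2]      # even indices -> training set
x_va, y_va = x[1::2], y[1::2]    # odd indices  -> validation set

# Try each value of the hyperparameter and keep the one with the lowest
# validation error -- the same idea GridSearchCV automates for sklearn.
results = {}
for deg in [1, 2, 5, 9]:
    coeffs = np.polyfit(x_tr, y_tr, deg)
    results[deg] = np.mean((np.polyval(coeffs, x_va) - y_va) ** 2)

best_deg = min(results, key=results.get)
print('best degree:', best_deg, 'validation MSE:', results[best_deg])
```

The underfit degree-1 model loses on validation error, which is exactly the signal the tuning loop uses.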

3) I don't know much about LSTM networks, but I see that in the first example you assigned 25 hidden nodes, but for LSTM you only provide 5. Maybe it's not enough to learn the pattern. You could also drop the output bias as done in the example.

P.S. I think this question actually belongs on http://stats.stackexchange.com, where you're likely to get a more detailed answer to your problem.

EDIT: I just noticed that you're training the model for 10 million epochs! I think that's a lot, and probably part of the overfitting problem. It's a good idea to implement early stopping, i.e. stop training once some predefined validation error is reached.
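A patience-based early-stopping loop might look like the sketch below (numpy-only, with simulated per-epoch validation errors standing in for a real trainer and validation pass; all names are hypothetical):

```python
import numpy as np

# Simulated per-epoch validation errors: improving for 20 epochs,
# then degrading as the model starts to overfit (hypothetical numbers).
val_errors = np.concatenate([np.linspace(1.0, 0.1, 20),
                             np.linspace(0.1, 0.5, 20)])

def train_one_epoch(epoch):
    # Stand-in for trainer.train() followed by a validation pass.
    return val_errors[epoch]

# Patience-based early stopping: quit once the validation error has
# not improved for `patience` consecutive epochs.
patience = 5
best_err, best_epoch, wait = float('inf'), 0, 0
for epoch in range(len(val_errors)):
    err = train_one_epoch(epoch)
    if err < best_err:
        best_err, best_epoch, wait = err, epoch, 0
    else:
        wait += 1
        if wait >= patience:
            break
print('stopped at epoch', epoch, '; best epoch', best_epoch, 'error', best_err)
```

If I recall correctly, PyBrain's BackpropTrainer also offers a trainUntilConvergence() helper that holds out a fraction of the dataset for validation and stops in a similar way.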

answered Oct 20 '22 by abudis