
Request for example: Recurrent neural network for predicting next value in a sequence

Can anyone give me a practical example of a recurrent neural network in (pybrain) Python in order to predict the next value of a sequence? (I've read the pybrain documentation and I don't think it contains a clear example of this.) I also found this question, but I fail to see how it works in a more general case. I'm therefore asking whether anyone here could work out a clear example of how to predict the next value of a sequence in pybrain, with a recurrent neural network.

To give an example.

Say for example we have a sequence of numbers in the range [1,7].

First run (So first example):  1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6
Second run (So second example): 1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6
Third run (So third example):  1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7
and so on.

Now given for example the start of a new sequence: 1 3 5 7 2 4 6 7 1 3

what is/are the next value(s)?

This question might seem lazy, but I think a good and decent example of how to do this with pybrain is lacking.


Additionally: how can this be done if more than one feature is present?

Example:

Say for example we have several sequences (each sequence having 2 features) in the range [1,7].

First run (So first example):
feature1: 1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6
feature2: 1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7

Second run (So second example):
feature1: 1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6
feature2: 1 2 3 7 2 3 4 6 2 3 5 6 7 2 4 7 1 3 3 5 6

Third run (So third example):
feature1: 1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7
feature2: 1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6

and so on.

Now, given for example the start of a new sequence:

feature1: 1 3 5 7 2 4 6 7 1 3
feature2: 1 2 3 7 2 3 4 6 2 4

what is/are the next value(s)?


Feel free to use your own example, as long as it is similar to these examples and comes with some in-depth explanation.

asked May 30 '13 by Olivier_s_j



2 Answers

Issam Laradji's answer worked for me for predicting a sequence of sequences, except that my version of pybrain required a tuple for the UnsupervisedDataSet object:

    from pybrain.tools.shortcuts import buildNetwork
    from pybrain.supervised.trainers import BackpropTrainer
    from pybrain.datasets import SupervisedDataSet, UnsupervisedDataSet
    from pybrain.structure import LinearLayer

    ds = SupervisedDataSet(21, 21)
    ds.addSample(map(int, '1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6'.split()),
                 map(int, '1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6'.split()))
    ds.addSample(map(int, '1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6'.split()),
                 map(int, '1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'.split()))

    net = buildNetwork(21, 20, 21, outclass=LinearLayer, bias=True, recurrent=True)
    trainer = BackpropTrainer(net, ds)
    trainer.trainEpochs(100)

    ts = UnsupervisedDataSet(21,)
    ts.addSample(map(int, '1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'.split()))

    [int(round(i)) for i in net.activateOnDataset(ts)[0]]

gives:

=> [1, 2, 5, 6, 2, 4, 5, 6, 1, 2, 5, 6, 7, 1, 4, 6, 1, 2, 2, 3, 6]

To predict smaller sequences, just train it that way, either as subsequences or as overlapping sequences (overlapping shown here):

    from pybrain.tools.shortcuts import buildNetwork
    from pybrain.supervised.trainers import BackpropTrainer
    from pybrain.datasets import SupervisedDataSet, UnsupervisedDataSet
    from pybrain.structure import LinearLayer

    ds = SupervisedDataSet(10, 11)
    z = map(int, '1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6 1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6 1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'.split())
    obsLen = 10
    predLen = 11

    # slide an overlapping window over the data: 10 observed values as the sample,
    # the 11 values starting one step later as the target
    for i in xrange(len(z)):
        if i + (obsLen - 1) + predLen < len(z):
            ds.addSample([z[d] for d in range(i, i + obsLen)],
                         [z[d] for d in range(i + 1, i + 1 + predLen)])

    net = buildNetwork(10, 20, 11, outclass=LinearLayer, bias=True, recurrent=True)
    trainer = BackpropTrainer(net, ds)
    trainer.trainEpochs(100)

    ts = UnsupervisedDataSet(10,)
    ts.addSample(map(int, '1 3 5 7 2 4 6 7 1 3'.split()))

    [int(round(i)) for i in net.activateOnDataset(ts)[0]]

gives:

=> [3, 5, 6, 2, 4, 5, 6, 1, 2, 5, 6]

Not too good...

answered Oct 04 '22 by wwwslinger


These steps are meant to perform what you ask for in the first part of the question.

1) Create a supervised dataset, which expects a sample and a target as its arguments:

    ds = SupervisedDataSet(21, 21)

    # add samples (this can be done automatically)
    ds.addSample(map(int, '1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6'.split()),
                 map(int, '1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6'.split()))
    ds.addSample(map(int, '1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6'.split()),
                 map(int, '1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'.split()))

Each succeeding sample is the target (label) y of its predecessor x. We use the number 21 because each sample has 21 numbers, or features.
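
For instance, here is a minimal sketch of adding the samples automatically, as the comment in the code above hints at (the list name runs is illustrative and not part of the original answer):

    # hypothetical helper: keep the training runs from the question as strings
    runs = [
        '1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6',
        '1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6',
        '1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7',
    ]
    ds = SupervisedDataSet(21, 21)
    # pair each run with the run that follows it: run i is the sample, run i+1 the target
    for current, following in zip(runs, runs[1:]):
        ds.addSample(map(int, current.split()), map(int, following.split()))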

Please note that, to follow standard notation in the second half of your question, it would be better to call feature1 and feature2 sample1 and sample2 of a sequence, and to let "features" denote the numbers within a sample.

2) Create the network, initialize the trainer, and run for 100 epochs

    net = buildNetwork(21, 20, 21, outclass=LinearLayer, bias=True, recurrent=True)
    trainer = BackpropTrainer(net, ds)
    trainer.trainEpochs(100)

Make sure to set the recurrent argument to True.

3) Create the test data

    ts = UnsupervisedDataSet(21, 21)

    # add the sample to be predicted
    ts.addSample(map(int, '1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'.split()))

We created an unsupervised dataset because of the assumption that we don't have the labels or targets.

4) Predict the test sample using the trained network

net.activateOnDataset(ts) 

This should display the values of the expected fourth run.
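
Since the LinearLayer output is continuous, you can round each predicted value to the nearest integer, as done in the answer above (a small illustrative snippet, not part of the original answer):

    prediction = net.activateOnDataset(ts)[0]
    print [int(round(v)) for v in prediction]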

For the second case, when a sequence can have more than one sample, instead of creating a supervised dataset, create a sequential one: ds = SequentialDataSet(21, 21). Then, every time you get a new sequence, call ds.newSequence() and add the samples (which you call features) to that sequence using ds.addSample(), as in the sketch below.
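
One possible reading of that advice, as a minimal sketch (pairing each sample inside a run with the following sample is my assumption; the original answer does not spell out how samples and targets are paired within a sequence):

    from pybrain.datasets import SequentialDataSet

    # one run = one sequence; each run here holds two 21-number samples
    # (the question's feature1 and feature2, taken from the first two runs)
    runs = [
        ['1 2 4 6 2 3 4 5 1 3 5 6 7 1 4 7 1 2 3 5 6',   # run 1, sample 1
         '1 3 5 7 2 4 6 7 1 3 5 6 7 1 4 6 1 2 2 3 7'],  # run 1, sample 2
        ['1 2 5 6 2 4 4 5 1 2 5 6 7 1 4 6 1 2 3 3 6',   # run 2, sample 1
         '1 2 3 7 2 3 4 6 2 3 5 6 7 2 4 7 1 3 3 5 6'],  # run 2, sample 2
    ]

    ds = SequentialDataSet(21, 21)
    for run in runs:
        ds.newSequence()                                  # start a new sequence per run
        samples = [map(int, s.split()) for s in run]
        # within a run, use each sample as input and the next sample as its target
        for inp, target in zip(samples, samples[1:]):
            ds.addSample(inp, target)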

Hope this is clear-cut :)

If you wish to have the full code to save the trouble of importing the libraries, please let me know.

answered Oct 04 '22 by IssamLaradji