
Time series forecasting with Encog 3 for Java: reading from CSV

I am developing a system for time series forecasting. I bought the Encog 3 book for Java, but I need to know how to load a CSV file with 3 columns and predict the second column. The CSV is defined as follows:

Date, DeviceConsumption, TotalPower

I need to load that file into a loader and then specify the column I want to predict (DeviceConsumption). The third column provides additional information that helps form the pattern.

In the examples (such as sunspot) I see:

TemporalMLDataSet result = new TemporalMLDataSet(windowSize,1);
TemporalDataDescription desc = new TemporalDataDescription(new ActivationSIN(),Type.RAW, false, true);

result.addDescription(desc);

But where can I define the column that I want to predict?

Thank you.

EDIT 2: I made a few improvements.

Sorry, but I still don't understand. I was able to create two TemporalDataDescription objects as you said, but do I have to add both to the same TemporalMLDataSet?

TemporalMLDataSet result = new TemporalMLDataSet(WINDOW_SIZE, 1);

// DeviceConsumption: both an input and the predicted value.
// Constructor arguments are (type, input, predict).
TemporalDataDescription desc = new TemporalDataDescription(
        TemporalDataDescription.Type.RAW, true, true);
result.addDescription(desc);

// TotalPower: input only, so predict is false.
TemporalDataDescription desc2 = new TemporalDataDescription(
        TemporalDataDescription.Type.RAW, true, false);
result.addDescription(desc2);

for (int year = TRAIN_START; year < TRAIN_END; year++) {
    // One point per row, with one value per description, in the order added.
    TemporalPoint point = new TemporalPoint(2);
    point.setSequence(year);
    point.setData(0, this.deviceConsumption[year]);
    point.setData(1, this.TotalPower[year]);
    result.getPoints().add(point);
}
result.generate();

Is this correct?

EDIT 3: The previous code was correct!

vincenzodentamaro asked Nov 02 '22 23:11


1 Answer

When using TemporalMLDataSet, you create one TemporalDataDescription object for each value that you want in the training set. For your data set you would have two TemporalDataDescription objects: one for DeviceConsumption and one for TotalPower. The two booleans at the end of the constructor (input, then predict) let you specify the predicted column. You would set DeviceConsumption to both input and output (true, true), and TotalPower to input only (true, false). The TemporalMLDataSet is not really aware of the Date column; you just set the sequence to a numerically increasing value, as the sunspots example does.
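
To get from the CSV to arrays that can be fed into TemporalPoint objects, something like the following plain-Java sketch could work. It is not part of Encog: the class name ConsumptionCsv, the parse method, and the sample rows are all hypothetical, and it assumes a header row, comma separators, and no quoted fields. The date cell is skipped and the row index serves as the increasing sequence number.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not an Encog class) for "Date, DeviceConsumption, TotalPower" rows. */
final class ConsumptionCsv {
    final double[] deviceConsumption;
    final double[] totalPower;

    private ConsumptionCsv(double[] deviceConsumption, double[] totalPower) {
        this.deviceConsumption = deviceConsumption;
        this.totalPower = totalPower;
    }

    /** Parses the lines of the file; the first line is assumed to be the header. */
    static ConsumptionCsv parse(List<String> lines) {
        List<double[]> rows = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {   // start at 1 to skip the header
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue;                          // tolerate blank lines
            }
            String[] cells = line.split(",");
            // cells[0] is the date; it is not parsed because row order gives the sequence.
            rows.add(new double[] {
                Double.parseDouble(cells[1].trim()),
                Double.parseDouble(cells[2].trim())
            });
        }
        double[] device = new double[rows.size()];
        double[] total = new double[rows.size()];
        for (int i = 0; i < rows.size(); i++) {
            device[i] = rows.get(i)[0];
            total[i] = rows.get(i)[1];
        }
        return new ConsumptionCsv(device, total);
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<String> lines = Arrays.asList(
            "Date, DeviceConsumption, TotalPower",
            "2013-01-01, 1.5, 10.0",
            "2013-01-02, 2.5, 12.0");
        ConsumptionCsv data = parse(lines);
        for (int seq = 0; seq < data.deviceConsumption.length; seq++) {
            // seq is what would be passed to point.setSequence(...)
            System.out.println(seq + " " + data.deviceConsumption[seq] + " " + data.totalPower[seq]);
        }
    }
}
```

For a real file, the lines could come from java.nio.file.Files.readAllLines; the two arrays then play the role of deviceConsumption and TotalPower in the question's loop.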

Columns can be marked as input, output, or both. Input columns are used to make the prediction; output columns are what you are trying to predict. A single column can be (and often is) both input and output. That is the case in the sunspots example, and it is also the case for DeviceConsumption in the data above.
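
To make the input/output distinction concrete, here is a minimal plain-Java sketch of how a sliding window could turn the two columns into training pairs, with a predict window of 1. It is only an illustration of the idea, not Encog's implementation: the WindowSketch and Pair names are hypothetical, and the interleaved layout of the input array is an assumption chosen for readability.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of temporal windowing; this is not Encog code. */
final class WindowSketch {

    /** One training pair: the flattened input window and the value to predict. */
    static final class Pair {
        final double[] input;
        final double ideal;

        Pair(double[] input, double ideal) {
            this.input = input;
            this.ideal = ideal;
        }
    }

    /**
     * Both columns feed the input window; only deviceConsumption is predicted.
     * The ideal value is the deviceConsumption reading right after the window.
     */
    static List<Pair> build(double[] deviceConsumption, double[] totalPower, int windowSize) {
        List<Pair> pairs = new ArrayList<>();
        for (int start = 0; start + windowSize < deviceConsumption.length; start++) {
            double[] input = new double[windowSize * 2];
            for (int t = 0; t < windowSize; t++) {
                input[2 * t] = deviceConsumption[start + t];   // input and output column
                input[2 * t + 1] = totalPower[start + t];      // input-only column
            }
            pairs.add(new Pair(input, deviceConsumption[start + windowSize]));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Made-up values, for illustration only.
        double[] device = {1, 2, 3, 4, 5};
        double[] total = {10, 20, 30, 40, 50};
        for (Pair p : build(device, total, 2)) {
            System.out.println(Arrays.toString(p.input) + " -> " + p.ideal);
        }
    }
}
```

TotalPower appears only on the input side of each pair, while DeviceConsumption appears on both sides, which is the effect of marking one column as input only and the other as input and output.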

JeffHeaton answered Nov 10 '22 03:11