Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Simple prediction using linear regression with python

data2 = pd.DataFrame(data1['kwh'])
data2
                          kwh
date    
2012-04-12 14:56:50     1.256400
2012-04-12 15:11:55     1.430750
2012-04-12 15:27:01     1.369910
2012-04-12 15:42:06     1.359350
2012-04-12 15:57:10     1.305680
2012-04-12 16:12:10     1.287750
2012-04-12 16:27:14     1.245970
2012-04-12 16:42:19     1.282280
2012-04-12 16:57:24     1.365710
2012-04-12 17:12:28     1.320130
2012-04-12 17:27:33     1.354890
2012-04-12 17:42:37     1.343680
2012-04-12 17:57:41     1.314220
2012-04-12 18:12:44     1.311970
2012-04-12 18:27:46     1.338980
2012-04-12 18:42:51     1.357370
2012-04-12 18:57:54     1.328700
2012-04-12 19:12:58     1.308200
2012-04-12 19:28:01     1.341770
2012-04-12 19:43:04     1.278350
2012-04-12 19:58:07     1.253170
2012-04-12 20:13:10     1.420670
2012-04-12 20:28:15     1.292740
2012-04-12 20:43:15     1.322840
2012-04-12 20:58:18     1.247410
2012-04-12 21:13:20     0.568352
2012-04-12 21:28:22     0.317865
2012-04-12 21:43:24     0.233603
2012-04-12 21:58:27     0.229524
2012-04-12 22:13:29     0.236929
2012-04-12 22:28:34     0.233806
2012-04-12 22:43:38     0.235618
2012-04-12 22:58:43     0.229858
2012-04-12 23:13:43     0.235132
2012-04-12 23:28:46     0.231863
2012-04-12 23:43:55     0.237794
2012-04-12 23:59:00     0.229634
2012-04-13 00:14:02     0.234484
2012-04-13 00:29:05     0.234189
2012-04-13 00:44:09     0.237213
2012-04-13 00:59:09     0.230483
2012-04-13 01:14:10     0.234982
2012-04-13 01:29:11     0.237121
2012-04-13 01:44:16     0.230910
2012-04-13 01:59:22     0.238406
2012-04-13 02:14:21     0.250530
2012-04-13 02:29:24     0.283575
2012-04-13 02:44:24     0.302299
2012-04-13 02:59:25     0.322093
2012-04-13 03:14:30     0.327600
2012-04-13 03:29:31     0.324368
2012-04-13 03:44:31     0.301869
2012-04-13 03:59:42     0.322019
2012-04-13 04:14:43     0.325328
2012-04-13 04:29:43     0.306727
2012-04-13 04:44:46     0.299012
2012-04-13 04:59:47     0.303288
2012-04-13 05:14:48     0.326205
2012-04-13 05:29:49     0.344230
2012-04-13 05:44:50     0.353484
...

65701 rows × 1 columns

I have this dataframe with this index and 1 column.I want to do simple prediction using linear regression with sklearn.I'm very confused and I don't know how to set X and y(I want the x values to be the time and y values kwh...).I'm new to Python so every help is valuable.Thank you.

like image 437
Jimmys Avatar asked Apr 14 '15 08:04

Jimmys


People also ask

What is the class used in Python to create a simple linear regression?

Linear Regression From Scratch In this post, we'll use two Python modules: statsmodels — a module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests, and statistical data exploration.


2 Answers

The first thing you have to do is split your data into two arrays, X and y. Each element of X will be a date, and the corresponding element of y will be the associated kwh.

Once you have that, you will want to use sklearn.linear_model.LinearRegression to do the regression. The documentation is here.

As for every sklearn model, there is two step. First you must fit your data. Then, put the dates of which you want to predict the kwh in another array, X_predict, and predict the kwh using the predict method.

from sklearn.linear_model import LinearRegression

X = []  # put your dates in here
y = []  # put your kwh in here

model = LinearRegression()
model.fit(X, y)

X_predict = []  # put the dates of which you want to predict kwh here
y_predict = model.predict(X_predict)
like image 101
TheWalkingCube Avatar answered Sep 22 '22 14:09

TheWalkingCube


Predict() function takes 2 dimensional array as arguments. So, If u want to predict the value for simple linear regression, then you have to issue the prediction value within 2 dimentional array like,

model.predict([[2012-04-13 05:55:30]]);

If it is a multiple linear regression then,

model.predict([[2012-04-13 05:44:50,0.327433]])

like image 23
mohangraj Avatar answered Sep 20 '22 14:09

mohangraj