
Multivariable/Multiple Linear Regression in Scikit Learn?

I have a dataset split across two .csv files (dataTrain.csv and dataTest.csv) in this format:

Temperature(K),Pressure(ATM),CompressibilityFactor(Z)
273.1,24.675,0.806677258
313.1,24.675,0.888394713
...,...,...

I am able to build a regression model and make predictions with this code:

import pandas as pd
from sklearn import linear_model

dataTrain = pd.read_csv("dataTrain.csv")
dataTest = pd.read_csv("dataTest.csv")
# print(dataTrain.head())

# scikit-learn expects a 2-D feature array, hence the reshape to one column
x_train = dataTrain['Temperature(K)'].to_numpy().reshape(-1, 1)
y_train = dataTrain['CompressibilityFactor(Z)']

x_test = dataTest['Temperature(K)'].to_numpy().reshape(-1, 1)
y_test = dataTest['CompressibilityFactor(Z)']

ols = linear_model.LinearRegression()
model = ols.fit(x_train, y_train)

print(model.predict(x_test)[0:5])

However, what I want to do is multivariable regression, so that the model is CompressibilityFactor(Z) = intercept + coef1*Temperature(K) + coef2*Pressure(ATM)

How can I do that in scikit-learn?

Drizzer Silverberg asked Feb 05 '17 18:02




1 Answer

If your code above works for the univariate case, select both feature columns instead of one:

import pandas as pd
from sklearn import linear_model

dataTrain = pd.read_csv("dataTrain.csv")
dataTest = pd.read_csv("dataTest.csv")
# print(dataTrain.head())

# Selecting two columns already gives a 2-D array of shape (n_samples, 2),
# so no reshape is needed
x_train = dataTrain[['Temperature(K)', 'Pressure(ATM)']].to_numpy()
y_train = dataTrain['CompressibilityFactor(Z)']

x_test = dataTest[['Temperature(K)', 'Pressure(ATM)']].to_numpy()
y_test = dataTest['CompressibilityFactor(Z)']

ols = linear_model.LinearRegression()
model = ols.fit(x_train, y_train)

print(model.predict(x_test)[0:5])
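
To check that a two-feature fit like this returns an intercept plus one coefficient per column, here is a minimal sketch on synthetic data. The generating rule (0.5, 0.002, -0.004) and the random inputs are made up for illustration and are not taken from the asker's files; only the column names match the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for dataTrain.csv (hypothetical values, not the asker's data).
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "Temperature(K)": rng.uniform(270.0, 400.0, size=50),
    "Pressure(ATM)": rng.uniform(1.0, 50.0, size=50),
})
# Build the target from a known linear rule so the fitted values can be checked.
data["CompressibilityFactor(Z)"] = (
    0.5 + 0.002 * data["Temperature(K)"] - 0.004 * data["Pressure(ATM)"]
)

features = ["Temperature(K)", "Pressure(ATM)"]
model = LinearRegression().fit(data[features], data["CompressibilityFactor(Z)"])

# model.coef_ holds one coefficient per feature, in the order of `features`.
print("intercept:", model.intercept_)
for name, coef in zip(features, model.coef_):
    print(name, coef)
```

Because the synthetic target is exactly linear, the printed intercept and coefficients should match the generating rule up to floating-point error. On real data they would be the least-squares estimates for the model written in the question.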
piRSquared answered Oct 14 '22 05:10