
Polynomial regression using Python

From what I understand, polynomial regression is a specific type of regression analysis that is more complicated than linear regression. Is there a Python module that can do this? I have looked in matplotlib, scikit-learn and NumPy but can only find linear regression analysis.

Also, is it possible to work out the correlation coefficient of a non-linear fit?

astrochris asked Jul 14 '15 12:07


2 Answers

Have you had a look at NumPy's polyfit? See the NumPy reference for numpy.polyfit.

From their examples:

>>> import numpy as np
>>> x = np.array([0.0, 1.0, 2.0, 3.0,  4.0,  5.0])
>>> y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])
>>> z = np.polyfit(x, y, 3)
>>> z
array([ 0.08703704, -0.81349206,  1.69312169, -0.03968254])
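
On the correlation-coefficient part of the question: one common approach (a sketch, not the only option) is to report the coefficient of determination R^2 of the fit. For a least-squares polynomial fit that includes a constant term, R^2 equals the squared Pearson correlation between the observed and the fitted values. The variable names `y_hat`, `ss_res`, `ss_tot` and `r_squared` below are my own, chosen for illustration.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])

coeffs = np.polyfit(x, y, 3)    # cubic fit, highest power first
y_hat = np.polyval(coeffs, x)   # fitted values at the sample points

ss_res = np.sum((y - y_hat) ** 2)      # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)   # total sum of squares
r_squared = 1.0 - ss_res / ss_tot

# Pearson correlation between observed and fitted values
r = np.corrcoef(y, y_hat)[0, 1]
print(r_squared, r ** 2)
```

The two printed numbers should agree up to floating-point error. Keep in mind that R^2 never decreases as the degree increases, so on its own it is not a good guide for choosing the degree.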
adrianus answered Sep 18 '22 05:09


scikit-learn supports linear and polynomial regression.

See the section "Polynomial regression: extending linear models with basis functions" on the Generalized Linear Models page of the scikit-learn documentation.

Example:

>>> from sklearn.preprocessing import PolynomialFeatures
>>> import numpy as np
>>> X = np.arange(6).reshape(3, 2)
>>> X
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> poly = PolynomialFeatures(degree=2)
>>> poly.fit_transform(X)
array([[ 1,  0,  1,  0,  0,  1],
       [ 1,  2,  3,  4,  6,  9],
       [ 1,  4,  5, 16, 20, 25]])

The features of X have been transformed from [x_1, x_2] to [1, x_1, x_2, x_1^2, x_1 x_2, x_2^2], and can now be used within any linear model.
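
To make the column order concrete, here is a small check that builds the same six columns by hand with NumPy and compares them with the output of PolynomialFeatures. The names `manual` and `transformed` are just for illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(6).reshape(3, 2)
x1, x2 = X[:, 0], X[:, 1]

# Columns in the order [1, x_1, x_2, x_1^2, x_1 x_2, x_2^2]
manual = np.column_stack([np.ones(len(X)), x1, x2, x1 ** 2, x1 * x2, x2 ** 2])

transformed = PolynomialFeatures(degree=2).fit_transform(X)
print(np.allclose(manual, transformed))
```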

This sort of preprocessing can be streamlined with the Pipeline tools. A single object representing a simple polynomial regression can be created and used as follows:

>>> import numpy as np
>>> from sklearn.preprocessing import PolynomialFeatures
>>> from sklearn.linear_model import LinearRegression
>>> from sklearn.pipeline import Pipeline
>>> model = Pipeline([('poly', PolynomialFeatures(degree=3)),
...                   ('linear', LinearRegression(fit_intercept=False))])
>>> # fit to data generated from an order-3 polynomial
>>> x = np.arange(5)
>>> y = 3 - 2 * x + x ** 2 - x ** 3
>>> model = model.fit(x[:, np.newaxis], y)
>>> model.named_steps['linear'].coef_
array([ 3., -2.,  1., -1.])

The linear model trained on polynomial features is able to exactly recover the input polynomial coefficients.
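
One thing that can trip you up when comparing this with np.polyfit: the pipeline's coef_ lists the constant term first, while np.polyfit returns the highest power first. A minimal sketch of the comparison, plus a prediction at new points (`x_new`, `sk_coef` and `np_coef` are names I made up for this example):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(5)
y = 3 - 2 * x + x ** 2 - x ** 3

model = Pipeline([("poly", PolynomialFeatures(degree=3)),
                  ("linear", LinearRegression(fit_intercept=False))])
model.fit(x[:, np.newaxis], y)

sk_coef = model.named_steps["linear"].coef_   # constant term first
np_coef = np.polyfit(x, y, 3)                 # highest power first

# Reversing one of them lines the two results up
print(np.allclose(sk_coef, np_coef[::-1]))

# The fitted pipeline can also predict at inputs it has not seen
x_new = np.array([5, 6])
pred = model.predict(x_new[:, np.newaxis])
print(pred)
```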

In some cases it’s not necessary to include higher powers of any single feature, but only the so-called interaction features that multiply together at most d distinct features. These can be obtained from PolynomialFeatures with the setting interaction_only=True.
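
As a quick illustration using the same X as above: with degree=2 and interaction_only=True the squared columns are dropped and only [1, x_1, x_2, x_1 x_2] remain. The names `inter` and `X_inter` are only for this example.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(6).reshape(3, 2)

# Keep the bias column, the original features and their pairwise product only
inter = PolynomialFeatures(degree=2, interaction_only=True)
X_inter = inter.fit_transform(X)
print(X_inter)
```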

fferri answered Sep 19 '22 05:09