
sklearn: how to get coefficients of polynomial features

I know it is possible to obtain the polynomial features as numbers by using polynomial_features.transform(X). According to the manual, for a degree of two the features are: [1, a, b, a^2, ab, b^2]. But how do I obtain a description of the features for higher orders? .get_params() does not show any list of features.

Moritz asked Jul 08 '15 11:07



2 Answers

By the way, there is now a more appropriate function: PolynomialFeatures.get_feature_names (renamed get_feature_names_out in scikit-learn 1.0, with the old name removed in 1.2).

from sklearn.preprocessing import PolynomialFeatures
import pandas as pd
import numpy as np

data = pd.DataFrame.from_dict({
    'x': np.random.randint(low=1, high=10, size=5),
    'y': np.random.randint(low=-1, high=1, size=5),
})

p = PolynomialFeatures(degree=2).fit(data)
# In scikit-learn >= 1.0, use p.get_feature_names_out(data.columns) instead
print(p.get_feature_names(data.columns))

This will output as follows:

['1', 'x', 'y', 'x^2', 'x y', 'y^2']

N.B. You have to fit your PolynomialFeatures object before you can use get_feature_names(), because fitting is what determines the number of input features and therefore the list of output terms.
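For higher degrees the same call works unchanged. Here is a minimal sketch for degree 3, assuming two input features that I name "a" and "b" for illustration. The feature_names helper is hypothetical (not part of scikit-learn); it only picks whichever of the two method names your installed version provides.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])  # one sample, two features (values are arbitrary)
poly = PolynomialFeatures(degree=3).fit(X)

def feature_names(poly, input_names):
    """Hypothetical helper: use whichever naming method this version has."""
    # get_feature_names_out exists in scikit-learn >= 1.0;
    # older releases only provide get_feature_names.
    if hasattr(poly, "get_feature_names_out"):
        names = poly.get_feature_names_out(input_names)
    else:
        names = poly.get_feature_names(input_names)
    return [str(n) for n in names]

names = feature_names(poly, ["a", "b"])
print(names)
```

The names come back in the same column order as the output of transform(), so they can be used directly as column labels.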

If you are a pandas lover (as I am), you can easily build a DataFrame with all the new features like this:

features = pd.DataFrame(p.transform(data), columns=p.get_feature_names(data.columns))
print(features)

The result will look like this (the exact values vary, since the data is random):

     1    x    y   x^2  x y  y^2
0  1.0  8.0 -1.0  64.0 -8.0  1.0
1  1.0  9.0 -1.0  81.0 -9.0  1.0
2  1.0  1.0  0.0   1.0  0.0  0.0
3  1.0  6.0  0.0  36.0  0.0  0.0
4  1.0  5.0 -1.0  25.0 -5.0  1.0

prez answered Sep 25 '22 08:09


import numpy as np
from sklearn.preprocessing import PolynomialFeatures

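X = np.array([[2, 3]])  # one sample with two features; scikit-learn expects a 2D array

poly = PolynomialFeatures(3)
Y = poly.fit_transform(X)
print(Y)
# prints [[ 1  2  3  4  6  9  8 12 18 27]] (shown as floats in recent scikit-learn versions)
print(poly.powers_)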

This code will print:

[[0 0]
 [1 0]
 [0 1]
 [2 0]
 [1 1]
 [0 2]
 [3 0]
 [2 1]
 [1 2]
 [0 3]]

So if the i-th row of powers_ is (x, y), then the i-th output feature is Y[0][i] = (a**x)*(b**y), where a and b are the two input features. For instance, in the example above the row [2 1] corresponds to (2**2)*(3**1) = 12.
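To check this relationship for every column at once, the sketch below rebuilds the output from powers_ and also turns each exponent row into a readable term. The describe helper and the names "a" and "b" are my own illustration, not part of scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2, 3]])  # one sample with a=2, b=3
poly = PolynomialFeatures(3)
Y = poly.fit_transform(X)

# Rebuild each output column as the product of the inputs
# raised to the exponents listed in the matching row of powers_.
rebuilt = np.array([np.prod(X[0] ** row) for row in poly.powers_])
assert np.allclose(rebuilt, Y[0])

def describe(row, names=("a", "b")):
    """Hypothetical helper: format one exponent row, e.g. [2 1] -> 'a^2 b'."""
    parts = [n if e == 1 else f"{n}^{e}" for n, e in zip(names, row) if e > 0]
    return " ".join(parts) or "1"  # the all-zero row is the constant term

terms = [describe(row) for row in poly.powers_]
print(terms)
```

Since the rows of powers_ line up with the output columns, terms[i] describes Y[0][i].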

omerbp answered Sep 23 '22 08:09