I'm trying to understand how to replicate the poly() function in R using scikit-learn (or another module).
For example, let's say I have a vector in R:
a <- c(1:10)
And I want to generate a 3rd-degree polynomial:
polynomial <- poly(a, 3)
I get the following:
1 2 3
[1,] -0.49543369 0.52223297 -0.4534252
[2,] -0.38533732 0.17407766 0.1511417
[3,] -0.27524094 -0.08703883 0.3778543
[4,] -0.16514456 -0.26111648 0.3346710
[5,] -0.05504819 -0.34815531 0.1295501
[6,] 0.05504819 -0.34815531 -0.1295501
[7,] 0.16514456 -0.26111648 -0.3346710
[8,] 0.27524094 -0.08703883 -0.3778543
[9,] 0.38533732 0.17407766 -0.1511417
[10,] 0.49543369 0.52223297 0.4534252
I'm relatively new to Python and I'm trying to understand how to use the PolynomialFeatures
class in sklearn to replicate this. I've spent time looking at the examples in the PolynomialFeatures
documentation but I'm still a bit confused.
Any insight would be greatly appreciated. Thanks!
It turns out that you can replicate the result of R's poly(x, p)
function by performing a QR decomposition of a matrix whose columns are the powers of the input vector x,
from the 0th power (all ones) up to the pth power. The Q matrix, minus the first constant column, gives you
the result you want, up to the sign of each column (a QR decomposition is only unique up to those signs).
So, the following should work:
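import numpy as np

def poly(x, p):
    x = np.asarray(x, dtype=float)
    # Columns are x**0 (all ones), x**1, ..., x**p
    X = np.column_stack([x**k for k in range(p + 1)])
    # Keep Q from the QR decomposition and drop its constant first column
    return np.linalg.qr(X)[0][:, 1:]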
In particular:
In [29]: poly([1,2,3,4,5,6,7,8,9,10], 3)
Out[29]:
array([[-0.49543369, 0.52223297, 0.45342519],
[-0.38533732, 0.17407766, -0.15114173],
[-0.27524094, -0.08703883, -0.37785433],
[-0.16514456, -0.26111648, -0.33467098],
[-0.05504819, -0.34815531, -0.12955006],
[ 0.05504819, -0.34815531, 0.12955006],
[ 0.16514456, -0.26111648, 0.33467098],
[ 0.27524094, -0.08703883, 0.37785433],
[ 0.38533732, 0.17407766, 0.15114173],
[ 0.49543369, 0.52223297, -0.45342519]])
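Note that the third column above has the opposite sign to the R output in the question. Here is a minimal sketch of one way to line the signs up, assuming the convention that each column should be positive at the largest x (this matches the R output shown above, but I am treating it as an assumption, not as documented R behaviour). The helper name align_signs is made up for this example.

```python
import numpy as np

def poly(x, p):
    x = np.asarray(x, dtype=float)
    # Columns are x**0, x**1, ..., x**p
    X = np.vander(x, p + 1, increasing=True)
    # Q from the QR decomposition, minus the constant first column
    return np.linalg.qr(X)[0][:, 1:]

def align_signs(Z, x):
    # Hypothetical helper: flip each column so that its entry at the
    # largest x is positive (an assumed convention, see the text above).
    i = int(np.argmax(np.asarray(x)))
    signs = np.sign(Z[i])
    signs[signs == 0] = 1.0
    return Z * signs

x = np.arange(1, 11)
Z = align_signs(poly(x, 3), x)

# The columns are orthonormal and orthogonal to the constant vector.
gram = Z.T @ Z
col_sums = Z.sum(axis=0)
print(np.round(Z, 7))
```

Flipping the sign of a column does not change orthogonality, so for regression purposes either version spans the same space. Only the signs of the fitted coefficients change.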
The answer by K. A. Buhr is full and complete.
R's poly function can also compute interaction terms between several variables up to a given degree, which is why I was looking for an equivalent.
sklearn.preprocessing.PolynomialFeatures seems to provide this, and you can apply the np.linalg.qr(X)[0][:,1:]
step afterwards to get the orthogonal matrix.
Something like this:
import numpy as np
import pprint
import sklearn.preprocessing
PP = pprint.PrettyPrinter(indent=4)
MATRIX = np.array([[ 4, 2],[ 2, 3],[ 7, 4]])
poly = sklearn.preprocessing.PolynomialFeatures(2)
PP.pprint(MATRIX)
X = poly.fit_transform(MATRIX)
PP.pprint(X)
Results in:
array([[4, 2],
[2, 3],
[7, 4]])
array([[ 1., 4., 2., 16., 8., 4.],
[ 1., 2., 3., 4., 6., 9.],
[ 1., 7., 4., 49., 28., 16.]])
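To make the QR step concrete, here is a rough sketch that chains the two together. The data is made up for illustration. It uses eight rows because the 3-row matrix above has fewer rows than the six generated columns, so Q would only be 3 x 3 there. I have not checked the result against R's multi-argument poly, so treat the columns as an orthonormal basis for the same polynomial terms, not as a guaranteed column-for-column match.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical data: 8 observations of 2 variables
data = np.array(
    [[4, 2], [2, 3], [7, 4], [1, 5], [3, 1], [6, 6], [5, 2], [8, 7]],
    dtype=float,
)

# Columns: 1, a, b, a^2, a*b, b^2
X = PolynomialFeatures(degree=2).fit_transform(data)

# Orthonormalise the columns and drop the constant first column
Q, _ = np.linalg.qr(X)
Z = Q[:, 1:]

print(Z.shape)
print(np.round(Z.T @ Z, 10))  # should be close to the identity matrix
```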