Feature mapping using multi-variable polynomial

Question

Consider we have a data-matrix of $N$ data points $\mathbf{x} = [x_1, x_2]^T$ and we are interested to map those data points into a higher dimensional feature space. We can do this by using d-degree polynomials. Thus for a sequence of $N$ data points the new data-matrix is

$\begin{bmatrix} x_1 & x_2 & x_1^2 & x_1x_2 & x_2^2 & x_1^3 & x_1^2 x_2 & x_1 x_2^2 & x_2^3 & \dots\\[0.3em] x_1 & x_2 & x_1^2 & x_1x_2 & x_2^2 & x_1^3 & x_1^2 x_2 & x_1 x_2^2 & x_2^3 & \dots\\[0.3em] \vdots & \vdots & & & & & & & & \vdots \\[0.3em] x_1 & x_2 & x_1^2 & x_1x_2 & x_2^2 & x_1^3 & x_1^2 x_2 & x_1 x_2^2 & x_2^3 & \dots\\[0.3em] \end{bmatrix}\in \mathbf{R}^{N \times 2^{d} + 1 }$

I have studied a relevant script (Andrew Ng. online course) that make such a transform for 2-dimensional data points to a higher feature space. However, I could not figure out a way to generalize in arbitrary higher dimensional samples, $\mathbf{x} = [x_1, x_2, \dots, x_D]$ . Here is the code:

d = 6;
m = size(D,1); 
new = ones(m);
for k = 1:d
    for l = 0:k
        new(:, end+1) = (x1.^(k-l)).*(x2.^l);
    end
end

Can we vectorize this code? Also given a data-matrix $\mathbf{D} \in \mathbf{R}^{N \times 2}$ could you please suggest a way on how we can transform data points of arbitrary dimension to a higher one using a d-dimensional polynomial?

PS: A generalization of d-dimensional data points would be very helpful.

Matthew Gunn · Accepted Answer

This solution can handle k variables and generate all the terms of a degree d polynomial where k and d are non-negative integers. Most of the code length is due to the combinatoric complexity of generating all the terms of a degree d polynomial in k variables.

It takes an n_obs by k data matrix X where n_obs is the number of observations and k is the number of variables.

Helper function

This function generates all possible rows such that every entry is a non-negative integer and the row sums to a positive integer:

the row [0, 1, 3, 0, 1]  corresponds to (x1^0)*(x1^1)*(x2^3)*(x4^0)*(x5^1)

The function (which almost certainly could be written more efficiently) is:

function result = mg_sums(n_numbers, d)
if(n_numbers<=1)
    result = d;
else
    result = zeros(0, n_numbers);    
    for(i = d:-1:0)
        rc = mg_sums(n_numbers - 1, d - i);
        result = [result; i * ones(size(rc,1), 1), rc];
    end    
end

Initialization code

n_obs  = 1000;  % number observations
n_vars = 3;     % number of variables
max_degree  = 4;     % order of polynomial

X = rand(n_obs, n_vars);  % generate random, strictly positive data

stacked = zeros(0, n_vars); %this will collect all the coefficients...    
for(d = 1:max_degree)          % for degree 1 polynomial to degree 'order'
    stacked = [stacked; mg_sums(n_vars, d)];
end

Final Step: Method 1

newX = zeros(size(X,1), size(stacked,1));
for(i = 1:size(stacked,1))
    accumulator = ones(n_obs, 1);
    for(j = 1:n_vars)
        accumulator = accumulator .* X(:,j).^stacked(i,j);
    end
    newX(:,i) = accumulator;
end

Use either method 1 or method 2.

Final Step: Method 2 (requires all data in data matrix X is strictly positive (The problem is that if you have 0 elements, the `-inf` doesn't propagate properly when you call the matrix algebra routines.)

newX = real(exp(log(X) * stacked'));  % multiplying log of data matrix by the    
                                % matrix of all possible exponent combinations
                                % effectively raises terms to powers and multiplies them!

Example Run

X = [2, 3, 5];
max_degree = 3;

The stacked matrix and the polynomial term it represents are:

 1     0     0        x1           2
 0     1     0        x2           3 
 0     0     1        x3           5
 2     0     0        x1.^2        4
 1     1     0        x1.*x2       6
 1     0     1        x1.*x3       10
 0     2     0        x2.^2        9
 0     1     1        x2.*x3       15
 0     0     2        x3.^2        25
 3     0     0        x1.^3        8
 2     1     0        x1.^2.*x2    12
 2     0     1        x1.^2.*x3    20
 1     2     0        x1.*x2.^2    18
 1     1     1        x1.*x2.*x3   30
 1     0     2        x1.*x3.^2    50
 0     3     0        x2.^3        27
 0     2     1        x2.^2.*x3    45
 0     1     2        x2.*x3.^2    75
 0     0     3        x3.^3        125

If data matrix X is [2, 3, 5] this correctly generates:

newX = [2, 3, 5, 4, 6, 10, 9, 15, 25, 8, 12, 20, 18, 30, 50, 27, 45, 75, 125];

Where the 1st column is x1, 2nd is x2, 3rd is x3, 4th is x1.^2, 5th is x1.*x2 etc...

Feature mapping using multi-variable polynomial

Tags:

mapping

machine-learning

matlab

octave

linear-regression

Thoth

1 Answers

Helper function

Initialization code

Final Step: Method 1

Final Step: Method 2 (requires all data in data matrix X is strictly positive (The problem is that if you have 0 elements, the `-inf` doesn't propagate properly when you call the matrix algebra routines.)

Example Run

Matthew Gunn

Recent Activity

Donate For Us

Feature mapping using multi-variable polynomial

Tags:

mapping

machine-learning

matlab

octave

linear-regression

Thoth

1 Answers

Helper function

Initialization code

Final Step: Method 1

Final Step: Method 2 (requires all data in data matrix X is strictly positive (The problem is that if you have 0 elements, the -inf doesn't propagate properly when you call the matrix algebra routines.)

Example Run

Matthew Gunn

Related questions

Recent Activity

Donate For Us

Final Step: Method 2 (requires all data in data matrix X is strictly positive (The problem is that if you have 0 elements, the `-inf` doesn't propagate properly when you call the matrix algebra routines.)