Consider we have a data-matrix of data points and we are interested to map those data points into a higher dimensional feature space. We can do this by using d-degree polynomials. Thus for a sequence of data points the new data-matrix is
I have studied a relevant script (Andrew Ng. online course) that make such a transform for 2-dimensional data points to a higher feature space. However, I could not figure out a way to generalize in arbitrary higher dimensional samples, . Here is the code:
d = 6;
m = size(D,1);
new = ones(m);
for k = 1:d
for l = 0:k
new(:, end+1) = (x1.^(k-l)).*(x2.^l);
end
end
Can we vectorize this code? Also given a data-matrix could you please suggest a way on how we can transform data points of arbitrary dimension to a higher one using a d-dimensional polynomial?
PS: A generalization of d-dimensional data points would be very helpful.
This solution can handle k
variables and generate all the terms of a degree d
polynomial where k
and d
are non-negative integers. Most of the code length is due to the combinatoric complexity of generating all the terms of a degree d
polynomial in k
variables.
It takes an n_obs
by k
data matrix X
where n_obs
is the number of observations and k
is the number of variables.
This function generates all possible rows such that every entry is a non-negative integer and the row sums to a positive integer:
the row [0, 1, 3, 0, 1] corresponds to (x1^0)*(x1^1)*(x2^3)*(x4^0)*(x5^1)
The function (which almost certainly could be written more efficiently) is:
function result = mg_sums(n_numbers, d)
if(n_numbers<=1)
result = d;
else
result = zeros(0, n_numbers);
for(i = d:-1:0)
rc = mg_sums(n_numbers - 1, d - i);
result = [result; i * ones(size(rc,1), 1), rc];
end
end
n_obs = 1000; % number observations
n_vars = 3; % number of variables
max_degree = 4; % order of polynomial
X = rand(n_obs, n_vars); % generate random, strictly positive data
stacked = zeros(0, n_vars); %this will collect all the coefficients...
for(d = 1:max_degree) % for degree 1 polynomial to degree 'order'
stacked = [stacked; mg_sums(n_vars, d)];
end
newX = zeros(size(X,1), size(stacked,1));
for(i = 1:size(stacked,1))
accumulator = ones(n_obs, 1);
for(j = 1:n_vars)
accumulator = accumulator .* X(:,j).^stacked(i,j);
end
newX(:,i) = accumulator;
end
Use either method 1 or method 2.
-inf
doesn't propagate properly when you call the matrix algebra routines.)newX = real(exp(log(X) * stacked')); % multiplying log of data matrix by the
% matrix of all possible exponent combinations
% effectively raises terms to powers and multiplies them!
X = [2, 3, 5];
max_degree = 3;
The stacked matrix and the polynomial term it represents are:
1 0 0 x1 2
0 1 0 x2 3
0 0 1 x3 5
2 0 0 x1.^2 4
1 1 0 x1.*x2 6
1 0 1 x1.*x3 10
0 2 0 x2.^2 9
0 1 1 x2.*x3 15
0 0 2 x3.^2 25
3 0 0 x1.^3 8
2 1 0 x1.^2.*x2 12
2 0 1 x1.^2.*x3 20
1 2 0 x1.*x2.^2 18
1 1 1 x1.*x2.*x3 30
1 0 2 x1.*x3.^2 50
0 3 0 x2.^3 27
0 2 1 x2.^2.*x3 45
0 1 2 x2.*x3.^2 75
0 0 3 x3.^3 125
If data matrix X
is [2, 3, 5]
this correctly generates:
newX = [2, 3, 5, 4, 6, 10, 9, 15, 25, 8, 12, 20, 18, 30, 50, 27, 45, 75, 125];
Where the 1st column is x1
, 2nd is x2
, 3rd is x3
, 4th is x1.^2
, 5th is x1.*x2
etc...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With