
Principal Component Analysis in MATLAB

I'm implementing PCA using eigenvalue decomposition for sparse data. I know MATLAB already has PCA implemented, but writing the code myself helps me understand the technical details. I've been following the guidance from here, but my results differ from those of the built-in function princomp.

Could anybody look at it and point me in the right direction?

Here's the code:

function [mu, Ev, Val ] = pca(data)

% mu - mean image
% Ev - matrix whose columns are the eigenvectors corresponding to the eigen
% values Val 
% Val - eigenvalues

if nargin ~= 1
 error ('usage: [mu,E,Values] = pca_q1(data)');
end

mu = mean(data)';

nimages = size(data,2);

for i = 1:nimages
 data(:,i) = data(:,i)-mu(i);
end

L = data'*data;
[Ev, Vals]  = eig(L);    
[Ev,Vals] = sort(Ev,Vals);

% computing eigenvector of the real covariance matrix
Ev = data * Ev;

Val = diag(Vals);
Vals = Vals / (nimages - 1);

% normalize Ev to unit length
proper = 0;
for i = 1:nimages
 Ev(:,i) = Ev(:,1)/norm(Ev(:,i));
 if Vals(i) < 0.00001
  Ev(:,i) = zeros(size(Ev,1),1);
 else
  proper = proper+1;
 end;
end;

Ev = Ev(:,1:nimages);

matcheek asked Dec 09 '10 19:12


1 Answer

Here's how I would do it:

function [V, newX, D] = myPCA(X)
    X = bsxfun(@minus, X, mean(X,1));           % zero-center each column
    C = (X'*X)./(size(X,1)-1);                  % sample covariance, same as cov(X)

    [V, D] = eig(C);
    [D, order] = sort(diag(D), 'descend');      % sort eigenvalues high to low
    V = V(:,order);                             % reorder eigenvectors to match

    newX = X*V;                                 % project the centered data onto the components
end

and an example to compare against the PRINCOMP function from the Statistics Toolbox:

load fisheriris

[V newX D] = myPCA(meas);
[PC newData Var] = princomp(meas);
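
When comparing the two sets of outputs, keep in mind that an eigenvector is only defined up to sign, so a column of V can come out as the negative of the matching column of PC while both are still correct. The following is a minimal sketch of a sign-insensitive comparison. It assumes myPCA is on the path and that princomp and fisheriris are available (Statistics Toolbox; newer releases provide pca in place of princomp). The names s, Vs and newXs are introduced here for illustration only.

```matlab
load fisheriris                          % provides meas (150x4)
[V, newX, D]       = myPCA(meas);        % function from the answer above
[PC, newData, Var] = princomp(meas);     % toolbox reference result

% Flip each column of V so it points the same way as the matching column of PC.
s = sign(sum(V .* PC, 1));               % +1 or -1 per component
s(s == 0) = 1;                           % guard against an exactly zero dot product
Vs    = bsxfun(@times, V, s);            % sign-aligned eigenvectors
newXs = bsxfun(@times, newX, s);         % scores flip together with their eigenvector

% Largest absolute differences; these should be tiny if the two agree.
fprintf('max |V - PC|         = %g\n', max(abs(Vs(:) - PC(:))));
fprintf('max |newX - newData| = %g\n', max(abs(newXs(:) - newData(:))));
fprintf('max |D - Var|        = %g\n', max(abs(D - Var)));
```

The eigenvalues need no alignment because they do not depend on the sign of the eigenvectors.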

You might also be interested in this related post about performing PCA by SVD.
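
For reference, here is a rough sketch of the SVD route, not the code from that post. If the centered data is Xc = U*S*V', then the columns of V are the principal directions, the eigenvalues of the covariance matrix are the squared singular values divided by n-1, and the scores are U*S. The synthetic data and the variable names (Xc, Dref and so on) are assumptions made for illustration.

```matlab
rng(0);                                   % reproducible synthetic data (illustrative only)
X  = randn(50, 4) * diag([3 2 1 0.5]);    % 50 observations, 4 variables
Xc = bsxfun(@minus, X, mean(X, 1));       % zero-center each column
n  = size(Xc, 1);

[U, S, V] = svd(Xc, 'econ');              % Xc = U*S*V', singular values in descending order
D    = diag(S).^2 ./ (n - 1);             % eigenvalues of the sample covariance matrix
newX = U * S;                             % scores, equal to Xc*V

% Cross-check against the eigendecomposition of cov(X).
[~, Deig] = eig(cov(X));
Dref = sort(diag(Deig), 'descend');
disp(max(abs(D - Dref)));                 % difference in eigenvalues
disp(max(max(abs(newX - Xc*V))));         % difference between the two score formulas
```

Working from the SVD avoids forming X'*X explicitly, which tends to be better behaved numerically when the data matrix is ill-conditioned.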

Amro answered Nov 07 '22 03:11