
Calculating Covariance Matrix in Matlab

I am implementing a PCA algorithm in MATLAB. I see two different approaches to calculating the covariance matrix:

C = sampleMat.' * sampleMat ./ nSamples;

and

C = cov(data);

What is the difference between these two methods?

PS 1: When I use cov(data), is the following mean-centering step unnecessary?

meanSample = mean(data,1);
data = data - repmat(meanSample, nSamples, 1);

PS 2: In the first approach, should I divide by nSamples or nSamples - 1?

kamaci asked Dec 04 '12 12:12



1 Answer

In short: cov mainly just adds convenience to the bare formula.

If you type

edit cov

you'll see a lot of code, with these lines at the very bottom:

xc = bsxfun(@minus,x,sum(x,1)/m);  % Remove mean    
if flag
    xy = (xc' * xc) / m;
else
    xy = (xc' * xc) / (m-1);  % DEFAULT 
end

which is essentially the same as your first line, except that cov subtracts the column means itself. Centering the data yourself before calling cov is therefore unnecessary.
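As a rough check of the equivalence described above, the following sketch compares the bare formula against cov for both normalizations. The random test matrix and the variable names (centered, C_unbiased, C_biased and the err_* values) are assumptions made for illustration, not part of the original question.

```matlab
% Hypothetical data: 100 observations (rows) of 3 variables (columns).
rng(0);                          % fixed seed so the run is repeatable
data = randn(100, 3);
nSamples = size(data, 1);

% Mean-center each column, as cov does internally.
centered = bsxfun(@minus, data, mean(data, 1));

% Bare formula with both normalizations.
C_unbiased = (centered' * centered) / (nSamples - 1);  % compare with cov(data)
C_biased   = (centered' * centered) / nSamples;        % compare with cov(data, 1)

% Largest absolute differences; these should be at round-off level.
err_unbiased = max(max(abs(C_unbiased - cov(data))));
err_biased   = max(max(abs(C_biased   - cov(data, 1))));

% Centering before calling cov should change nothing beyond round-off.
err_center   = max(max(abs(cov(centered) - cov(data))));

fprintf('max diff vs cov(data):          %g\n', err_unbiased);
fprintf('max diff vs cov(data, 1):       %g\n', err_biased);
fprintf('max diff cov(centered) vs cov:  %g\n', err_center);
```

If all three differences come out at round-off level, the only practical choices left are the normalization (nSamples versus nSamples - 1) and whether you want cov to do the centering for you.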

See the Wikipedia article on sample covariance for why the default path divides by m-1 rather than m.
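For reference, the two branches of the snippet above can be written out as follows. The symbols are assumptions chosen to match the code: X is the m-by-n data matrix (rows are observations), X_c is X with the column means removed, and the superscript H is the conjugate transpose.

```latex
\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i ,
\qquad
X_c = X - \mathbf{1}\,\bar{x}

% default branch (flag = 0): unbiased estimate
C_{m-1} = \frac{1}{m-1}\, X_c^{H} X_c

% flag = 1 branch: normalizes by the number of observations
C_{m} = \frac{1}{m}\, X_c^{H} X_c
```

The m-1 version compensates for the fact that the mean is estimated from the same m observations, which uses up one degree of freedom (Bessel's correction). For large m the two estimates are nearly identical.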

Note, however, that your first line uses the non-conjugate transpose (.'), whereas cov uses the conjugate transpose ('). For complex-valued data the two therefore give different results.
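A small sketch of the transpose difference, using a made-up 2-by-2 complex matrix whose columns already have zero mean (the matrix and the names C_plain, C_conj and err_cov are hypothetical):

```matlab
% Hypothetical complex-valued data; each column sums to zero,
% so no further mean removal is needed.
xc = [ 1+2i,  3-1i;
      -1-2i, -3+1i];
m  = size(xc, 1);

C_plain = (xc.' * xc) / (m - 1);   % non-conjugate transpose, as in the question
C_conj  = (xc'  * xc) / (m - 1);   % conjugate transpose, as in the cov snippet

disp(C_plain);   % diagonal entries are complex
disp(C_conj);    % Hermitian, with a real diagonal

% Difference between the conjugate-transpose formula and cov itself.
err_cov = max(max(abs(C_conj - cov(xc))));
disp(err_cov);
```

Only the conjugate-transpose version yields a Hermitian matrix with real variances on the diagonal, which is what a covariance matrix of complex data is normally expected to be. For purely real data both lines give the same result.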

Also note that cov is not a built-in function but an ordinary M-file. Calling it inside a loop may therefore carry a (possibly severe) performance penalty, since MATLAB's JIT compiler cannot accelerate non-built-in functions.

Rody Oldenhuis answered Oct 04 '22 12:10