Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can't get a positive definite variance matrix when very small eigen values

To run a Canonical correspondence analysis (cca package ade4) I need a positive definite variance matrix. (Which in theory is always the case) but:

matrix(c(2,59,4,7,10,0,7,0,0,0,475,18714,4070,97,298,0,1,0,17,7,4,1,4,18,36),nrow=5)
> a
     [,1] [,2]  [,3] [,4] [,5]
[1,]    2    0   475    0    4
[2,]   59    7 18714    1    1
[3,]    4    0  4070    0    4
[4,]    7    0    97   17   18
[5,]   10    0   298    7   36

> eigen(var(a))
$values
[1]  6.380066e+07  1.973658e+02  3.551492e+01  1.033096e+01
[5] -1.377693e-09

The last eigen value is -1.377693e-09 which is < 0. But the theorical value is > 0.
I can't run the function if one of the eigen value is < 0

I really don't know how to fix this without changing the code of the function cca()

Thanks for help

like image 789
joel lapalme Avatar asked May 31 '13 14:05

joel lapalme


People also ask

Why is my covariance matrix not positive definite?

When sample size is small, a sample covariance or correlation matrix may be not positive definite due to mere sampling fluctuation. As most matrices rapidly converge on the population matrix, however, this in itself is unlikely to be a problem.

Is every covariance matrix positive definite?

The covariance matrix is always both symmetric and positive semi- definite.

What does a small eigenvalue mean?

Eigenvalues are the variance of principal components. If the eigen values are very low, that suggests there is little to no variance in the matrix, which means- there are chances of high collinearity in data.


1 Answers

You can change the input, just a little bit, to make the matrix positive definite.

If you have the variance matrix, you can truncate the eigenvalues:

correct_variance <- function(V, minimum_eigenvalue = 0) {
  V <- ( V + t(V) ) / 2
  e <- eigen(V)
  e$vectors %*% diag(pmax(minimum_eigenvalue,e$values)) %*% t(e$vectors)
}
v <- correct_variance( var(a) )
eigen(v)$values
# [1] 6.380066e+07 1.973658e+02 3.551492e+01 1.033096e+01 1.326768e-08

Using the singular value decomposition, you can do the same thing directly with a.

truncate_singular_values <- function(a, minimum = 0) { 
  s <- svd(a)
  s$u %*% diag( ifelse( s$d > minimum, s$d, minimum ) ) %*% t(s$v)
}
svd(a)$d
# [1] 1.916001e+04 4.435562e+01 1.196984e+01 8.822299e+00 1.035624e-01
eigen(var( truncate_singular_values(a,.2) ))$values
# [1] 6.380066e+07 1.973680e+02 3.551494e+01 1.033452e+01 6.079487e-09

However, this changes your matrix a by up to 0.1, which is a lot (I suspect it is that high because the matrix a is square: as a result, one of the eigenvalues of var(a) is exactly 0.)

b <- truncate_singular_values(a,.2)
max( abs(b-a) )
# [1] 0.09410187

We can actually do better simply by adding some noise.

b <- a + 1e-6*runif(length(a),-1,1)  # Repeat if needed
eigen(var(b))$values
# [1] 6.380066e+07 1.973658e+02 3.551492e+01 1.033096e+01 2.492604e-09
like image 142
Vincent Zoonekynd Avatar answered Sep 28 '22 10:09

Vincent Zoonekynd