Can't get a positive definite variance matrix when very small eigen values

Q: Why is my covariance matrix not positive definite?

When sample size is small, a sample covariance or correlation matrix may be not positive definite due to mere sampling fluctuation. As most matrices rapidly converge on the population matrix, however, this in itself is unlikely to be a problem.

Q: Is every covariance matrix positive definite?

The covariance matrix is always both symmetric and positive semi- definite.

Q: What does a small eigenvalue mean?

Eigenvalues are the variance of principal components. If the eigen values are very low, that suggests there is little to no variance in the matrix, which means- there are chances of high collinearity in data.

Tags:

r

matrix

statistics

eigenvalue

To run a Canonical correspondence analysis (cca package ade4) I need a positive definite variance matrix. (Which in theory is always the case) but:

matrix(c(2,59,4,7,10,0,7,0,0,0,475,18714,4070,97,298,0,1,0,17,7,4,1,4,18,36),nrow=5)
> a
     [,1] [,2]  [,3] [,4] [,5]
[1,]    2    0   475    0    4
[2,]   59    7 18714    1    1
[3,]    4    0  4070    0    4
[4,]    7    0    97   17   18
[5,]   10    0   298    7   36

> eigen(var(a))
$values
[1]  6.380066e+07  1.973658e+02  3.551492e+01  1.033096e+01
[5] -1.377693e-09

The last eigen value is -1.377693e-09 which is < 0. But the theorical value is > 0.
I can't run the function if one of the eigen value is < 0

I really don't know how to fix this without changing the code of the function cca()

Thanks for help

789

asked May 31 '13 14:05

joel lapalme

1 Answers

You can change the input, just a little bit, to make the matrix positive definite.

If you have the variance matrix, you can truncate the eigenvalues:

correct_variance <- function(V, minimum_eigenvalue = 0) {
  V <- ( V + t(V) ) / 2
  e <- eigen(V)
  e$vectors %*% diag(pmax(minimum_eigenvalue,e$values)) %*% t(e$vectors)
}
v <- correct_variance( var(a) )
eigen(v)$values
# [1] 6.380066e+07 1.973658e+02 3.551492e+01 1.033096e+01 1.326768e-08

Using the singular value decomposition, you can do the same thing directly with a.

truncate_singular_values <- function(a, minimum = 0) { 
  s <- svd(a)
  s$u %*% diag( ifelse( s$d > minimum, s$d, minimum ) ) %*% t(s$v)
}
svd(a)$d
# [1] 1.916001e+04 4.435562e+01 1.196984e+01 8.822299e+00 1.035624e-01
eigen(var( truncate_singular_values(a,.2) ))$values
# [1] 6.380066e+07 1.973680e+02 3.551494e+01 1.033452e+01 6.079487e-09

However, this changes your matrix a by up to 0.1, which is a lot (I suspect it is that high because the matrix a is square: as a result, one of the eigenvalues of var(a) is exactly 0.)

b <- truncate_singular_values(a,.2)
max( abs(b-a) )
# [1] 0.09410187

We can actually do better simply by adding some noise.

b <- a + 1e-6*runif(length(a),-1,1)  # Repeat if needed
eigen(var(b))$values
# [1] 6.380066e+07 1.973658e+02 3.551492e+01 1.033096e+01 2.492604e-09

142

answered Sep 28 '22 10:09

Vincent Zoonekynd

Related questions
                            
                                Creating new named variable in dataframe using loop and naming convention
                            
                                R-project filepath from concatenation
                            
                                Inverse function of an unknown cumulative function
                            
                                freeing up memory in R
                            
                                Understanding 'gslider' function to make interactive plots
                            
                                access plyr id variables within functions
                            
                                R - How to apply different functions to certain rows in a column
                            
                                An "asymmetric" pairwise distance matrix
                            
                                Simulating time series random variable in R?
                            
                                How to use R base plots in grid.newpage?
                            
                                How can I color nodes of a graphNEL graph?
                            
                                Different data in upper and lower panel of scatterplot matrix
                            
                                Store large dataframes in redis through R
                            
                                understanding optimisation messages on assignment by reference in a data.table
                            
                                How to generate regular xts periods from random time observations?
                            
                                Removing Vertical White Lines in ggplot Line Graph
                            
                                C code called by R keeps crashing
                            
                                .SD columns in data.table in R
                            
                                Why lag in r is not working for matrix?
                            
                                Scraping from aspx website using R

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With