Ensuring positive definite covariance matrix

The outputs of my neural network act as the entries of a covariance matrix. However, a one-to-one correspondence between outputs and entries results in covariance matrices that are not positive definite.

Thus, I read https://www.quora.com/When-carrying-out-the-EM-algorithm-how-do-I-ensure-that-the-covariance-matrix-is-positive-definite-at-all-times-avoiding-rounding-issues and https://en.wikipedia.org/wiki/Cholesky_decomposition, more specifically "When A has real entries, L has real entries as well and the factorization may be written A = LL^T".

Now my outputs correspond to the entries of the L matrix, and I generate the covariance matrix by multiplying L by its transpose.
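
A minimal sketch of such a construction, assuming the outputs fill the lower triangle of L row by row. The helper name build_covariance and the sample values are hypothetical and not taken from the actual network:

```python
import numpy as np

def build_covariance(outputs, n):
    """Fill a lower-triangular L from a flat vector and return (L, L L^T)."""
    L = np.zeros((n, n), dtype=outputs.dtype)
    # np.tril_indices walks the lower triangle (including the diagonal) row by row
    L[np.tril_indices(n)] = outputs
    return L, L @ L.T

# Hypothetical outputs for a 2x2 covariance: n*(n+1)/2 = 3 values
outputs = np.array([-1.7, -1.5, 0.3])
L, Sigma = build_covariance(outputs, 2)
print(L)
print(Sigma)
```

Sigma is symmetric by construction, but it is only as well conditioned as L: a diagonal entry of L close to zero gives a Sigma that is close to singular.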

However, sometimes I still get an error about a matrix that is not positive definite. How is this possible?

I found a matrix that produces the error:

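import numpy as np

# L and Sigma are the arrays produced from the network outputs
print(L.shape)
print(Sigma.shape)

S = Sigma[1, 18, :, :]  # The matrix that gives the error
L_ = L[1, 18, :, :]     # The corresponding factor
print(L_)
S = np.dot(L_, np.transpose(L_))  # Recompute S from L_ (overwrites S above)
print(S)
chol = np.linalg.cholesky(S)  # Raises LinAlgError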

gives as output:

(3, 20, 2, 2)
(3, 20, 2, 2)
[[ -1.69684255e+00   0.00000000e+00]
 [ -1.50235415e+00   1.73807144e-04]]
[[ 2.87927461  2.54925847]
 [ 2.54925847  2.25706792]]
LinAlgError: Matrix is not positive definite

However, this code, with the values copied in, works fine (though probably not with exactly the same values, because not all decimals are printed):

B = np.array([[-1.69684255e+00, 0.00000000e+00], [-1.50235415e+00, 1.73807144e-04]])
A = np.dot(B,B.T)
chol_A = np.linalg.cholesky(A)

So my questions are:

  • Is the method of using Sigma = LL' correct (with ' the transpose)?
  • If yes, why am I getting an error? Could this be due to rounding issues?

Edit: I also computed the eigenvalues

print(np.linalg.eigvalsh(S))
[ -7.89378944432428397703915834426880e-08

And for the second case

print(np.linalg.eigvalsh(A))
[  1.69341869415973178547574207186699e-08

So there is a slightly negative eigenvalue in the first case, which explains why the matrix is not positive definite. But how do I solve this?

2 Answers

This looks like a numerical issue. In general, however, it is not true that LL' is always positive definite (for a square L, it is iff L is invertible). For example, take L to be a matrix in which each column is [1 0 0 0 ... 0] (or, even more extreme, take L to be a zero matrix of arbitrary dimensionality); then LL' won't be PD. In general I would recommend doing

S = LL' + eps I

which takes care of both problems (for small eps) and is a 'regularized' covariance estimate. You can even go for an "optimal" (under some assumptions) value of eps by using the Ledoit-Wolf estimator.
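
As a rough sketch of the regularization above (the function name regularized_covariance and the default eps=1e-6 are my own placeholders, not recommended values; a suitable eps depends on the scale of your data):

```python
import numpy as np

def regularized_covariance(L, eps=1e-6):
    """Return L L^T + eps * I, computed in double precision."""
    L = np.asarray(L, dtype=np.float64)
    n = L.shape[-1]
    # swapaxes transposes only the last two axes, so stacked L also works
    return L @ np.swapaxes(L, -1, -2) + eps * np.eye(n)

# Singular L from the example above: every column is [1 0 0]
L_singular = np.array([[1.0, 1.0, 1.0],
                       [0.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0]])
S = regularized_covariance(L_singular)
chol = np.linalg.cholesky(S)  # succeeds because of the eps * I term
print(np.linalg.eigvalsh(S))
```

Because the transpose only touches the last two axes, the same function should also accept a stack of factors such as your (3, 20, 2, 2) array.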

I suspect that the computation of L*L' is being done with floats (single precision) in the first case and with doubles in the second. I tried taking your L as a float matrix, computing L*L' and finding its eigenvalues, and I get the same values you do in the first case. If I instead convert L to a matrix of doubles, compute L*L' and find the eigenvalues, I get the same values as you do in the second case.

This makes sense: in the computation of (L*L')[1,1], the square of 1.73807144e-04 is, in floats, negligible compared to the square of -1.50235415e+00.

If I'm right, the solution is to convert L to a matrix of doubles before any computation.
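
A small sketch of that comparison, assuming the network hands back float32 arrays (the variable names are mine, and the values are the ones printed in the question). The exact float32 eigenvalues may depend on the NumPy/BLAS build, so no output is shown:

```python
import numpy as np

# L as printed in the question, stored in single precision
L32 = np.array([[-1.69684255e+00, 0.00000000e+00],
                [-1.50235415e+00, 1.73807144e-04]], dtype=np.float32)

# Product formed entirely in float32
S32 = L32 @ L32.T

# Convert to double first, then form the product
L64 = L32.astype(np.float64)
S64 = L64 @ L64.T

print(S32.dtype, np.linalg.eigvalsh(S32))
print(S64.dtype, np.linalg.eigvalsh(S64))

chol = np.linalg.cholesky(S64)  # succeeds in double precision
```

Note that the conversion has to happen before the product is formed; casting S32 to float64 afterwards would keep the rounding error already baked into it.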

