scipy.linalg.eig return complex eigenvalues for covariance matrix?

Question

The eigenvalues of a covariance matrix should be real and non-negative because covariance matrices are symmetric and semi positive definite.

However, take a look at the following experiment with scipy:

>>> a=np.random.random(5) >>> b=np.random.random(5) >>> ab = np.vstack((a,b)).T >>> C=np.cov(ab) >>> eig(C) 7.90174997e-01 +0.00000000e+00j, 2.38344473e-17 +6.15983679e-17j, 2.38344473e-17 -6.15983679e-17j, -1.76100435e-17 +0.00000000e+00j,    5.42658040e-33 +0.00000000e+00j

However, reproducing the above example in Matlab works correctly:

a = [0.6271, 0.4314, 0.3453, 0.8073, 0.9739] b = [0.1924, 0.3680, 0.0568, 0.1831, 0.0176] C=cov([a;b]) eig(C) -0.0000 -0.0000  0.0000  0.0000  0.7902

David Alber · Accepted Answer

You have raised two issues:

The eigenvalues returned by scipy.linalg.eig are not real.
Some of the eigenvalues are negative.

Both of these issues are the result of errors introduced by truncation and rounding errors, which always happen with iterative algorithms using floating-point arithmetic. Note that the Matlab results also produced negative eigenvalues.

Now, for a more interesting aspect of the issue: why is Matlab's result real, whereas SciPy's result has some complex components?

Matlab's eig detects if the input matrix is real symmetric or Hermitian and uses Cholesky factorization when it is. See the description of the chol argument in the eig documentation. This is not done automatically in SciPy.

If you want to use an algorithm that exploits the structure of a real symmetric or Hermitian matrix, use scipy.linalg.eigh. For the example in the question:

>>> eigh(C, eigvals_only=True) array([ -3.73825923e-17,  -1.60154836e-17,   8.11704449e-19,          3.65055777e-17,   7.90175615e-01])

This result is the same as Matlab's, if you round to the same number of digits of precision that Matlab printed.

foglerit · Answer

What you are experiencing is numerical instability due to limitations on floating point precision.

Note that:

(1) MATLAB also returned negative values, but the printing format is set to short and you don't see the full precision of the double stored in memory. Use format long g for printing more decimals

(2) All imaginary parts returned by numpy's linalg.eig are close to the machine precision. Thus you should consider them zero.

scipy.linalg.eig return complex eigenvalues for covariance matrix?

Tags:

user11869

2 Answers

David Alber

foglerit

Recent Activity

Donate For Us

scipy.linalg.eig return complex eigenvalues for covariance matrix?

Tags:

user11869

2 Answers

David Alber

foglerit

Related questions

Recent Activity

Donate For Us