Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

erratic results for numpy/scipy eigendecompositions

I am finding that scipy.linalg.eig sometimes gives inconsistent results. But not every time.

>>> import numpy as np
>>> import scipy.linalg as lin
>>> modmat=np.random.random((150,150))
>>> modmat=modmat+modmat.T  # the data i am interested in is described by real symmetric matrices
>>> d,v=lin.eig(modmat)
>>> dx=d.copy()
>>> vx=v.copy()
>>> d,v=lin.eig(modmat)
>>> np.all(d==dx)
False
>>> np.all(v==vx)
False
>>> e,w=lin.eigh(modmat)
>>> ex=e.copy()
>>> wx=w.copy()
>>> e,w=lin.eigh(modmat)
>>> np.all(e==ex)
True
>>> e,w=lin.eigh(modmat)
>>> np.all(e==ex)
False

While I am not the greatest linear algebra wizard, I do understand that the eigendecomposition is inherently subject to weird rounding errors, but I don't understand why repeating the computation would result in a different value. But my results and reproducibility are varying.

What exactly is the nature of the problem -- well, sometimes the results are acceptably different, and sometimes they aren't. Here are some examples:

>>> d[1]
(9.8986888573772465+0j)
>>> dx[1]
(9.8986888573772092+0j)

The difference above of ~3e-13 does not seem like an enormously big deal. Instead, the real problem (at least for my present project) is that some of the eigenvalues cannot seem to agree on the proper sign.

>>> np.all(np.sign(d)==np.sign(dx))
False
>>> np.nonzero(np.sign(d)!=np.sign(dx))
(array([ 38,  39,  40,  41,  42,  45,  46,  47,  79,  80,  81,  82,  83,
    84, 109, 112]),)
>>> d[38]
(-6.4011617320002525+0j)
>>> dx[38]
(6.1888785138080209+0j)

Similar code in MATLAB does not seem to have this problem.

like image 207
aestrivex Avatar asked Feb 04 '13 20:02

aestrivex


1 Answers

The eigenvalue decompositions satisfy A V = V Lambda, which is all what is guaranteed --- for instance the order of the eigenvalues is not.

Answer to the second part of your question:

Modern compilers/linear algebra libraries produce/contain code that does different things depending on whether the data is aligned in memory on (e.g.) 16-byte boundaries. This affects rounding error in computations, as floating point operations are done in a different order. Small changes in rounding error can then affect things such as ordering of the eigenvalues if the algorithm (here, LAPACK/xGEEV) does not guarantee numerical stability in this respect.

(If your code is sensitive to things like this, it is incorrect! Running e.g. it on a different platform or different library version would lead to a similar problem.)

The results usually are quasi-deterministic --- for instance you get one of 2 possible results, depending if the array happens to be aligned in memory or not. If you are curious about alignment, check A.__array_interface__['data'][0] % 16.

See http://www.nccs.nasa.gov/images/FloatingPoint_consistency.pdf for more

like image 138
pv. Avatar answered Oct 30 '22 22:10

pv.