 

More efficient way to invert a matrix knowing it is symmetric and positive semi-definite

I'm inverting covariance matrices with numpy in python. Covariance matrices are symmetric and positive semi-definite.

I wondered if there exists an algorithm optimised for symmetric positive semi-definite matrices, faster than numpy.linalg.inv() (and of course if an implementation of it is readily accessible from python!). I did not manage to find something in numpy.linalg or searching the web.

EDIT:

As observed by @yixuan, positive semi-definite matrices are not in general invertible. I checked that in my case I actually get positive definite matrices, so I accepted an answer that works for positive definiteness. Anyway, among the low-level LAPACK routines I found the DSY* family, which is optimised for symmetric/Hermitian matrices, although it seems to be missing from scipy (maybe it is just a matter of installed versions).
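(Side note, not from the original question: a common way to check whether a matrix is actually positive definite is to attempt a Cholesky factorization, which succeeds exactly for symmetric positive definite matrices. A minimal sketch, with a hypothetical helper name:)

```python
import numpy as np

def is_positive_definite(m):
    """Return True if m is (numerically) symmetric positive definite.

    A Cholesky factorization exists exactly for symmetric positive
    definite matrices, so attempting one is a cheap and reliable test.
    """
    try:
        np.linalg.cholesky(m)
        return True
    except np.linalg.LinAlgError:
        return False
```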

Asked Nov 20 '16 by Giacomo Petrillo


3 Answers

I tried out @percusse's answer, but when I timed it, I found it was about 33% slower than np.linalg.inv (on a sample of 100K random positive definite 4x4 np.float64 matrices). Here is my implementation:

import numpy as np
from scipy.linalg import lapack

def upper_triangular_to_symmetric(ut):
    # Add the transposed strict upper triangle onto the (zero) lower
    # triangle, in place
    ut += np.triu(ut, k=1).T

def fast_positive_definite_inverse(m):
    cholesky, info = lapack.dpotrf(m)
    if info != 0:
        raise ValueError('dpotrf failed on input {}'.format(m))
    inv, info = lapack.dpotri(cholesky)
    if info != 0:
        raise ValueError('dpotri failed on input {}'.format(cholesky))
    upper_triangular_to_symmetric(inv)
    return inv

I profiled it and, to my surprise, it spends about 82% of its time calling upper_triangular_to_symmetric (which is not the "hard" part)! I think this is because it performs floating point additions to combine the two triangles instead of a simple copy.

I then tried an upper_triangular_to_symmetric implementation that is about 87% faster (see this question and answer):

import numpy as np
from scipy.linalg import lapack

inds_cache = {}

def upper_triangular_to_symmetric(ut):
    # Copy the upper triangle into the lower triangle using a cached
    # boolean mask of the strict lower triangle
    n = ut.shape[0]
    try:
        inds = inds_cache[n]
    except KeyError:
        inds = np.tri(n, k=-1, dtype=bool)
        inds_cache[n] = inds
    ut[inds] = ut.T[inds]


def fast_positive_definite_inverse(m):
    cholesky, info = lapack.dpotrf(m)
    if info != 0:
        raise ValueError('dpotrf failed on input {}'.format(m))
    inv, info = lapack.dpotri(cholesky)
    if info != 0:
        raise ValueError('dpotri failed on input {}'.format(cholesky))
    upper_triangular_to_symmetric(inv)
    return inv

This version is about 68% faster than np.linalg.inv and only spends about 42% of its time calling upper_triangular_to_symmetric.
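(As a sanity check, my addition rather than part of the answer: the routine can be verified against np.linalg.inv on a random positive definite matrix. This repeats the answer's code in compact, self-contained form:)

```python
import numpy as np
from scipy.linalg import lapack

def fast_positive_definite_inverse(m):
    # Cholesky-factorize, invert from the factor, then mirror the
    # returned upper triangle into the lower half
    cholesky, info = lapack.dpotrf(m)
    if info != 0:
        raise ValueError('dpotrf failed on input {}'.format(m))
    inv, info = lapack.dpotri(cholesky)
    if info != 0:
        raise ValueError('dpotri failed on input {}'.format(cholesky))
    inds = np.tri(inv.shape[0], k=-1, dtype=bool)
    inv[inds] = inv.T[inds]
    return inv

m = np.random.rand(4, 4)
m = m @ m.T + 4.0 * np.eye(4)  # guaranteed positive definite
assert np.allclose(fast_positive_definite_inverse(m), np.linalg.inv(m))
```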

Answered Oct 06 '22 by Kerrick Staley


The API doesn't exist yet, but you can use the low-level LAPACK ?POTRI routine family for it.

The docstring of sp.linalg.lapack.dpotri is as follows:

Docstring:     
inv_a,info = dpotri(c,[lower,overwrite_c])

Wrapper for ``dpotri``.

Parameters
----------
c : input rank-2 array('d') with bounds (n,n)

Other Parameters
----------------
overwrite_c : input int, optional
    Default: 0
lower : input int, optional
    Default: 0

Returns
-------
inv_a : rank-2 array('d') with bounds (n,n) and c storage
info : int
Call signature: sp.linalg.lapack.dpotri(*args, **kwargs)

The most important output is info. If it is zero, the routine solved the equation successfully, regardless of positive definiteness. Because this is a low-level call, no other checks are performed.

>>> import numpy as np
>>> import scipy as sp
>>> import scipy.linalg  # makes sp.linalg available
>>> M = np.random.rand(10,10)
>>> M = M + M.T
>>> # Make it pos def
>>> M += (1.5*np.abs(np.min(np.linalg.eigvals(M))) + 1) * np.eye(10)
>>> zz, _ = sp.linalg.lapack.dpotrf(M, False, False)
>>> inv_M, info = sp.linalg.lapack.dpotri(zz)
>>> # LAPACK only returns the upper or lower triangular part
>>> inv_M = np.triu(inv_M) + np.triu(inv_M, k=1).T
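(My own check, not part of the answer: to convince yourself the reconstructed inv_M really is the inverse, verify M @ inv_M against the identity. This sketch mirrors the snippet above, using eigvalsh for the symmetric eigenvalue computation:)

```python
import numpy as np
from scipy.linalg import lapack

M = np.random.rand(10, 10)
M = M + M.T
# Shift the spectrum so every eigenvalue is strictly positive
M += (1.5 * np.abs(np.min(np.linalg.eigvalsh(M))) + 1) * np.eye(10)

zz, info = lapack.dpotrf(M, False, False)
assert info == 0
inv_M, info = lapack.dpotri(zz)
assert info == 0
# Mirror the returned upper triangle into a full symmetric matrix
inv_M = np.triu(inv_M) + np.triu(inv_M, k=1).T
assert np.allclose(M @ inv_M, np.eye(10))
```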

Also if you compare the speed

>>> %timeit sp.linalg.lapack.dpotrf(M)
The slowest run took 17.86 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.15 µs per loop

>>> %timeit sp.linalg.lapack.dpotri(M)
The slowest run took 24.09 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.08 µs per loop

>>> III = np.eye(10)

>>> %timeit sp.linalg.solve(M,III, sym_pos=1,overwrite_b=1)
10000 loops, best of 3: 40.6 µs per loop

So you get quite a non-negligible speed benefit. If you are working with complex numbers, you have to use zpotri instead.

The real question is whether you need the inverse at all. You probably don't if what you actually need is B⁻¹ * A, because solve(B, A) is better for that.

Answered Oct 06 '22 by percusse


To my knowledge there is not a standard matrix inverse function for symmetric matrices. In general you need more constraints on sparseness etc. to get good speed-ups for your solvers. However, if you look at scipy.linalg you'll see there are some eigenvalue routines that are optimized for Hermitian (symmetric) matrices.

For example, when I generate a random 200x200 dense matrix and compute its eigenvalues, I get:

import numpy as np
from scipy.linalg import inv, pinvh, eig, eigh
B = np.random.rand(200, 200)
B = B + B.T
%timeit inv(B)
1000 loops, best of 3: 915 µs per loop

%timeit pinvh(B)
100 loops, best of 3: 6.93 ms per loop

So no advantage on the inverse but:

%timeit eig(B)
10 loops, best of 3: 39.1 ms per loop

%timeit eigh(B)
100 loops, best of 3: 4.9 ms per loop

a cool 8x speedup on eigenvalues.

If your matrix is sparse, you should check out scipy.sparse.linalg, which has a handful of iterative solvers; some, like cg, are designed for symmetric positive definite matrices and so may be faster. However, this only makes sense if your matrix is actually sparse, it only solves for one particular right-hand side b at a time, and it may not actually be faster, depending on the matrix structure. You'll really have to benchmark it to find out.
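(A minimal sketch of the sparse route, my example rather than the answer's, using a tridiagonal Laplacian-style matrix that is symmetric positive definite:)

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

n = 100
# Tridiagonal symmetric positive definite matrix
A = diags([-1.0, 2.5, -1.0], [-1, 0, 1], shape=(n, n), format='csr')
b = np.ones(n)

x, info = cg(A, b)  # info == 0 means the iteration converged
assert info == 0
assert np.allclose(A @ x, b, atol=1e-4)
```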

I asked a similar question about C++ solvers and ultimately found that it's really application dependent and you have to pick the best solver for your problem.

Answered Oct 06 '22 by mmdanziger