Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Determinant of Multidimensional array

I´m trying to compute the determinant of a numpy array M, with np.shape(M) = (N, L, L) is something like:

import numpy as np

M = np.random.rand(1000*10*10).reshape(1000, 10, 10)
dm = np.zeros(1000)
for _ in xrange(len(dm)):
    dm[_] = np.linalg.det(M[_])

Is there a way without looping? "N" is some orders of magnitude larger than "L". I thought of something like:

np.apply_over_axes(np.linalg.det(M), axis=0) 

Is there a faster way doing what I want? I guess the loop overhead is a performance bottleneck because the determinant of small matrices is a relatively cheap operation (or am I wrong?).

like image 800
user1825991 Avatar asked Nov 15 '12 08:11

user1825991


People also ask

What is the determinant of an array?

A matrix is an array of many numbers. For a square matrix, i.e., a matrix with the same number of rows and columns, one can capture important information about the matrix in a just single number, called the determinant.

What is a determinant in multivariable calculus?

A determinant is a value associated to a square array of numbers, that square array being called a square matrix. For example, here are determinants of a general 2 × 2 matrix and a general 3 × 3 matrix.

Can you find determinant of a 2x3 matrix?

Hence, It's not possible to find the determinant of a 2 × 3 matrix because it is not a square matrix.

Can an MXN matrix have a determinant?

In order to calculate the determinant of an m x n matrix, it is necessary that m = n. In other words, we need a square matrix in order to find a determinant. Notice the vertical bars on the matrix denote the determinant of that matrix. This is similar to the absolute value sign for real numbers.


2 Answers

You need to modify np.linalg.det to get the speed. The idea is that det() is a Python function, it does a lot of check first, and call the fortran routine, and does some array calculate to get the result.

Here is the code from numpy:

def slogdet(a):
    a = asarray(a)
    _assertRank2(a)
    _assertSquareness(a)
    t, result_t = _commonType(a)
    a = _fastCopyAndTranspose(t, a)
    a = _to_native_byte_order(a)
    n = a.shape[0]
    if isComplexType(t):
        lapack_routine = lapack_lite.zgetrf
    else:
        lapack_routine = lapack_lite.dgetrf
    pivots = zeros((n,), fortran_int)
    results = lapack_routine(n, n, a, n, pivots, 0)
    info = results['info']
    if (info < 0):
        raise TypeError, "Illegal input to Fortran routine"
    elif (info > 0):
        return (t(0.0), _realType(t)(-Inf))
    sign = 1. - 2. * (add.reduce(pivots != arange(1, n + 1)) % 2)
    d = diagonal(a)
    absd = absolute(d)
    sign *= multiply.reduce(d / absd)
    log(absd, absd)
    logdet = add.reduce(absd, axis=-1)
    return sign, logdet

def det(a):
    sign, logdet = slogdet(a)
    return sign * exp(logdet)

To speedup this function, you can omit the check (it becomes your Responsibility to keep the input right), and collect the fortran results in an array, and do the final calculations for all the small arrays without for loop.

Here is my result:

import numpy as np
from numpy.core import intc
from numpy.linalg import lapack_lite

N = 1000
M = np.random.rand(N*10*10).reshape(N, 10, 10)

def dets(a):
    length = a.shape[0]
    dm = np.zeros(length)
    for i in xrange(length):
        dm[i] = np.linalg.det(M[i])
    return dm

def dets_fast(a):
    m = a.shape[0]
    n = a.shape[1]
    lapack_routine = lapack_lite.dgetrf
    pivots = np.zeros((m, n), intc)
    flags = np.arange(1, n + 1).reshape(1, -1)
    for i in xrange(m):
        tmp = a[i]
        lapack_routine(n, n, tmp, n, pivots[i], 0)
    sign = 1. - 2. * (np.add.reduce(pivots != flags, axis=1) % 2)
    idx = np.arange(n)
    d = a[:, idx, idx]
    absd = np.absolute(d)
    sign *= np.multiply.reduce(d / absd, axis=1)
    np.log(absd, absd)
    logdet = np.add.reduce(absd, axis=-1)
    return sign * np.exp(logdet)

print np.allclose(dets(M), dets_fast(M.copy()))

and the speed is:

timeit dets(M)
10 loops, best of 3: 159 ms per loop

timeit dets_fast(M)
100 loops, best of 3: 10.7 ms per loop

So, by doing this, you can speedup by 15 times. That is a good result without any compiled code.

note: I omit the error check for the fortran routine.

like image 159
HYRY Avatar answered Oct 14 '22 11:10

HYRY


I couldn't get apply_over_axes to work, because this is a 3D array I think. Anyway, profiling the code shows that the program spends little time in the loop,

import cProfile
import pstats
N=10000
M = np.random.rand(N*10*10).reshape(N, 10, 10)
def f(M):
    dm = np.zeros(N)
    for _ in xrange(len(dm)):
        dm[_] = np.linalg.det(M[_])
    return dm
cProfile.run('f(M)','foo')
p = pstats.Stats('foo')
res = p.sort_stats('cumulative').print_stats(10)

The result is "0.955 seconds" of runtime, with 0.930s of cumulative time spent in linalg.py:1642(det).

If I perform the same test with 2x2 matrices, I get .844s of total time, and .821s in linalg.py:1642(det), despite the matrices being small. Then, it seems that the det() function that has a large overhead for small matrices.

Doing it with 30x30, I get a total time 1.198s, and a time in det() of 1.172.

With 70x70, total is 3.122, and time in det() is 3.094, with less than 1% in the loop.

It seems that in any case, it is not worth optimizing the python loop.

like image 39
gg349 Avatar answered Oct 14 '22 11:10

gg349