I'm trying to compute the determinant of a numpy array M with np.shape(M) = (N, L, L), with something like:
import numpy as np

M = np.random.rand(1000 * 10 * 10).reshape(1000, 10, 10)
dm = np.zeros(1000)
for _ in xrange(len(dm)):
    dm[_] = np.linalg.det(M[_])
Is there a way without looping? "N" is some orders of magnitude larger than "L". I thought of something like:
np.apply_over_axes(np.linalg.det(M), axis=0)
Is there a faster way to do what I want? I guess the loop overhead is a performance bottleneck, because the determinant of a small matrix is a relatively cheap operation (or am I wrong?).
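For reference, in NumPy 1.8 and later np.linalg.det is a generalized ufunc: it operates over the last two axes of a stacked array, so the loop above can be replaced by a single call. A minimal sketch:

```python
import numpy as np

# np.linalg.det broadcasts over leading axes (NumPy >= 1.8),
# so one call computes the determinant of every 10x10 slice.
M = np.random.rand(1000, 10, 10)

dm_loop = np.array([np.linalg.det(M[i]) for i in range(len(M))])
dm_vec = np.linalg.det(M)  # one call for all 1000 matrices

print(np.allclose(dm_loop, dm_vec))  # True
```
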
A matrix is an array of many numbers. For a square matrix, i.e., a matrix with the same number of rows and columns, one can capture important information about the matrix in just a single number, called the determinant.
A determinant is a value associated with a square array of numbers, that square array being called a square matrix; one can write out explicit formulas for the determinant of a general 2 × 2 matrix and of a general 3 × 3 matrix.
Hence, it's not possible to find the determinant of a 2 × 3 matrix, because it is not a square matrix.
In order to calculate the determinant of an m × n matrix, it is necessary that m = n. In other words, we need a square matrix in order to find a determinant. Vertical bars around a matrix denote the determinant of that matrix, similar to the absolute value sign for real numbers.
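As a concrete instance of the 2 × 2 formula mentioned above: for a matrix [[a, b], [c, d]] the determinant is a*d - b*c, which can be checked against numpy (the specific numbers here are just an illustration):

```python
import numpy as np

# Determinant of a 2x2 matrix [[a, b], [c, d]] is a*d - b*c.
a, b, c, d = 3.0, 8.0, 4.0, 6.0
m = np.array([[a, b], [c, d]])

by_formula = a * d - b * c      # 3*6 - 8*4 = -14
by_numpy = np.linalg.det(m)

print(by_formula, by_numpy)
```
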
You need to modify np.linalg.det to get the speed. The idea is that det() is a Python function: it does a lot of checks first, then calls the Fortran routine, then does some array calculations to get the result.
Here is the code from numpy:
def slogdet(a):
    a = asarray(a)
    _assertRank2(a)
    _assertSquareness(a)
    t, result_t = _commonType(a)
    a = _fastCopyAndTranspose(t, a)
    a = _to_native_byte_order(a)
    n = a.shape[0]
    if isComplexType(t):
        lapack_routine = lapack_lite.zgetrf
    else:
        lapack_routine = lapack_lite.dgetrf
    pivots = zeros((n,), fortran_int)
    results = lapack_routine(n, n, a, n, pivots, 0)
    info = results['info']
    if (info < 0):
        raise TypeError, "Illegal input to Fortran routine"
    elif (info > 0):
        return (t(0.0), _realType(t)(-Inf))
    sign = 1. - 2. * (add.reduce(pivots != arange(1, n + 1)) % 2)
    d = diagonal(a)
    absd = absolute(d)
    sign *= multiply.reduce(d / absd)
    log(absd, absd)
    logdet = add.reduce(absd, axis=-1)
    return sign, logdet

def det(a):
    sign, logdet = slogdet(a)
    return sign * exp(logdet)
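The key relationship in that source is the last line: det(a) equals sign * exp(logdet), where (sign, logdet) is what slogdet computes from the LU factorization. This can be checked directly with the public API:

```python
import numpy as np

# det(a) == sign * exp(logdet), where (sign, logdet) come from slogdet.
# slogdet is also the numerically safer form when det would over/underflow.
a = np.random.rand(10, 10)
sign, logdet = np.linalg.slogdet(a)

print(np.allclose(sign * np.exp(logdet), np.linalg.det(a)))  # True
```
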
To speed this up, you can omit the checks (it becomes your responsibility to keep the input valid), collect the Fortran results for all matrices in one array, and do the final calculations for all the small matrices without a for loop.
Here is my result:
import numpy as np
from numpy.core import intc
from numpy.linalg import lapack_lite

N = 1000
M = np.random.rand(N * 10 * 10).reshape(N, 10, 10)

def dets(a):
    length = a.shape[0]
    dm = np.zeros(length)
    for i in xrange(length):
        dm[i] = np.linalg.det(a[i])
    return dm

def dets_fast(a):
    m = a.shape[0]
    n = a.shape[1]
    lapack_routine = lapack_lite.dgetrf
    pivots = np.zeros((m, n), intc)
    flags = np.arange(1, n + 1).reshape(1, -1)
    for i in xrange(m):
        tmp = a[i]
        lapack_routine(n, n, tmp, n, pivots[i], 0)
    sign = 1. - 2. * (np.add.reduce(pivots != flags, axis=1) % 2)
    idx = np.arange(n)
    d = a[:, idx, idx]
    absd = np.absolute(d)
    sign *= np.multiply.reduce(d / absd, axis=1)
    np.log(absd, absd)
    logdet = np.add.reduce(absd, axis=-1)
    return sign * np.exp(logdet)

print np.allclose(dets(M), dets_fast(M.copy()))
and the speed is:
timeit dets(M)
10 loops, best of 3: 159 ms per loop
timeit dets_fast(M)
100 loops, best of 3: 10.7 ms per loop
So, by doing this, you can speed things up by about 15 times. That is a good result without any compiled code.
Note: I omitted the error check for the Fortran routine.
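Note that lapack_lite is a private NumPy module, so this may break across versions. A similar batched approach can be sketched with SciPy's public LU wrapper (assuming scipy is available; dets_batched is a name I made up): factor each matrix once, then compute all signs and log-magnitudes vectorized over the whole stack.

```python
import numpy as np
from scipy.linalg import lu_factor

def dets_batched(a):
    """Determinants of a stack of square matrices via one LU factorization
    each; the sign/log-magnitude reduction is vectorized over the stack."""
    m, n, _ = a.shape
    diags = np.empty((m, n))
    sign = np.ones(m)
    for i in range(m):
        lu, piv = lu_factor(a[i])
        diags[i] = np.diag(lu)          # diagonal of U (L has unit diagonal)
        # each entry with piv[j] != j is one row swap, flipping the sign
        sign[i] = (-1.0) ** np.count_nonzero(piv != np.arange(n))
    sign *= np.sign(diags).prod(axis=1)
    logdet = np.log(np.abs(diags)).sum(axis=1)
    return sign * np.exp(logdet)

M = np.random.rand(100, 10, 10)
print(np.allclose(dets_batched(M), np.linalg.det(M)))  # True
```
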
I couldn't get apply_over_axes to work, because this is a 3D array, I think. Anyway, profiling the code shows that the program spends little time in the loop itself:
import cProfile
import pstats
import numpy as np

N = 10000
M = np.random.rand(N * 10 * 10).reshape(N, 10, 10)

def f(M):
    dm = np.zeros(N)
    for _ in xrange(len(dm)):
        dm[_] = np.linalg.det(M[_])
    return dm

cProfile.run('f(M)', 'foo')
p = pstats.Stats('foo')
res = p.sort_stats('cumulative').print_stats(10)
The result is 0.955 seconds of runtime, with 0.930 s of cumulative time spent in linalg.py:1642(det).
If I perform the same test with 2 × 2 matrices, I get 0.844 s of total time, with 0.821 s in linalg.py:1642(det), despite the matrices being tiny. So it seems that the det() function has a large fixed overhead for small matrices.
With 30 × 30 matrices, I get a total time of 1.198 s, with 1.172 s in det().
With 70 × 70, the total is 3.122 s, with 3.094 s in det() and less than 1% in the loop.
It seems that in any case, it is not worth optimizing the Python loop.