I'm trying to find the indices of the minimum values along one dimension of a very large 2D numpy array. This turns out to be very slow (I already tried speeding it up with bottleneck, which gave only a minimal improvement). However, taking the plain minimum appears to be an order of magnitude faster:
    import numpy as np
    import time

    randvals = np.random.rand(3000, 160000)

    # time the plain minimum along axis 0
    start = time.time()
    minval = randvals.min(axis=0)
    print("Took {0:.2f} seconds to compute min".format(time.time() - start))

    # time the index of the minimum along the same axis
    start = time.time()
    minindex = np.argmin(randvals, axis=0)
    print("Took {0:.2f} seconds to compute argmin".format(time.time() - start))
On my machine this outputs:
    Took 0.83 seconds to compute min
    Took 9.58 seconds to compute argmin
Is there any reason why argmin is so much slower? Is there any way to make it comparably fast to min?
    In [1]: import numpy as np

    In [2]: a = np.random.rand(3000, 16000)

    In [3]: %timeit a.min(axis=0)
    1 loops, best of 3: 421 ms per loop

    In [4]: %timeit a.argmin(axis=0)
    1 loops, best of 3: 1.95 s per loop

    In [5]: %timeit a.min(axis=1)
    1 loops, best of 3: 302 ms per loop

    In [6]: %timeit a.argmin(axis=1)
    1 loops, best of 3: 303 ms per loop

    In [7]: %timeit a.T.argmin(axis=1)
    1 loops, best of 3: 1.78 s per loop

    In [8]: %timeit np.asfortranarray(a).argmin(axis=0)
    1 loops, best of 3: 1.97 s per loop

    In [9]: b = np.asfortranarray(a)

    In [10]: %timeit b.argmin(axis=0)
    1 loops, best of 3: 329 ms per loop
Maybe min is smart enough to do its job sequentially over the array (hence with cache locality), and argmin is jumping around the array (causing a lot of cache misses)?
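To make the memory-layout guess above concrete, here is a small sketch that compares the strides of a C-ordered array and its Fortran-ordered copy. The array size is a scaled-down stand-in chosen for illustration, not the one timed above, and the sketch only shows how far apart neighbouring elements sit in memory; it does not prove what argmin does internally.

```python
import numpy as np

# Scaled-down stand-in for the arrays timed above (assumed size, for illustration)
a = np.random.rand(300, 1600)   # C-ordered (row-major) by default
b = np.asfortranarray(a)        # same values, stored column-major

# strides = number of bytes to step to reach the next element along each axis
print(a.flags['C_CONTIGUOUS'], a.strides)
print(b.flags['F_CONTIGUOUS'], b.strides)

# In `a`, walking down a column (axis 0) jumps a whole row of bytes each step;
# in `b`, the elements of a column are adjacent in memory.
```

With float64 data, stepping along axis 0 in `a` moves 1600 * 8 bytes at a time, while in `b` it moves only 8 bytes, which is consistent with the timings where reductions along the contiguous axis are the fast ones.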
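Anyway, if you're willing to keep randvals as a Fortran-ordered array from the start, argmin along axis 0 will be about as fast as min (329 ms above). Copying into Fortran order on the fly doesn't help, though, since the conversion itself takes about as long as the slow argmin (1.97 s above).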
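One way to get a Fortran-ordered array from the start, without a separate copy, is to generate the data with the dimensions swapped and take the transpose, since the transpose of a C-contiguous array is an F-contiguous view. This is only a sketch under the assumption that you control how the array is created; `rows` and `cols` are hypothetical names, and the sizes are kept small so it runs quickly.

```python
import numpy as np

rows, cols = 300, 1600  # hypothetical sizes, much smaller than in the question

# Generate with swapped dimensions, then transpose: the result has shape
# (rows, cols) and is Fortran-contiguous, with no extra copy made.
randvals = np.random.rand(cols, rows).T

# argmin along axis 0 now walks memory that is contiguous per column
minindex = randvals.argmin(axis=0)

# Sanity check: the values at those indices are the column minima
colmins = randvals[minindex, np.arange(cols)]
print(np.array_equal(colmins, randvals.min(axis=0)))
```

If the data arrives from elsewhere already in C order, the one-time cost of `np.asfortranarray` only pays off when the converted array is reused for several such reductions, so it is worth timing both variants on your own machine.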