I have a numpy array containing 10^8 floats and want to count how many of them are >= a given threshold. Speed is crucial because the operation has to be done on large numbers of such arrays. The contestants so far are <pre class="prettyprint"><code>np.sum(myarray >= thresh) np.size(np.where(np.reshape(myarray,-1) >= thresh)) </code></pre> The answers at Count all values in a matrix greater than a value suggest that np.where() would be faster, but I've found inconsistent timing results. What I mean by this is for some realizations and Boolean conditions np.size(np.where(cond)) is faster than np.sum(cond), but for some it is slower. Specifically, if a large fraction of entries fulfil the condition then np.sum(cond) is significantly faster but if a small fraction (maybe less than a tenth) do then np.size(np.where(cond)) wins. The question breaks down into 2 parts: <ul> <li>Any other suggestions?</li> <li>Does it make sense that the time taken by np.size(np.where(cond)) increases with the number of entries for which cond is true? </li> </ul>

Using cython might be a decent alternative. <pre class="prettyprint"><code>import numpy as np cimport numpy as np cimport cython from cython.parallel import prange DTYPE_f64 = np.float64 ctypedef np.float64_t DTYPE_f64_t @cython.boundscheck(False) @cython.wraparound(False) @cython.nonecheck(False) cdef int count_above_cython(DTYPE_f64_t [:] arr_view, DTYPE_f64_t thresh) nogil: cdef int length, i, total total = 0 length = arr_view.shape[0] for i in prange(length): if arr_view[i] >= thresh: total += 1 return total @cython.boundscheck(False) @cython.wraparound(False) @cython.nonecheck(False) def count_above(np.ndarray arr, DTYPE_f64_t thresh): cdef DTYPE_f64_t [:] arr_view = arr.ravel() cdef int total with nogil: total = count_above_cython(arr_view, thresh) return total </code></pre> Timing of different proposed methods. <pre class="prettyprint"><code>myarr = np.random.random((1000,1000)) thresh = 0.33 In [6]: %timeit count_above(myarr, thresh) 1000 loops, best of 3: 693 µs per loop In [9]: %timeit np.count_nonzero(myarr >= thresh) 100 loops, best of 3: 4.45 ms per loop In [11]: %timeit np.sum(myarr >= thresh) 100 loops, best of 3: 4.86 ms per loop In [12]: %timeit np.size(np.where(np.reshape(myarr,-1) >= thresh)) 10 loops, best of 3: 61.6 ms per loop </code></pre> With a larger array: <pre class="prettyprint"><code>In [13]: myarr = np.random.random(10**8) In [14]: %timeit count_above(myarr, thresh) 10 loops, best of 3: 63.4 ms per loop In [15]: %timeit np.count_nonzero(myarr >= thresh) 1 loops, best of 3: 473 ms per loop In [16]: %timeit np.sum(myarr >= thresh) 1 loops, best of 3: 511 ms per loop In [17]: %timeit np.size(np.where(np.reshape(myarr,-1) >= thresh)) 1 loops, best of 3: 6.07 s per loop </code></pre>

Fastest way to count array values above a threshold in numpy

Tags:

performance

python

arrays

boolean

numpy

I have a numpy array containing 10^8 floats and want to count how many of them are >= a given threshold. Speed is crucial because the operation has to be done on large numbers of such arrays. The contestants so far are

np.sum(myarray >= thresh)

np.size(np.where(np.reshape(myarray,-1) >= thresh))

The answers at Count all values in a matrix greater than a value suggest that np.where() would be faster, but I've found inconsistent timing results. What I mean by this is for some realizations and Boolean conditions np.size(np.where(cond)) is faster than np.sum(cond), but for some it is slower.

Specifically, if a large fraction of entries fulfil the condition then np.sum(cond) is significantly faster but if a small fraction (maybe less than a tenth) do then np.size(np.where(cond)) wins.

The question breaks down into 2 parts:

Any other suggestions?
Does it make sense that the time taken by np.size(np.where(cond)) increases with the number of entries for which cond is true?

998

asked Jan 26 '14 18:01

user1991911

1 Answers

Using cython might be a decent alternative.

import numpy as np
cimport numpy as np
cimport cython
from cython.parallel import prange


DTYPE_f64 = np.float64
ctypedef np.float64_t DTYPE_f64_t


@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
cdef int count_above_cython(DTYPE_f64_t [:] arr_view, DTYPE_f64_t thresh) nogil:

    cdef int length, i, total
    total = 0
    length = arr_view.shape[0]

    for i in prange(length):
        if arr_view[i] >= thresh:
            total += 1

    return total


@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
def count_above(np.ndarray arr, DTYPE_f64_t thresh):

    cdef DTYPE_f64_t [:] arr_view = arr.ravel()
    cdef int total

    with nogil:
       total =  count_above_cython(arr_view, thresh)
    return total

Timing of different proposed methods.

myarr = np.random.random((1000,1000))
thresh = 0.33

In [6]: %timeit count_above(myarr, thresh)
1000 loops, best of 3: 693 µs per loop

In [9]: %timeit np.count_nonzero(myarr >= thresh)
100 loops, best of 3: 4.45 ms per loop

In [11]: %timeit np.sum(myarr >= thresh)
100 loops, best of 3: 4.86 ms per loop

In [12]: %timeit np.size(np.where(np.reshape(myarr,-1) >= thresh))
10 loops, best of 3: 61.6 ms per loop

With a larger array:

In [13]: myarr = np.random.random(10**8)

In [14]: %timeit count_above(myarr, thresh)
10 loops, best of 3: 63.4 ms per loop

In [15]: %timeit np.count_nonzero(myarr >= thresh)
1 loops, best of 3: 473 ms per loop

In [16]: %timeit np.sum(myarr >= thresh)
1 loops, best of 3: 511 ms per loop

In [17]: %timeit np.size(np.where(np.reshape(myarr,-1) >= thresh))
1 loops, best of 3: 6.07 s per loop

187

answered Nov 02 '22 23:11

M4rtini

Related questions
                            
                                Need to convert to lambda function
                            
                                Type hinting class not yet imported
                            
                                How to use peewee limit()?
                            
                                lack of speedup and erroneous results with OpenMP and Cython
                            
                                Correlation matrix in NumPy with NaN's
                            
                                How to validate a model in django rest framework?
                            
                                What is the use of the path /usr/share/pyshared in python?
                            
                                Python module ecdsa errors while running paramiko
                            
                                tweepy (python): rate limit exceeded code 88
                            
                                OpenCV Python Linker Error
                            
                                How to pause and resume consumption gracefully in rabbitmq, pika python
                            
                                Looking for a quick way to speed up my code
                            
                                Python Pandas qcut behavior with # of observations not divisible by # of bins
                            
                                wrap long strings sublimetext python
                            
                                classmethod as constructor and inheritance
                            
                                subprocess kills child processes but not the processes the child spawns
                            
                                Launch Openstack Instances using python-boto
                            
                                What is the best way to capture output from a process using python?
                            
                                Python PySerial with Auto RTS through Half-Duplex RS-485 breakout board using Beaglebone Black Angstrom
                            
                                How to pass items to loop using Jinja2/Flask?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With