What's the fastest way to threshold a numpy array?

I want the resulting array as a binary yes/no.

I came up with

    img = PIL.Image.open(filename)

    array = numpy.array(img)
    thresholded_array = numpy.copy(array)

    brightest = numpy.amax(array)
    threshold = brightest/2

    for b in xrange(490):
        for c in xrange(490):
            if array[b][c] > threshold:
                thresholded_array[b][c] = 255
            else:
                thresholded_array[b][c] = 0

    out=PIL.Image.fromarray(thresholded_array)

but iterating over the array one value at a time is very slow, and I know there must be a faster way. What's the fastest?

Instead of looping, you can compare the entire array at once in several ways. Starting from

    >>> import numpy as np
    >>> arr = np.random.randint(0, 255, (3,3))
    >>> brightest = arr.max()
    >>> threshold = brightest // 2
    >>> arr
    array([[214, 151, 216],
           [206,  10, 162],
           [176,  99, 229]])
    >>> brightest
    229
    >>> threshold
    114

Method #1: use np.where:

    >>> np.where(arr > threshold, 255, 0)
    array([[255, 255, 255],
           [255,   0, 255],
           [255,   0, 255]])

Method #2: use boolean indexing to create a new array:

    >>> up = arr > threshold
    >>> new_arr = np.zeros_like(arr)
    >>> new_arr[up] = 255

Method #3: do the same, but use an arithmetic hack:

    >>> (arr > threshold) * 255
    array([[255, 255, 255],
           [255,   0, 255],
           [255,   0, 255]])

which works because False == 0 and True == 1.
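
As a quick sanity check, here is a minimal sketch that runs all three methods on the same array and confirms they agree. The seeded random array is only a stand-in for real image data, and the final `astype(np.uint8)` is an assumption about what a later `PIL.Image.fromarray` call would want: `np.where` and the arithmetic version usually come back with a wider integer dtype than the input, while the boolean-indexing version keeps the input's dtype.

```python
import numpy as np

# Seeded random data standing in for a grayscale image (an assumption for this sketch).
rng = np.random.default_rng(0)
arr = rng.integers(0, 256, size=(4, 5), dtype=np.uint8)
threshold = arr.max() // 2

# One boolean mask shared by all three methods.
mask = arr > threshold

# Method #1: np.where picks 255 where the mask is True, 0 elsewhere.
via_where = np.where(mask, 255, 0)

# Method #2: start from zeros and assign through the boolean mask.
via_indexing = np.zeros_like(arr)
via_indexing[mask] = 255

# Method #3: True/False behave as 1/0 under multiplication.
via_arithmetic = mask * 255

# All three produce the same values, though not necessarily the same dtype.
assert np.array_equal(via_where, via_indexing)
assert np.array_equal(via_where, via_arithmetic)
print(via_where.dtype, via_indexing.dtype, via_arithmetic.dtype)

# Cast back to 8-bit before handing the result to an image library.
binary = via_where.astype(np.uint8)
```

If the output is going straight back into an image, the boolean-indexing version has the small advantage of already being `uint8` when the input is.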

For a 1000x1000 array, it looks like the arithmetic hack is fastest for me, but to be honest I'd use np.where because I think it's clearest:

    >>> %timeit np.where(arr > threshold, 255, 0)
    100 loops, best of 3: 12.3 ms per loop
    >>> %timeit up = arr > threshold; new_arr = np.zeros_like(arr); new_arr[up] = 255;
    100 loops, best of 3: 14.2 ms per loop
    >>> %timeit (arr > threshold) * 255
    100 loops, best of 3: 6.05 ms per loop
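
`%timeit` only exists inside IPython. To repeat the comparison from a plain Python script, something along these lines should work with the standard-library `timeit` module. This is a rough sketch: the function names, the seeded 1000x1000 array, and the loop counts are all choices made for illustration, and the numbers will differ by machine and NumPy version, so the ranking above may not hold everywhere.

```python
import timeit

import numpy as np

# Seeded 1000x1000 test array (an assumption; any integer array would do).
rng = np.random.default_rng(0)
arr = rng.integers(0, 256, size=(1000, 1000))
threshold = arr.max() // 2


def with_where():
    return np.where(arr > threshold, 255, 0)


def with_indexing():
    new_arr = np.zeros_like(arr)
    new_arr[arr > threshold] = 255
    return new_arr


def with_arithmetic():
    return (arr > threshold) * 255


loops = 10
timings = {}
for func in (with_where, with_indexing, with_arithmetic):
    # Best of 3 runs, each averaging over `loops` calls.
    best = min(timeit.repeat(func, number=loops, repeat=3)) / loops
    timings[func.__name__] = best
    print(f"{func.__name__}: {best * 1e3:.2f} ms per loop")
```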

Tags:

python

arrays

numpy

El Confuso

1 Answer

DSM
