
How to make Numpy.where() return only the first match?

Tags: python, numpy

I'm trying to optimize a script that is full of calls to NumPy's where(), after which only the first returned element is actually used. Example:

F = np.where(Y>p/100)[0]

For the huge data sets that we are processing, creating a large array and then discarding all but the first element doesn't look like a good solution, in terms of either speed or memory consumption. Is there a way to avoid this overhead, maybe by tweaking the condition?

asked Dec 22 '25 13:12 by chris_cm



2 Answers

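For a 1-D array, you can use argmax when you only want the first match. On a boolean array it returns the index of the first True value. It also returns 0 when nothing matches, so check the element at that index before using it: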

idx = np.argmax(Y > p/100)
if Y[idx] > p/100:
    F = idx
else:
    F = None
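As a minimal sketch, the same check can be wrapped in a small helper. The name first_index_above and the empty-array guard are my own assumptions, not part of NumPy or of the answer above, and the helper only targets 1-D input:

```python
import numpy as np


def first_index_above(values, threshold):
    """Return the index of the first element > threshold, or None.

    Hypothetical helper for 1-D input; not part of NumPy.
    """
    values = np.asarray(values)
    if values.size == 0:
        # argmax raises ValueError on an empty array, so handle it here.
        return None
    mask = values > threshold
    idx = int(mask.argmax())  # index of the first True, or 0 if none is True
    # mask[idx] is False only when no element matched at all.
    return idx if mask[idx] else None


Y = np.array([0.1, 0.4, 0.7, 0.9, 0.2])
print(first_index_above(Y, 0.5))   # 2
print(first_index_above(Y, 0.95))  # None
```

Note that this still evaluates the comparison over the whole array; it only avoids building the index arrays that where() returns.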
answered Dec 24 '25 02:12 by Exprator



First, note that np.where(Y>p/100)[0] does not return the index/coordinates of the first match. It returns the indices along the first axis of all matches. For the coordinates of the first match, you would need next(zip(*np.where(Y>p/100))).
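To make the difference concrete, here is a small sketch on a made-up 2-D array (the data and the threshold 5 are just for illustration). The None default passed to next() is my addition, so that the no-match case doesn't raise StopIteration:

```python
import numpy as np

Y = np.array([[1, 9, 2],
              [8, 3, 7]])
cond = Y > 5

# [0] selects the first array of the tuple returned by where():
# the row index of every match, not the first match.
rows = np.where(cond)[0]
print(rows.tolist())  # [0, 1, 1]

# zip pairs up the per-axis index arrays; next() takes the first pair.
first = next(zip(*np.where(cond)), None)
first_match = None if first is None else tuple(int(i) for i in first)
print(first_match)  # (0, 1)
```

This still builds the full index arrays, so it fixes the semantics but not the overhead the question is about.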

Assuming you really want only the first match, I don't think there is a way to stop checking the values after the first match, but you could avoid the tuple output with a vectorized operation and argmax (plus any if you're not sure there is a match):

m = Y > p/100
F = m.argmax() if m.any() else None

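If Y is an N-D array, argmax returns an index into the flattened array, so you will then need unravel_index to get the coordinates in the original number of dimensions (keeping the any check if a match is not guaranteed):

F = np.unravel_index(m.argmax(), Y.shape) if m.any() else None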
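Putting the pieces together for the N-D case, a rough sketch could look like the following. The helper name first_match_coords is hypothetical, and "first" here means first in row-major (C) order, which is how argmax walks the flattened array:

```python
import numpy as np


def first_match_coords(values, threshold):
    """Return coordinates of the first element > threshold, or None.

    Hypothetical helper; "first" means first in row-major (C) order.
    """
    values = np.asarray(values)
    mask = values > threshold
    if not mask.any():
        # Without this guard, argmax would return 0 and the result
        # would wrongly point at the very first element.
        return None
    flat_idx = mask.argmax()  # index into the flattened mask
    coords = np.unravel_index(flat_idx, mask.shape)
    return tuple(int(c) for c in coords)


Y = np.arange(24).reshape(2, 3, 4)
print(first_match_coords(Y, 13))   # (1, 0, 2)
print(first_match_coords(Y, 100))  # None
```

The coordinates come back as plain Python ints here only to keep the printed output readable; dropping that conversion works just as well for indexing.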
answered Dec 24 '25 04:12 by mozway



