I have this array:
arr = np.array([3, 7, 4])
And these boolean indices:
cond = np.array([False, True, True])
I want to find the index of the maximum value in the array where the boolean condition is true. So I do:
np.ma.array(arr, mask=~cond).argmax()
Which works and returns 1. But if I had an array of boolean indices:
cond = np.array([[False, True, True], [True, False, True]])
Is there a vectorized/numpy way of iterating through the array of boolean indices to return [1, 2]?
For your special use case of argmax, you may use np.where and set the masked values to something that can never be the maximum (effectively negative infinity; since the array is integer-typed, a very large negative integer sentinel is used):
>>> inf = np.iinfo('i8').max
>>> np.where(cond, arr, -inf).argmax(axis=1)
array([1, 2])
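If arr were a floating-point array, the integer sentinel could be skipped and -np.inf used directly. A minimal sketch along the same lines (the float version of the example data is my assumption):

import numpy as np

arr = np.array([3.0, 7.0, 4.0])        # float version of the example array
cond = np.array([[False, True, True],
                 [True, False, True]])

# -np.inf can never be the maximum, so the False positions are ignored
idx = np.where(cond, arr, -np.inf).argmax(axis=1)
print(idx)                             # [1 2]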
Alternatively, you can manually broadcast using np.tile:
>>> np.ma.array(np.tile(arr, 2).reshape(2, 3), mask=~cond).argmax(axis=1)
array([1, 2])
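A small variation of my own, not from the answer: np.broadcast_to produces the same repeated data as a read-only view instead of a copy. This assumes np.ma.array accepts such a view, which it does for a reduction like argmax since nothing is written back:

import numpy as np

arr = np.array([3, 7, 4])
cond = np.array([[False, True, True],
                 [True, False, True]])

# Broadcast arr to cond's shape without copying, mask the False positions,
# then take the row-wise argmax.
view = np.broadcast_to(arr, cond.shape)
idx = np.ma.array(view, mask=~cond).argmax(axis=1)
print(idx)  # [1 2]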
So you want a vectorized version of:
In [302]: [np.ma.array(arr,mask=~c).argmax() for c in cond]
Out[302]: [1, 2]
What are the realistic dimensions of cond? If the number of rows is small compared to the number of columns (the length of arr), an iteration like this is probably not expensive.
The use of tile in https://stackoverflow.com/a/31767220/901925 looks good. Here I change it slightly:
In [308]: np.ma.array(np.tile(arr,(cond.shape[0],1)),mask=~cond).argmax(axis=1)
Out[308]: array([1, 2], dtype=int32)
As expected, the list comprehension times scale with the number of rows of cond, while the tiling approach is only a bit slower than the single-row case. But at around 92.7 µs, this masked-array approach is much slower than a plain arr.argmax(). Masking adds a lot of overhead.
The where version is quite a bit faster:
np.where(cond, arr, -100).argmax(1) # 20 µs
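The hard-coded -100 only works because it happens to be below every value in arr; a sentinel taken from the dtype itself is a safer variant (my adjustment, not part of the timed answer):

import numpy as np

arr = np.array([3, 7, 4])
cond = np.array([[False, True, True],
                 [True, False, True]])

# Use the smallest representable value of arr's integer dtype as the fill,
# so no real element can ever lose to a masked slot.
sentinel = np.iinfo(arr.dtype).min
idx = np.where(cond, arr, sentinel).argmax(axis=1)
print(idx)  # [1 2]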
A deleted answer suggested
(arr*cond).argmax(1) # 8 µs
which is even faster. As proposed, it doesn't work when arr contains negative values, but it can probably be adjusted to handle those.
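One way it could be adjusted, a sketch of my own: shift arr so every value is positive before multiplying, which keeps the zeros produced at masked positions strictly smaller than any real candidate:

import numpy as np

arr = np.array([-3, -7, -4])                 # an all-negative case where arr*cond fails
cond = np.array([[False, True, True],
                 [True, False, True]])

# Shift so the smallest element becomes 1; masked positions stay 0 and never win.
shifted = arr - arr.min() + 1                # [5, 1, 4]
idx = (shifted * cond).argmax(axis=1)
print(idx)                                   # [2 0]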