Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficiently getting numpy.argmin and numpy.amin at the same time

Is it possible to get the result of both numpy.argmin and numpy.amin with a single call to numpy? Thanks.

like image 714
rxu Avatar asked Jan 07 '23 01:01

rxu


2 Answers

You can use np.argmin to get the indices corresponding to the minimum values and then use those with NumPy's advanced indexing to get the minimum values.

Let's take a 2D array to showcase the usage. Let A be a 2D array and let's say we are interested in finding the minimum indices and values along axis=1.

Thus, we can do -

min_idx = np.argmin(A,axis=1)
min_val = A[np.arange(A.shape[0]),min_idx]

Let's take an actual 2D array for a sample run and verify the results -

In [16]: A
Out[16]: 
array([[79, 97, 12, 54, 59],
       [44, 45, 42, 78, 32],
       [32, 41, 67, 60,  4],
       [24,  4, 85, 94, 65]])

In [17]: min_idx = np.argmin(A,axis=1)

In [18]: A[np.arange(A.shape[0]),min_idx] # Using min_idx & indexing
Out[18]: array([12, 32,  4,  4])

In [19]: np.amin(A,axis=1)                # Using np.amin to verify
Out[19]: array([12, 32,  4,  4])

Runtime test -

In [26]: def original_app(A):
    ...:     min_idx = np.argmin(A,axis=1)
    ...:     min_val = np.amin(A,axis=1)
    ...:     return min_idx, min_val
    ...: 
    ...: def proposed_app(A):
    ...:     min_idx = np.argmin(A,axis=1)
    ...:     min_val = A[np.arange(A.shape[0]),min_idx]
    ...:     return min_idx, min_val
    ...: 

In [27]: A = np.random.randint(0,99,(4000,5000))

In [28]: %timeit original_app(A)
10 loops, best of 3: 70.9 ms per loop

In [29]: %timeit proposed_app(A)
10 loops, best of 3: 33.1 ms per loop

Let's dissect the timings a bit more -

In [31]: min_idx = np.argmin(A,axis=1)

In [32]: %timeit np.argmin(A,axis=1)              # Used in both methods
10 loops, best of 3: 34.5 ms per loop

In [33]: %timeit np.amin(A,axis=1)                # Original approach
10 loops, best of 3: 37.3 ms per loop

In [34]: %timeit A[np.arange(A.shape[0]),min_idx] # Proposed approach
10000 loops, best of 3: 56 µs per loop

As we can see a major gain with the advanced indexing at the last step with negligible runtime spent on it. This allows almost 100% runtime-shaving off with it!

like image 174
Divakar Avatar answered Jan 08 '23 15:01

Divakar


Updated version

A very clean and simple approach is to use take_along_axis.

Take:

In [1]: A = np.array([[[5, 1, 2, 3]], [[3, 5, 1, 7]]])

and apply:

In [2]: min_idx = np.argmin(A, axis=2)
In [3]: min_val = np.take_along_axis(A, min_idx[:,:,None], axis=2)[:,:,0]

That's it! min_val is equal to np.amin(A, axis=2).


Old version

If you apply the solution from Divakar to more than 2 dimensions, the provided solution will not work in any case. After fighting with this problem for a long time, I found a solution which seems to work (I tested it for some cases). If you find any problem with this solution, I would be happy for any comment.

Problem

So first the problem (which is similar to the problem I had). Try for example:

In [1]: A = np.array([[[5, 1, 2, 3]], [[3, 5, 1, 7]]])

The second dimension is in this case just 1 (A has a shape of (2,1,4)). Then

In [2]: np.amin(A, axis=2)

will result in

Out[2]: array([[1],
               [1]])

which is as expected.

While on the other side, if you apply

In [3]: min_idx = np.argmin(A, axis=2)
In [4]: A[np.arange(A.shape[0]), np.arange(A.shape[1]), min_idx]

you get

Out[4]: array([[1, 5],
               [2, 1]])

which is not equal to the amin solution from In [2].

Old Solution

What is possible, is to squeeze the min_idx (from In [3]) before you use it as index and to reshape it afterwards again:

In [5]: sq_idx = np.squeeze(min_idx)
In [6]: min_pre = A[np.arange(A.shape[0]), np.arange(A.shape[1]), sq_idx]
In [7]: np.reshape(min_pre, (2,1))

which finally results in:

Out[7]: array([[1],
               [1]])

I would suggest two more steps to optimize the solution a little bit.

(1)

In order to simplify the indexing of the array:

shp = [np.arange(i) for i in A.shape]

which simplifies In [6] with:

min_pre = A[shp[0], shp[1], sq_idx]

(2)

In order to generalize the reshape, In [7] can be replaced by

np.reshape(min_pre, tuple(np.asarray(A.shape)[:-1]))

I hope it helps someone :)

like image 30
sagacity Avatar answered Jan 08 '23 15:01

sagacity