Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Numpy: argmax over multiple axes without loop

I have a N-dimensional array (Named A). For each row of the first axis of A, I want to obtain the coordinates of the maximum value along the other axes of A. Then I would return a 2-dimensional array with the coordinates of the maximum value for each row of the first axis of A.

I already solved my problem using a loop, but I was wondering whether there is a more efficient way of doing this. My current solution (for an example array A) is as follows:

import numpy as np

A=np.reshape(np.concatenate((np.arange(0,12),np.arange(0,-4,-1))),(4,2,2))
maxpos=np.empty(shape=(4,2))
for n in range(0, 4):
    maxpos[n,:]=np.unravel_index(np.argmax(A[n,:,:]), A[n,:,:].shape)

Here, we would have:

A: 
[[[ 0  1]
  [ 2  3]]

 [[ 4  5]
  [ 6  7]]

 [[ 8  9]
  [10 11]]

 [[ 0 -1]
  [-2 -3]]]

maxpos:
[[ 1.  1.]
 [ 1.  1.]
 [ 1.  1.]
 [ 0.  0.]]

If there are multiple maximizers, I don't mind which is chosen.

I have tried to use np.apply_over_axes, but I haven't managed to make it return the outcome I want.

like image 833
CarlosH Avatar asked Jun 02 '15 06:06

CarlosH


People also ask

What is argmax in NumPy?

If you use it, np.argmax will retrieve the index values for the maxima along particular axes. But if you don’t use it, then argmax will flatten out the array and retrieve the index of the maxima of the flattened array. To be honest, how axes work is little difficult to understand without examples.

How to get the Max of an array in NumPy?

numpy.argmax(array, axis = None, out = None) : Returns indices of the max element of the array in a particular axis. Parameters : array : Input array to work on axis : [int, optional]Along a specified axis like 0 or 1 out : [array optional]Provides a feature to insert output to the out array and it should be of appropriate shape and dtype.

How do you apply a function over multiple axes in NumPy?

numpy.apply_over_axes(func, a, axes) [source] ¶ Apply a function repeatedly over multiple axes. func is called as res = func (a, axis), where axis is the first element of axes. The result res of the function call must have either the same dimensions as a or one less dimension.

What is the difference between NumPy apply_over_axes () and 1d_func ()?

The numpy.apply_over_axes () applies a function repeatedly over multiple axes in an array. 1d_func : the required function to perform over 1D array.


2 Answers

You can use a list comprehension

result = [np.unravel_index(np.argmax(r), r.shape) for r in a]

it's IMO more readable but the speed is going to be not much better than an explicit loop.

The fact that the main outer loop is in Python should matter only if the first dimension is actually the very big one.

If this is the case (i.e. you have ten millions of 2x2 matrices) then flipping is faster...

# true if 0,0 is not smaller than others
m00 = ((data[:,0,0] >= data[:,1,0]) &
       (data[:,0,0] >= data[:,0,1]) &
       (data[:,0,0] >= data[:,1,1]))

# true if 0,1 is not smaller than others
m01 = ((data[:,0,1] >= data[:,1,0]) &
       (data[:,0,1] >= data[:,0,0]) &
       (data[:,0,1] >= data[:,1,1]))

# true if 1,0 is not smaller than others
m10 = ((data[:,1,0] >= data[:,0,0]) &
       (data[:,1,0] >= data[:,0,1]) &
       (data[:,1,0] >= data[:,1,1]))

# true if 1,1 is not smaller than others
m11 = ((data[:,1,1] >= data[:,1,0]) &
       (data[:,1,1] >= data[:,0,1]) &
       (data[:,1,1] >= data[:,0,0]))

# choose which is max on equality
m01 &= ~m00
m10 &= ~(m00|m01)
m11 &= ~(m00|m01|m10)

# compute result
result = np.zeros((len(data), 2), np.int32)
result[:,1] |= m01|m11
result[:,0] |= m10|m11

On my machine the code above is about 50 times faster (for one million of 2x2 matrices).

like image 188
6502 Avatar answered Sep 17 '22 21:09

6502


You could do something like this -

# Reshape input array to a 2D array with rows being kept as with original array.
# Then, get idnices of max values along the columns.
max_idx = A.reshape(A.shape[0],-1).argmax(1)

# Get unravel indices corresponding to original shape of A
maxpos_vect = np.column_stack(np.unravel_index(max_idx, A[0,:,:].shape))

Sample run -

In [214]: # Input array
     ...: A = np.random.rand(5,4,3,7,8)

In [215]: # Setup output array and use original loopy code
     ...: maxpos=np.empty(shape=(5,4)) # 4 because ndims in A is 5
     ...: for n in range(0, 5):
     ...:     maxpos[n,:]=np.unravel_index(np.argmax(A[n,:,:,:,:]), A[n,:,:,:,:].shape)
     ...:     

In [216]: # Proposed approach
     ...: max_idx = A.reshape(A.shape[0],-1).argmax(1)
     ...: maxpos_vect = np.column_stack(np.unravel_index(max_idx, A[0,:,:].shape))
     ...: 

In [219]: # Verify results
     ...: np.array_equal(maxpos.astype(int),maxpos_vect)
Out[219]: True

Generalize to n-dim array

We could generalize to solve for n-dim array to get argmax for last N axes combined with something like this -

def argmax_lastNaxes(A, N):
    s = A.shape
    new_shp = s[:-N] + (np.prod(s[-N:]),)
    max_idx = A.reshape(new_shp).argmax(-1)
    return np.unravel_index(max_idx, s[-N:])

The result would a tuple of arrays of indices. If you need the final output as an array, we can use np.stack or np.concatenate.

like image 40
Divakar Avatar answered Sep 19 '22 21:09

Divakar