
Sort invariant for numpy.argsort with multiple dimensions

The numpy.argsort docs state:

index_array : ndarray, int
    Array of indices that sort a along the specified axis. If a is one-dimensional, a[index_array] yields a sorted a.

How can I apply the result of numpy.argsort for a multidimensional array to get back a sorted array? (NOT just a 1-D or 2-D array; it could be an N-dimensional array where N is known only at runtime)

>>> import numpy as np
>>> np.random.seed(123)
>>> A = np.random.randn(3,2)
>>> A
array([[-1.0856306 ,  0.99734545],
       [ 0.2829785 , -1.50629471],
       [-0.57860025,  1.65143654]])
>>> i=np.argsort(A,axis=-1)
>>> A[i]
array([[[-1.0856306 ,  0.99734545],
        [ 0.2829785 , -1.50629471]],

       [[ 0.2829785 , -1.50629471],
        [-1.0856306 ,  0.99734545]],

       [[-1.0856306 ,  0.99734545],
        [ 0.2829785 , -1.50629471]]])

For me it's not just a matter of using sort() instead; I have another array B and I want to order B using the results of np.argsort(A) along the appropriate axis. Consider the following example:

>>> A = np.array([[3,2,1],[4,0,6]])
>>> B = np.array([[3,1,4],[1,5,9]])
>>> i = np.argsort(A,axis=-1)
>>> BsortA = ???             
# should result in [[4,1,3],[5,1,9]]
# so that corresponding elements of B and sort(A) stay together

It looks like this functionality is already an enhancement request in numpy.

Jason S asked Oct 31 '17 21:10

2 Answers

The numpy issue #8708 has a sample implementation of take_along_axis that does what I need; I'm not sure if it's efficient for large arrays but it seems to work.

import numpy as np

def take_along_axis(arr, ind, axis):
    """
    ... here means a "pack" of dimensions, possibly empty

    arr: array_like of shape (A..., M, B...)
        source array
    ind: array_like of shape (A..., K..., B...)
        indices to take along each 1d slice of `arr`
    axis: int
        index of the axis with dimension M

    out: array_like of shape (A..., K..., B...)
        out[a..., k..., b...] = arr[a..., ind[a..., k..., b...], b...]
    """
    # normalize a negative axis, rejecting values outside [-ndim, ndim)
    if axis < 0:
        if axis >= -arr.ndim:
            axis += arr.ndim
        else:
            raise IndexError('axis out of range')
    ind_shape = (1,) * ind.ndim
    ins_ndim = ind.ndim - (arr.ndim - 1)   # inserted dimensions

    dest_dims = list(range(axis)) + [None] + list(range(axis+ins_ndim, ind.ndim))

    # could also call np.ix_ here with some dummy arguments, then throw those results away
    inds = []
    for dim, n in zip(dest_dims, arr.shape):
        if dim is None:
            # the sorted axis is indexed by `ind` itself
            inds.append(ind)
        else:
            # every other axis gets an arange that broadcasts against `ind`
            ind_shape_dim = ind_shape[:dim] + (-1,) + ind_shape[dim+1:]
            inds.append(np.arange(n).reshape(ind_shape_dim))

    return arr[tuple(inds)]

which yields

>>> A = np.array([[3,2,1],[4,0,6]])
>>> B = np.array([[3,1,4],[1,5,9]])
>>> i = A.argsort(axis=-1)
>>> take_along_axis(A,i,axis=-1)
array([[1, 2, 3],
       [0, 4, 6]])
>>> take_along_axis(B,i,axis=-1)
array([[4, 1, 3],
       [5, 1, 9]])
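
For readers on NumPy 1.15 or later: the enhancement request was eventually implemented as the built-in np.take_along_axis, so the hand-written helper should not be needed there. A minimal sketch using the arrays from the question (the names A_sorted and BsortA are just illustrative):

```python
import numpy as np

A = np.array([[3, 2, 1], [4, 0, 6]])
B = np.array([[3, 1, 4], [1, 5, 9]])

# Indices that sort each row of A.
i = np.argsort(A, axis=-1)

# Apply the same per-row ordering to A and to B.
A_sorted = np.take_along_axis(A, i, axis=-1)
BsortA = np.take_along_axis(B, i, axis=-1)

print(A_sorted)  # [[1 2 3], [0 4 6]]
print(BsortA)    # [[4 1 3], [5 1 9]]
```

Because the axis is an ordinary argument, this works for an N-dimensional array where N is known only at runtime.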
Jason S answered Nov 05 '22 07:11

This argsort produces a (3,2) array of indices:

In [453]: idx=np.argsort(A,axis=-1)
In [454]: idx
array([[0, 1],
       [1, 0],
       [0, 1]], dtype=int32)

As you note, applying this to A to get the equivalent of np.sort(A, axis=-1) isn't obvious. The iterative solution is to sort each row (a 1-D case) with:

In [459]: np.array([x[i] for i,x in zip(idx,A)])
array([[-1.0856306 ,  0.99734545],
       [-1.50629471,  0.2829785 ],
       [-0.57860025,  1.65143654]])

While probably not the fastest, it is probably the clearest solution, and a good starting point for conceptualizing a better solution.
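
The same row-by-row idea can be stretched to N dimensions by looping over every index of the leading axes. This is only a sketch: apply_argsort_last_axis is a hypothetical helper name, it handles axis=-1 only, and it will be slow for large arrays.

```python
import numpy as np

def apply_argsort_last_axis(a, idx):
    """Reorder each 1-D slice a[pos] by idx[pos] (hypothetical helper, axis=-1 only)."""
    out = np.empty_like(a)
    # np.ndindex walks every index combination of the leading axes.
    for pos in np.ndindex(*a.shape[:-1]):
        out[pos] = a[pos][idx[pos]]
    return out

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 3, 4))
order = np.argsort(X, axis=-1)
X_sorted = apply_argsort_last_axis(X, order)
print(np.array_equal(X_sorted, np.sort(X, axis=-1)))
```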

The tuple(inds) from the take solution is:

(array([[0],
        [1],
        [2]]),
 array([[0, 1],
        [1, 0],
        [0, 1]], dtype=int32))
In [470]: A[_]
array([[-1.0856306 ,  0.99734545],
       [-1.50629471,  0.2829785 ],
       [-0.57860025,  1.65143654]])

In other words:

In [472]: A[np.arange(3)[:,None], idx]
array([[-1.0856306 ,  0.99734545],
       [-1.50629471,  0.2829785 ],
       [-0.57860025,  1.65143654]])

The first index is what np.ix_ would construct, but np.ix_ accepts only 1-D sequences, so it rejects the 2-D idx.
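
A small sketch of that limitation, reusing the seeded A from the question. It shows np.ix_ raising on a 2-D argument, and then the "dummy argument" workaround of calling it with 1-D inputs and keeping only the first piece (the names rows and ix_error are illustrative):

```python
import numpy as np

np.random.seed(123)
A = np.random.randn(3, 2)
idx = np.argsort(A, axis=-1)

# np.ix_ only accepts 1-D sequences, so passing the 2-D idx fails.
try:
    np.ix_(np.arange(3), idx)
    ix_error = None
except ValueError as exc:
    ix_error = exc
print(type(ix_error).__name__)

# Call np.ix_ with 1-D inputs and keep only the row index, shape (3, 1).
rows = np.ix_(np.arange(3), np.arange(2))[0]
result = A[rows, idx]
print(np.array_equal(result, np.sort(A, axis=-1)))
```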

Looks like I explored this topic a couple of years ago, in "argsort for a multidimensional ndarray":

a[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]

I tried to explain what is going on there. The take_along_axis function does the same sort of thing, but constructs the indexing tuple for the more general case (any number of dimensions and any axis). Generalizing to more dimensions, but still with axis=-1, should be easy.
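
One way that generalization might look, as a sketch rather than a tested recipe: build a broadcastable arange for every leading axis with np.ix_, drop the last one, and put the argsort result in its place. The helper name take_last_axis is hypothetical, and it assumes idx has the same shape as the array.

```python
import numpy as np

def take_last_axis(a, idx):
    """Index a with idx along the last axis (hypothetical helper, axis=-1 only)."""
    # np.ix_ returns one broadcastable arange per axis; drop the last one
    # and use idx for that axis instead.
    lead = np.ix_(*[np.arange(n) for n in a.shape])[:-1]
    return a[lead + (idx,)]

rng = np.random.default_rng(1)
X = rng.standard_normal((2, 3, 4))
Y = rng.integers(0, 10, size=X.shape)

order = np.argsort(X, axis=-1)
X_sorted = take_last_axis(X, order)   # same values as np.sort(X, axis=-1)
Y_by_X = take_last_axis(Y, order)     # Y reordered to follow sorted X
print(np.array_equal(X_sorted, np.sort(X, axis=-1)))
```

For a 2-D array this reduces to the a[np.arange(n)[:,np.newaxis], idx] expression shown above.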

For the first axis, A[np.argsort(A,axis=0),np.arange(2)] works.

hpaulj answered Nov 05 '22 07:11
