Sort invariant for numpy.argsort with multiple dimensions

Tags: python, arrays, sorting, numpy

The numpy.argsort docs state:

Returns:
index_array : ndarray, int Array of indices that sort a along the specified axis. If a is one-dimensional, a[index_array] yields a sorted a.

How can I apply the result of numpy.argsort for a multidimensional array to get back a sorted array? (NOT just a 1-D or 2-D array; it could be an N-dimensional array where N is known only at runtime)

>>> import numpy as np
>>> np.random.seed(123)
>>> A = np.random.randn(3,2)
>>> A
array([[-1.0856306 ,  0.99734545],
       [ 0.2829785 , -1.50629471],
       [-0.57860025,  1.65143654]])
>>> i=np.argsort(A,axis=-1)
>>> A[i]
array([[[-1.0856306 ,  0.99734545],
        [ 0.2829785 , -1.50629471]],

       [[ 0.2829785 , -1.50629471],
        [-1.0856306 ,  0.99734545]],

       [[-1.0856306 ,  0.99734545],
        [ 0.2829785 , -1.50629471]]])

For me it's not just a matter of using sort() instead; I have another array B and I want to order B using the results of np.argsort(A) along the appropriate axis. Consider the following example:

>>> A = np.array([[3,2,1],[4,0,6]])
>>> B = np.array([[3,1,4],[1,5,9]])
>>> i = np.argsort(A,axis=-1)
>>> BsortA = ???             
# should result in [[4,1,3],[5,1,9]]
# so that corresponding elements of B and sort(A) stay together

It looks like this functionality is already an enhancement request in numpy.


asked Oct 31 '17 21:10

Jason S

2 Answers

NumPy issue #8708 has a sample implementation of take_along_axis that does what I need. I'm not sure whether it's efficient for large arrays, but it seems to work.

import numpy as np

def take_along_axis(arr, ind, axis):
    """
    ... here means a "pack" of dimensions, possibly empty

    arr: array_like of shape (A..., M, B...)
        source array
    ind: array_like of shape (A..., K..., B...)
        indices to take along each 1d slice of `arr`
    axis: int
        index of the axis with dimension M

    out: array_like of shape (A..., K..., B...)
        out[a..., k..., b...] = arr[a..., ind[a..., k..., b...], b...]
    """
    if axis < 0:
        if axis >= -arr.ndim:
            axis += arr.ndim
        else:
            raise IndexError('axis out of range')
    ind_shape = (1,) * ind.ndim
    ins_ndim = ind.ndim - (arr.ndim - 1)   # inserted dimensions

    dest_dims = list(range(axis)) + [None] + list(range(axis+ins_ndim, ind.ndim))

    # could also call np.ix_ here with some dummy arguments, then throw those results away
    inds = []
    for dim, n in zip(dest_dims, arr.shape):
        if dim is None:
            inds.append(ind)
        else:
            ind_shape_dim = ind_shape[:dim] + (-1,) + ind_shape[dim+1:]
            inds.append(np.arange(n).reshape(ind_shape_dim))

    return arr[tuple(inds)]

which yields

>>> A = np.array([[3,2,1],[4,0,6]])
>>> B = np.array([[3,1,4],[1,5,9]])
>>> i = A.argsort(axis=-1)
>>> take_along_axis(A,i,axis=-1)
array([[1, 2, 3],
       [0, 4, 6]])
>>> take_along_axis(B,i,axis=-1)
array([[4, 1, 3],
       [5, 1, 9]])
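
For anyone on NumPy 1.15 or later, a function of the same name ships with the library as np.take_along_axis, so the hand-written helper may not be needed. A minimal sketch, assuming such a NumPy version is installed, using the A and B from the question:

```python
import numpy as np

A = np.array([[3, 2, 1], [4, 0, 6]])
B = np.array([[3, 1, 4], [1, 5, 9]])

# Indices that sort each row of A
i = np.argsort(A, axis=-1)

# Reorder A and B with the same indices, one row at a time
A_sorted = np.take_along_axis(A, i, axis=-1)       # rows [1, 2, 3] and [0, 4, 6]
B_sorted_by_A = np.take_along_axis(B, i, axis=-1)  # rows [4, 1, 3] and [5, 1, 9]
```

As far as I can tell, the built-in expects the index array to have the same number of dimensions as arr, so it does not cover the "inserted dimensions" case that the helper above allows.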


answered Nov 05 '22 07:11

Jason S

For the (3,2) array A in the question, this argsort produces a (3,2) index array:

In [453]: idx=np.argsort(A,axis=-1)
In [454]: idx
Out[454]: 
array([[0, 1],
       [1, 0],
       [0, 1]], dtype=int32)

As you note, applying this to A to get the equivalent of np.sort(A, axis=-1) isn't obvious. The iterative solution is to sort each row (a 1-D case) with:

In [459]: np.array([x[i] for i,x in zip(idx,A)])
Out[459]: 
array([[-1.0856306 ,  0.99734545],
       [-1.50629471,  0.2829785 ],
       [-0.57860025,  1.65143654]])

While probably not the fastest, this is probably the clearest solution, and a good starting point for working out a vectorized one.
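
The same row-by-row idea can be stretched to any number of dimensions, as long as the sort axis is the last one. This is only a sketch: reorder_last_axis_loop is a name I made up for illustration, and it assumes idx came from np.argsort(arr, axis=-1).

```python
import numpy as np

def reorder_last_axis_loop(arr, idx):
    """Apply idx (from argsort along axis=-1) to arr, one 1-D slice at a time."""
    out = np.empty(idx.shape, dtype=arr.dtype)
    # Visit every combination of leading indices; each one selects a 1-D slice
    for lead in np.ndindex(arr.shape[:-1]):
        out[lead] = arr[lead][idx[lead]]
    return out

rng = np.random.default_rng(0)
A3 = rng.standard_normal((2, 3, 4))
idx3 = np.argsort(A3, axis=-1)
A3_sorted = reorder_last_axis_loop(A3, idx3)
```

It loops in Python over every leading index, so I would expect it to be slow for large arrays, but it makes it easy to see what a vectorized version has to reproduce.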

The tuple(inds) that the take_along_axis solution builds for this case is:

(array([[0],
        [1],
        [2]]), 
 array([[0, 1],
        [1, 0],
        [0, 1]], dtype=int32))
In [470]: A[_]
Out[470]: 
array([[-1.0856306 ,  0.99734545],
       [-1.50629471,  0.2829785 ],
       [-0.57860025,  1.65143654]])

In other words:

In [472]: A[np.arange(3)[:,None], idx]
Out[472]: 
array([[-1.0856306 ,  0.99734545],
       [-1.50629471,  0.2829785 ],
       [-0.57860025,  1.65143654]])

The first index, np.arange(3)[:,None], is what np.ix_ would construct, but np.ix_ does not 'like' the 2-D idx: it only accepts 1-D sequences.
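
To make the broadcasting concrete, here is a small sketch using the integer A from the question's second example (the variable names are just for illustration). The column of row numbers has shape (2, 1) and idx has shape (2, 3), so together they broadcast to (2, 3).

```python
import numpy as np

A = np.array([[3, 2, 1], [4, 0, 6]])
idx = np.argsort(A, axis=-1)

rows = np.arange(A.shape[0])[:, None]   # shape (2, 1)
# rows broadcasts against idx (shape (2, 3)), so element [r, c] of the
# result is A[r, idx[r, c]].
sorted_A = A[rows, idx]

# np.ix_ builds the same (2, 1) row index from 1-D inputs...
ix_rows, ix_cols = np.ix_(np.arange(2), np.arange(3))

# ...but it raises ValueError when handed a 2-D index such as idx.
try:
    np.ix_(np.arange(2), idx)
    ix_accepts_2d = True
except ValueError:
    ix_accepts_2d = False
```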

Looks like I explored this topic a couple of years ago, in "argsort for a multidimensional ndarray":

a[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]

I tried to explain what is going on there. The take_along_axis function does the same sort of thing, but constructs the indexing tuple for a more general case (any number of dimensions and any axis). Generalizing to more dimensions, but still with axis=-1, should be easy.
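
One way that axis=-1 generalization might look, as a sketch rather than a tuned implementation: index_last_axis is a hypothetical name, and idx is assumed to have the same shape as arr. Each leading axis gets an arange reshaped so it varies along its own axis only, and idx supplies the last axis.

```python
import numpy as np

def index_last_axis(arr, idx):
    """Apply idx (same shape as arr, from argsort along axis=-1) to arr."""
    lead = []
    for ax, n in enumerate(arr.shape[:-1]):
        shape = [1] * arr.ndim
        shape[ax] = n                      # varies along axis `ax` only
        lead.append(np.arange(n).reshape(shape))
    # The leading index arrays broadcast against idx to the full shape
    return arr[tuple(lead) + (idx,)]

rng = np.random.default_rng(1)
A3 = rng.standard_normal((2, 3, 4))
B3 = rng.standard_normal((2, 3, 4))
idx3 = np.argsort(A3, axis=-1)

A3_sorted = index_last_axis(A3, idx3)
B3_by_A3 = index_last_axis(B3, idx3)   # B3 reordered to follow sorted A3
```

For a 1-D array the loop body never runs and this reduces to arr[idx], which matches the case described in the argsort docs.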

For the first axis, A[np.argsort(A,axis=0),np.arange(2)] works; here np.arange(2) indexes the two columns of the (3,2) array A.

answered Nov 05 '22 07:11

hpaulj
