Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Order one numpy array by another

I have an array that determines an ordering of elements:

order = [3, 1, 4, 2]

And then I want to sort another, larger array (containing only those elements):

a = np.array([4, 2, 1, 1, 4, 3, 1, 3])    

such that the element(s) that come first in order come first in the results, etc.
In straight Python, I would do this with a key function:

sorted(a, key=order.index)
[3, 3, 1, 1, 1, 4, 4, 2]

How can I do this (efficiently) with numpy? Is there a similar notion of "key function" for numpy arrays?

like image 211
Doctor J Avatar asked Jun 08 '18 22:06

Doctor J


People also ask

How do you sort an array by another array in Python?

To sort the value of one array according to another in Python use the np. argsort(~) method.

How do I sort a NumPy array in descending order?

By Sorting a NumPy array in descending order sorts the elements from largest to smallest value. You can use the syntax array[::-1] to reverse the array. For example, sorting [5,8,6,12] in descending order results in [12 8 6 5].

Are NumPy arrays ordered?

The Python NumPy library is very general. It can use either row-major or column-major ordered arrays, but it defaults to row-major ordering.


2 Answers

Specific case : Ints

For ints, we could use bincount -

np.repeat(order,np.bincount(a)[order])

Sample run -

In [146]: sorted(a, key=order.index)
Out[146]: [3, 3, 1, 1, 1, 4, 4, 2]

In [147]: np.repeat(order,np.bincount(a)[order])
Out[147]: array([3, 3, 1, 1, 1, 4, 4, 2])

Generic case

Approach #1

Generalizing for all dtypes with bincount -

# https://stackoverflow.com/a/41242285/ @Andras Deak
def argsort_unique(idx):
    n = idx.size
    sidx = np.empty(n,dtype=int)
    sidx[idx] = np.arange(n)
    return sidx

sidx = np.argsort(order)
c = np.bincount(np.searchsorted(order,a,sorter=sidx))
out = np.repeat(order, c[argsort_unique(sidx)])

Approach #2-A

With np.unique and searchsorted for the case when all elements from order are in a -

unq, count = np.unique(a, return_counts=True)
out = np.repeat(order, count[np.searchsorted(unq, order)])

Approach #2-B

To cover for all cases, we need one extra step -

unq, count = np.unique(a, return_counts=1)
sidx = np.searchsorted(unq, order)
out = np.repeat(order, np.where(unq[sidx] == order,count[sidx],0))
like image 118
Divakar Avatar answered Nov 14 '22 22:11

Divakar


Building on @Divakar's solution, you can count how many times each element occurs and then repeat the ordered elements that many times:

c = Counter(a)
np.repeat(order, [c[v] for v in order])

(You could vectorize the count lookup if you like). I like this because it's linear time, even if it's not pure numpy.

I guess a pure numpy equivalent would look like this:

count = np.unique(a, return_counts=True)[1]
np.repeat(order, count[np.argsort(np.argsort(order))])

But that's less direct, more code, and way too many sorts. :)

like image 37
Doctor J Avatar answered Nov 14 '22 21:11

Doctor J