Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find indices of a list of values in a not sorted numpy array

I'm referring to a similar question: Find indices of a list of values in a numpy array

In that case we have a master array that is sorted and another array of which we want to find the index in the master array.

master = np.array([1,2,3,4,5])
search = np.array([4,2,2,3])

The suggested solution was:

>>> master = np.array([1,2,3,4,5])
>>> search = np.array([4,2,2,3])
>>>np.searchsorted(master, search)
array([3, 1, 1, 2])

But what if master is not sorted? for example if i have two arrays like this where the first one is not sorted:

>>>master = np.array([2,3,5,4,1])
>>>search = np.array([3,2,1,4,5])

i get:

>>> np.searchsorted(master, search)
array([1, 0, 0, 2, 5])

But instead i would like:

array([1,0,4,3,2])

i.e. the indices of items in search in master.

How do i get them possibly with a native function of numpy?(not using [np.where(master==i) for i in search] )

Thanks

EDIT: In this case the search array is a permutation of master. Then i would like to find how the index of master are permuted to give a permuted array like search.

As general case, search array contain some item that maybe contained or not in the master such as:

>>>master = np.array([2,3,5,4,1])
>>>search = np.array([1,4,7])
like image 625
claudius_dev Avatar asked Oct 13 '16 13:10

claudius_dev


People also ask

How do I get the indices of sorted NumPy array?

We can get the indices of the sorted elements of a given array with the help of argsort() method. This function is used to perform an indirect sort along the given axis using the algorithm specified by the kind keyword. It returns an array of indices of the same shape as arr that that would sort the array.

How do you get NumPy indices of nonzero elements?

nonzero() function is used to Compute the indices of the elements that are non-zero. It returns a tuple of arrays, one for each dimension of arr, containing the indices of the non-zero elements in that dimension. The corresponding non-zero values in the array can be obtained with arr[nonzero(arr)] .

Can you index a NumPy array?

Array indexing is the same as accessing an array element. You can access an array element by referring to its index number. The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.

How do you find the indices of N maximum values in a NumPy array?

For getting n-largest values from a NumPy array we have to first sort the NumPy array using numpy. argsort() function of NumPy then applying slicing concept with negative indexing. Return: [index_array, ndarray] Array of indices that sort arr along the specified axis.


1 Answers

Disclaimer: I wrote this answer for an earlier revision of the question. If you want to solve the problem in the appendix (when we aren't just looking for a permutation of an array), see Will's answer.

If all else fails, you need to sort your master array temporarily, then invert the sort order needed for this after matching the elements:

import numpy as np

master = np.array([2,3,5,4,1])
search = np.array([3,2,1,4,5])

# sorting permutation and its reverse
sorti = np.argsort(master)
sorti_inv = np.empty(sorti.shape,dtype=np.int64)
sorti_inv[sorti] = np.arange(sorti.size)

# get indices in sorted version
tmpind = np.searchsorted(master,search,sorter=sorti)

# transform indices back to original array with inverse permutation
final_inds = tmpind[sorti_inv]

The result of the above is correctly

array([1, 0, 4, 3, 2])

As you noted in a comment, your specific search and master are permutations of each other. In this case you can alternatively sort both arrays, and use the inverse permutation combined with the other direct permutation:

sorti = np.argsort(master)
sorti_inv = np.empty(sorti.shape,dtype=np.int64)
sorti_inv[sorti] = np.arange(sorti.size)
sorti_s = np.argsort(search)
final_inds = sorti_s[sorti_inv]

One should consider the effort needed to search two arrays vs searching one array in the sorted version of another. I really can't tell which one's faster.

like image 89