I have a large (millions) array of ID numbers <code>ids</code>, and I want to find the indices where another array of targets (<code>targets</code>) exist in the <code>ids</code> array. For example, if <pre class="prettyprint"><code>ids = [22, 5, 4, 0, 100] targets = [5, 0] </code></pre> then I want the result: <pre class="prettyprint"><code>>>> [1,3] </code></pre> If I pre-sort the array of <code>ids</code>, then it's easy to find matches using <code>numpy.searchsorted</code>, e.g. <pre class="prettyprint"><code>>>> ids = np.array([22, 5, 4, 0, 100]) >>> targets = [5, 0] >>> sort = np.argsort(ids) >>> ids[sort] [0,4,5,22,100] >>> np.searchsorted(ids, targets, sorter=sort) [2,0] </code></pre> But how can I find the reverse mapping to 'unsort' this result? I.e. to map the sorted entries at <code>[2,0]</code> back to where they were before: <code>[1,3]</code>.

There are a few answers dancing around this already, but just to make it clear all you need to do is use <code>sort[rank]</code>. <pre class="prettyprint"><code># Setup ids = np.array([22, 5, 4, 0, 100]) targets = np.array([5, 0]) sort = np.argsort(ids) rank = np.searchsorted(ids, targets, sorter=sort) print(sort[rank]) # array([1, 3]) </code></pre>

Could you just do this? <pre class="prettyprint"><code>sort[np.searchsorted(ids, targets, sorter=sort)] </code></pre> Alternatively: <pre class="prettyprint"><code>np.hstack([np.where(ids==x)[0] for x in targets]) </code></pre> both give: <pre class="prettyprint"><code>array([1, 3]) </code></pre>

I think I've come up with something. We can construct a 'cipher' or sorts: <code>key = numpy.arange(len(ids))</code> applying the initial sorter to this key then gives the reverse mapping: <code>revsort = key[np.argsort(ids)]</code> <hr> edit: as @birico points out, <code>key[sort]</code> is identical to <code>sort</code> itself! <pre class="prettyprint"><code>>>> sort = np.argsort(ids) >>> ids[sort] [0,4,5,22,100] >>> found = np.searchsorted(ids, targets, sorter=sort) >>> found [2,0] >>> sort[found] [1,3] </code></pre>

Find unsorted indices after using numpy.searchsorted

I have a large (millions) array of ID numbers ids, and I want to find the indices where another array of targets (targets) exist in the ids array. For example, if

ids = [22, 5, 4, 0, 100]
targets = [5, 0]

then I want the result:

>>> [1,3]

If I pre-sort the array of ids, then it's easy to find matches using numpy.searchsorted, e.g.

>>> ids = np.array([22, 5, 4, 0, 100])
>>> targets = [5, 0]
>>> sort = np.argsort(ids)
>>> ids[sort]
[0,4,5,22,100]
>>> np.searchsorted(ids, targets, sorter=sort)
[2,0]

But how can I find the reverse mapping to 'unsort' this result? I.e. to map the sorted entries at [2,0] back to where they were before: [1,3].

How do I get the indices of a sorted NumPy array?

We can get the indices of the sorted elements of a given array with the help of argsort() method. This function is used to perform an indirect sort along the given axis using the algorithm specified by the kind keyword. It returns an array of indices of the same shape as arr that that would sort the array.

What does np searchsorted do?

searchsorted() function is used to find the indices into a sorted array arr such that, if elements are inserted before the indices, the order of arr would be still preserved. Here, binary search is used to find the required insertion indices.

Can you sort NumPy array?

The NumPy ndarray object has a function called sort() , that will sort a specified array.

There are a few answers dancing around this already, but just to make it clear all you need to do is use sort[rank].

# Setup
ids = np.array([22, 5, 4, 0, 100])
targets = np.array([5, 0])

sort = np.argsort(ids)
rank = np.searchsorted(ids, targets, sorter=sort)
print(sort[rank])
# array([1, 3])

Could you just do this?

sort[np.searchsorted(ids, targets, sorter=sort)]

Alternatively:

np.hstack([np.where(ids==x)[0] for x in targets])

both give:

array([1, 3])

I think I've come up with something.

We can construct a 'cipher' or sorts: key = numpy.arange(len(ids)) applying the initial sorter to this key then gives the reverse mapping: revsort = key[np.argsort(ids)]

edit: as @birico points out, key[sort] is identical to sort itself!

>>> sort = np.argsort(ids)
>>> ids[sort]
[0,4,5,22,100]
>>> found = np.searchsorted(ids, targets, sorter=sort)
>>> found
[2,0]
>>> sort[found]
[1,3]

Find unsorted indices after using numpy.searchsorted

Tags:

python

algorithm

sorting

search

numpy

DilithiumMatrix

People also ask

3 Answers

Bi Rico

atomh33ls

DilithiumMatrix

Recent Activity

Donate For Us

Find unsorted indices after using numpy.searchsorted

Tags:

python

algorithm

sorting

search

numpy

DilithiumMatrix

People also ask

3 Answers

Bi Rico

atomh33ls

DilithiumMatrix

Related questions

Recent Activity

Donate For Us