NumPy: search an array for multiple values and return their indices

How can I search for a small set of values in a numpy array (not sorted, and shouldn't be changed)? It should return the indices of those values.

For example:

a = np.array(['d', 'v', 'h', 'r', 'm', 'a'])   # in general it will be large
query = np.array(['a', 'v', 'd'])

# Required:
indx = someNumpyFunction(a, query)

print(indx)       # should be [5, 1, 0]

I'm a beginner in numpy and I couldn't find the proper way to do this task for multiple values at the same time (I know np.where(a=='d') can do it for a single value search).

291

asked Sep 27 '22 17:09

Doaa

2 Answers

A classic way of checking one array against another is to adjust the shape and use '==' (arr below is the question's array a):

In [250]: arr==query[:,None]
Out[250]: 
array([[False, False, False, False, False,  True],
       [False,  True, False, False, False, False],
       [ True, False, False, False, False, False]], dtype=bool)

In [251]: np.where(arr==query[:,None])
Out[251]: (array([0, 1, 2]), array([5, 1, 0]))

If an element of query isn't found in arr, its 'row' will be missing from the first index array, e.g. [0,2] instead of [0,1,2]:

In [261]: np.where(arr==np.array(['a','x','v'],dtype='S')[:,None])
Out[261]: (array([0, 2]), array([5, 1]))
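To make the missing-row behaviour explicit, here is a minimal sketch built on the same broadcast comparison. The helper name find_indices is hypothetical (it is not a NumPy function), and returning -1 for a value that is not found is an assumption made for illustration. If a value occurs more than once, only its first position is reported.

```python
import numpy as np

def find_indices(arr, query):
    """Return, for each query value, the index of its first match in arr, or -1."""
    # Shape (len(query), len(arr)): row j marks where arr equals query[j].
    matches = arr == query[:, None]
    # argmax gives the position of the first True in each row (0 if none).
    first = matches.argmax(axis=1)
    # Rows without any True correspond to query values missing from arr.
    found = matches.any(axis=1)
    return np.where(found, first, -1)

arr = np.array(['d', 'v', 'h', 'r', 'm', 'a'])
print(find_indices(arr, np.array(['a', 'v', 'd'])))  # [5 1 0]
print(find_indices(arr, np.array(['a', 'x', 'v'])))  # [ 5 -1  1]
```

Note that the intermediate boolean matrix has len(query) * len(arr) entries, so this suits a small query against a large array, as in the question.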

For this small example, it is considerably faster than a list comprehension equivalent:

np.hstack([(arr==i).nonzero()[0] for i in query])

It's a little slower than the searchsorted solution. (In that solution, i can be out of bounds, or point at the wrong element, if a query element is not found.)

Stefano suggested fromiter. It saves some time compared to hstack of a list:

In [313]: timeit np.hstack([(arr==i).nonzero()[0] for i in query])
10000 loops, best of 3: 49.5 us per loop

In [314]: timeit np.fromiter(((arr==i).nonzero()[0] for i in query), dtype=int, count=len(query))
10000 loops, best of 3: 35.3 us per loop

But it raises an error if an element is missing, or if there are multiple occurrences. hstack can handle variable-length entries; fromiter cannot.
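As an illustration of the variable-length case, this sketch uses made-up data (one duplicated value and one missing value, neither of which is in the question) with the same list comprehension, and shows what hstack returns when the per-query results differ in length.

```python
import numpy as np

arr = np.array(['d', 'v', 'h', 'v', 'm', 'a'])   # 'v' occurs twice
query = np.array(['a', 'x', 'v'])                # 'x' is not in arr

# One index array per query value; the lengths here are 1, 0 and 2.
per_query = [(arr == q).nonzero()[0] for q in query]

# hstack concatenates them, so the link to each query value is lost.
flat = np.hstack(per_query)
print(flat)    # [5 1 3]

# Keeping the lengths lets you recover which indices belong to which query.
counts = [len(p) for p in per_query]
print(counts)  # [1, 0, 2]
```

If you need to know which indices belong to which query value, keep per_query (or counts) around rather than only the flattened result.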

np.flatnonzero(arr==i) is slower than (arr==i).nonzero()[0], but I haven't looked into why.

129

answered Oct 16 '22 23:10

hpaulj

You can use np.searchsorted on a sorted view of the array, then map the returned indices back to the original array. For that you can use np.argsort, as in:

>>> indx = a.argsort()  # indices that would sort the array
>>> i = np.searchsorted(a[indx], query)  # indices in the sorted array
>>> indx[i]  # indices with respect to the original array
array([5, 1, 0])

If a is of size n and query is of size k, this will be O(n log n + k log n), which is faster than the O(n k) of a linear search roughly when log n < k.
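The snippet above assumes every query value is present in a. A minimal sketch of a guarded variant follows; the helper name find_sorted and the -1 marker for missing values are assumptions made for illustration, and a is assumed to be non-empty.

```python
import numpy as np

def find_sorted(a, query):
    """Indices of query values in a via argsort + searchsorted; -1 if absent."""
    order = a.argsort()                       # indices that would sort a
    pos = np.searchsorted(a[order], query)    # insertion points in sorted a
    # searchsorted returns len(a) for values larger than everything in a,
    # so clip before using the positions as indices.
    pos = np.clip(pos, 0, len(a) - 1)
    idx = order[pos]
    # An insertion point is not necessarily a match; verify against a.
    return np.where(a[idx] == query, idx, -1)

a = np.array(['d', 'v', 'h', 'r', 'm', 'a'])
print(find_sorted(a, np.array(['a', 'v', 'd'])))  # [5 1 0]
print(find_sorted(a, np.array(['a', 'x', 'b'])))  # [ 5 -1 -1]
```

The original array is never modified: argsort returns a separate index array, and a[order] is a sorted copy.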

answered Oct 16 '22 21:10

behzad.nouri

Tags:

arrays

search

python-3.x

numpy
