
Find multiple values in a Numpy array

a and b are two NumPy arrays of integers. Both are sorted and contain no repeated values, and b is a subset of a. I need to find the index in a of every element of b. Is there an efficient NumPy function for this, so that I can avoid a Python loop?

(Actually, the arrays are a pandas.DatetimeIndex and a NumPy datetime64 array, but I guess that doesn't change the answer.)

Yariv asked Dec 16 '22 13:12


1 Answer

numpy.searchsorted() can be used to do this:

In [14]: import numpy as np

In [15]: a = np.array([1, 2, 3, 5, 10, 20, 25])

In [16]: b = np.array([1, 5, 20, 25])

In [17]: a.searchsorted(b)
Out[17]: array([0, 3, 5, 6])

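From what I understand, it doesn't require b to be sorted, and it uses binary search on a. With n = len(a) and m = len(b), that makes it O(m log n) rather than the O(n + m) that a single linear pass over two sorted arrays would take. Note that searchsorted() returns insertion points, so the result is the index in a only because every element of b is actually present in a.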
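If the subset assumption is not guaranteed, the lookup can be wrapped with a membership check. The following is only a minimal sketch: find_indices is a hypothetical helper name, not a NumPy function, and it assumes a is sorted and free of duplicates, as stated in the question. The last lines of the example suggest that the same call should work on datetime64 arrays as well.

```python
import numpy as np


def find_indices(a, b):
    """Hypothetical helper: index in sorted `a` of every element of `b`.

    Raises ValueError if some element of `b` does not occur in `a`.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.size == 0:
        if b.size:
            raise ValueError("some elements of b are not in a")
        return np.empty(0, dtype=np.intp)

    # Binary search on a: returns insertion points, not exact matches.
    idx = np.searchsorted(a, b)

    # An insertion point equals len(a) when a value exceeds every element
    # of a, so clip before indexing to avoid an out-of-bounds access.
    clipped = np.minimum(idx, len(a) - 1)
    if not np.all(a[clipped] == b):
        raise ValueError("some elements of b are not in a")
    return idx


a = np.array([1, 2, 3, 5, 10, 20, 25])
print(find_indices(a, np.array([1, 5, 20, 25])))  # [0 3 5 6]
print(find_indices(a, np.array([20, 1])))         # b need not be sorted: [5 0]

# Same idea with datetime64 values (seven consecutive days).
days = np.arange("2022-12-01", "2022-12-08", dtype="datetime64[D]")
print(find_indices(days, days[[1, 4]]))           # [1 4]
```

The check costs one extra pass over b, so the overall cost is still dominated by the binary searches.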

If that's not good enough, there's always Cython. :-)

NPE answered Dec 21 '22 10:12