`a` and `b` are two NumPy arrays of integers. Both are sorted and contain no repeated values, and `b` is a subset of `a`. I need to find the index in `a` of every element of `b`. Is there an efficient NumPy function that could help, so that I can avoid a Python loop?
(Actually, the arrays are a `pandas.DatetimeIndex` and NumPy `datetime64` values, but I guess that doesn't change the answer.)
`numpy.searchsorted()` can be used to do this:
In [14]: import numpy as np
In [15]: a = np.array([1, 2, 3, 5, 10, 20, 25])
In [16]: b = np.array([1, 5, 20, 25])
In [17]: a.searchsorted(b)
Out[17]: array([0, 3, 5, 6])
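One caveat worth spelling out: `searchsorted` returns insertion points, so the result only equals the index of each element because `b` is a subset of `a`. Here is a minimal sketch of a wrapper that checks this assumption. The name `indices_of` is hypothetical (not part of NumPy), and it assumes `a` is non-empty, sorted, and free of duplicates. The same call also appears to work unchanged on `datetime64` arrays, which is the case mentioned in the question.

```python
import numpy as np

def indices_of(a, b):
    """Hypothetical helper: index in sorted, duplicate-free `a` of each element of `b`."""
    a = np.asarray(a)
    b = np.asarray(b)
    idx = np.searchsorted(a, b)
    # Values larger than a[-1] get insertion point len(a); clip so the
    # look-up below cannot go out of bounds.
    safe = np.minimum(idx, len(a) - 1)
    # If any element of b is missing from a, a[safe] will differ from b.
    if not np.array_equal(a[safe], b):
        raise ValueError("b contains values that are not in a")
    return idx

a = np.array([1, 2, 3, 5, 10, 20, 25])
b = np.array([1, 5, 20, 25])
idx = indices_of(a, b)            # same result as a.searchsorted(b)

# The same approach with datetime64 values.
dates = np.array(["2020-01-01", "2020-01-03", "2020-01-07"], dtype="datetime64[D]")
wanted = np.array(["2020-01-03", "2020-01-07"], dtype="datetime64[D]")
date_idx = indices_of(dates, wanted)
```

If the data already lives in a `pandas.DatetimeIndex`, passing its `.values` to the helper should behave the same way.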
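From what I understand, it doesn't require `b` to be sorted, and it runs a binary search on `a` for each element of `b`. With n = len(a) and m = len(b), this means it costs O(m log n), rather than the O(n + m) of a single linear pass over two sorted arrays.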
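To illustrate the point about ordering, here is a small sketch in which only `a` is sorted; the array `b_unsorted` is made up for the example. Each element is looked up independently, so the indices come back in the order of the query array.

```python
import numpy as np

a = np.array([1, 2, 3, 5, 10, 20, 25])   # must be sorted
b_unsorted = np.array([20, 1, 25, 5])    # query order is arbitrary

idx = np.searchsorted(a, b_unsorted)
idx_list = idx.tolist()                  # [5, 0, 6, 3]

# Indexing a with the result recovers the query values in their original order.
recovered = a[idx].tolist()              # [20, 1, 25, 5]
```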
If that's not good enough, there's always Cython. :-)