I have two arrays, a1 and a2. Assume len(a2) >> len(a1), and that a1 is a subset of a2.
I would like a quick way to return the a2 indices of all elements in a1. The time-intensive way to do this is obviously:
    from operator import indexOf
    indices = []
    for i in a1:
        indices.append(indexOf(a2, i))
This of course takes a long time when a2 is large. I could also use numpy.where() instead (although each entry in a1 will appear just once in a2), but I'm not convinced it will be quicker. I could also traverse the large array just once:
    indices = []
    for i in range(len(a2)):  # xrange on Python 2
        if a2[i] in a1:
            indices.append(i)
But I'm sure there is a faster, more 'numpy' way - I've looked through the numpy method list, but cannot find anything appropriate.
Many thanks in advance,
D
How about
    numpy.nonzero(numpy.in1d(a2, a1))[0]
This should be fast. From my basic testing, it's about 7 times faster than your second code snippet for len(a2) == 100, len(a1) == 10000, and only one common element at index 45. This assumes that both a1 and a2 have no repeating elements.
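For illustration, here is a minimal sketch of that idea on small made-up arrays (the data and variable names are assumptions, not from the question). It uses numpy.isin, which newer NumPy releases recommend over numpy.in1d, and numpy.flatnonzero as a shorthand for nonzero(...)[0]. Note that the mask approach returns indices in a2 order; if you need them in a1 order, one possible variant is to sort a2 once and binary-search it, as in the second half of the sketch.

```python
import numpy as np

# Hypothetical example data: a1 is a subset of a2, neither has repeats.
a2 = np.array([50, 10, 40, 20, 30])
a1 = np.array([20, 50, 30])

# Boolean mask over a2, then the positions where it is True.
# The indices come back in a2 order, not a1 order.
idx_a2_order = np.flatnonzero(np.isin(a2, a1))

# Variant that keeps a1 order: sort a2 once, binary-search each a1 value,
# then map the sorted positions back to positions in the original a2.
# This relies on every a1 value actually being present in a2.
order = np.argsort(a2)
idx_a1_order = order[np.searchsorted(a2, a1, sorter=order)]
```

With the sample data, a2[idx_a1_order] reproduces a1 element by element, which is a quick sanity check that the mapping is right.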
How about:
    wanted = set(a1)
    indices = [idx for (idx, value) in enumerate(a2) if value in wanted]
This should be O(len(a1)+len(a2)) instead of O(len(a1)*len(a2)), because each membership test against a set takes constant time on average.
NB: I don't know numpy, so there may be a more 'numpythonic' way to do it, but this is how I would do it in pure Python.
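As a rough, self-contained sketch of the set-based lookup on made-up data (the sample values and the names position and indices_a1_order are assumptions for illustration), the snippet below also shows a dictionary variant with the same linear cost that returns the indices in a1 order instead of a2 order. Both assume the elements are hashable and that a2 has no repeated values.

```python
# Hypothetical example data: a1 is a subset of a2, neither has repeats.
a2 = [50, 10, 40, 20, 30]
a1 = [20, 50, 30]

# Set-based scan: one pass over a2, average O(1) membership tests.
# Indices come back in a2 order.
wanted = set(a1)
indices = [idx for idx, value in enumerate(a2) if value in wanted]

# Dictionary variant: map each a2 value to its position once,
# then look up every a1 value. Indices come back in a1 order.
position = {value: idx for idx, value in enumerate(a2)}
indices_a1_order = [position[value] for value in a1]
```

The dictionary costs extra memory proportional to len(a2), so the set-based scan may be preferable when only the set of indices matters and their order does not.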