 

Return common element indices between two numpy arrays

I have two arrays, a1 and a2. Assume len(a2) >> len(a1), and that a1 is a subset of a2.

I would like a quick way to return the a2 indices of all elements in a1. The time-intensive way to do this is obviously:

from operator import indexOf
indices = []
for i in a1:
    indices.append(indexOf(a2,i))

This of course takes a long time when a2 is large. I could use numpy.where() instead (each entry in a1 appears just once in a2), but I'm not convinced it would be quicker. I could also traverse the large array just once:

indices = []
for i in range(len(a2)):  # range replaces Python 2's xrange
    if a2[i] in a1:       # membership test scans a1 on every iteration
        indices.append(i)

But I'm sure there is a faster, more 'numpy' way - I've looked through the numpy method list, but cannot find anything appropriate.

Many thanks in advance,

D

Dave asked Feb 25 '10 11:02




2 Answers

How about

numpy.nonzero(numpy.in1d(a2, a1))[0]

This should be fast. From my basic testing, it's about 7 times faster than your second code snippet for len(a2) == 100, len(a1) == 10000, and only one common element at index 45. This assumes that both a1 and a2 have no repeating elements.
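A minimal sketch of the one-liner above on small, made-up arrays (the values are illustrative, not from the question). It uses numpy.isin, which newer NumPy releases recommend in place of numpy.in1d. Note that the mask approach returns indices in the order they occur in a2. The second variant, which is a swapped-in technique and not part of the answer above, returns them in the order of a1 using argsort and searchsorted; it assumes a2 has no repeated values and that every element of a1 is present in a2.

```python
import numpy as np

# Illustrative data: a1 is a subset of a2, a2 has no repeated values.
a2 = np.array([50, 10, 40, 20, 30])
a1 = np.array([20, 50])

# Boolean mask over a2, then the positions where the mask is True.
# The result is ordered by position in a2, not by the order of a1.
in_order_of_a2 = np.nonzero(np.isin(a2, a1))[0]

# Variant ordered like a1 (assumes unique a2 values and a1 contained in a2):
# sort a2 once, binary-search each a1 value in the sorted view,
# then map the sorted positions back to original a2 indices.
order = np.argsort(a2)
in_order_of_a1 = order[np.searchsorted(a2, a1, sorter=order)]

print(in_order_of_a2)  # positions of 50 and 20 as they occur in a2
print(in_order_of_a1)  # position of 20, then position of 50
```

If the output order does not matter, the mask version is the simpler choice; the sorted lookup is only worth it when indices[k] must correspond to a1[k].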

Alok Singhal answered Oct 01 '22 22:10


How about:

wanted = set(a1)
indices = [idx for idx, value in enumerate(a2) if value in wanted]

This should be O(len(a1) + len(a2)) instead of O(len(a1) * len(a2)), because each membership test against a set takes constant time on average.

NB: I don't know numpy, so there may be a more 'numpythonic' way to do it, but this is how I would do it in pure Python.
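As a rough sketch of the same idea, here is the set-based scan next to a dictionary-based variant. The function names and sample values are hypothetical, chosen only for illustration. The dictionary version is an alternative, not what the answer above proposes: it maps each value of a2 to its index once, so the result follows the order of a1. It assumes a2 has no repeated values (with repeats, the last index wins) and that every element of a1 occurs in a2.

```python
def indices_by_scan(a1, a2):
    """Indices of a2 whose values occur in a1, in a2 order."""
    wanted = set(a1)  # O(len(a1)) to build, O(1) average lookup
    return [idx for idx, value in enumerate(a2) if value in wanted]


def indices_by_lookup(a1, a2):
    """Index in a2 of each element of a1, in a1 order.

    Assumes unique values in a2 and that a1 is a subset of a2;
    a missing value raises KeyError.
    """
    position = {value: idx for idx, value in enumerate(a2)}
    return [position[value] for value in a1]


# Illustrative data only.
a2 = [50, 10, 40, 20, 30]
a1 = [20, 50]

scan_result = indices_by_scan(a1, a2)
lookup_result = indices_by_lookup(a1, a2)
print(scan_result, lookup_result)
```

Both run in O(len(a1) + len(a2)) on average; they differ only in the order of the returned indices.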

Dave Kirby answered Oct 01 '22 23:10