Return values from array based on indices of common values in two other arrays

Question

import numpy as np

a=np.random.randint(0,200,100)#rand int array
b1=np.random.randint(0,100,50)
b2=b1**3
c=[]

I have a problem that I think should be easy, but I can't find a solution. I want to find the matching values in two arrays, then use the indices of those matches in one of the arrays to pick values from a third array.

for i in range(len(a)):
    for j in range(len(b1)):
         if b1[j]==a[i]:
             c.append(b2[j])

c=np.asarray(c)

The above method does work, but it's very slow. This is just an example; in the work I'm actually doing, a, b1 and b2 all have over 10,000 elements.

Any faster solutions?

Alex Riley · Accepted Answer

np.in1d(b1, a) returns a boolean array indicating whether each element of b1 is found in a. (In newer NumPy versions, np.isin is the recommended replacement for np.in1d.)

If you wanted to get the values in b2 which corresponded to the indices of common values in a and b1, you could use the boolean array to index b2:

b2[np.in1d(b1, a)]

Using this function should be a lot faster as the for loops are pushed down to the level of NumPy's internal routines.
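A small deterministic check may help show what the mask returns compared with the loop in the question. The sketch below uses np.isin (available from NumPy 1.13) in place of np.in1d; the tiny arrays and the helper name loop_version are made up for illustration. The mask yields one value per matching element of b1, in b1 order, while the loop follows the order of a and repeats values when a contains duplicates. If that exact ordering matters, a broadcast comparison is one possible way to reproduce it, at the cost of a temporary len(a) x len(b1) boolean array.

```python
import numpy as np

# Tiny hand-made arrays (illustrative only, not the question's random data)
a = np.array([5, 1, 7, 1])
b1 = np.array([1, 2, 5, 9])
b2 = b1 ** 3  # [1, 8, 125, 729]

def loop_version(a, b1, b2):
    """The nested loop from the question, kept as a reference."""
    c = []
    for i in range(len(a)):
        for j in range(len(b1)):
            if b1[j] == a[i]:
                c.append(b2[j])
    return np.asarray(c)

# Boolean mask: True where an element of b1 also occurs in a
mask = np.isin(b1, a)
masked = b2[mask]  # one value per matching b1 element, in b1 order

# Broadcast comparison: eq[i, j] is True where a[i] == b1[j]
eq = a[:, None] == b1[None, :]
i_idx, j_idx = np.nonzero(eq)  # indices come back in row-major order, like the loop
broadcast = b2[j_idx]

print(masked.tolist())                   # [1, 125]
print(loop_version(a, b1, b2).tolist())  # [125, 1, 1]
print(broadcast.tolist())                # [125, 1, 1]
```

For arrays of 10,000 elements each, the broadcast version allocates roughly 100 million booleans, so the mask is the cheaper choice whenever order and repeats do not matter.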

Mazdak · Answer

You can use numpy.intersect1d to get the intersection of two 1-D arrays. Note that once you have the intersection, you don't need the indices, nor do you need to use them to look the same values up again.

>>> a=np.random.randint(0,200,100)
>>> b1=np.random.randint(0,100,50)
>>> 
>>> np.intersect1d(b1,a)
array([ 3,  9, 17, 19, 22, 23, 37, 53, 55, 58, 67, 85, 93, 94])
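Note that np.intersect1d returns the sorted, unique common values themselves, not the corresponding entries of b2. If the goal is still to pick values out of b2, one option (assuming NumPy 1.15 or later, where the return_indices argument was added) is to ask for the positions of the common values as well. The arrays below are small hand-made stand-ins for the question's data.

```python
import numpy as np

# Hand-made stand-ins for the question's arrays (illustrative only)
a = np.array([5, 1, 7, 1])
b1 = np.array([1, 2, 5, 9])
b2 = b1 ** 3  # [1, 8, 125, 729]

# common: sorted unique values present in both arrays
# idx_b1 / idx_a: index of the first occurrence of each common value
common, idx_b1, idx_a = np.intersect1d(b1, a, return_indices=True)

picked = b2[idx_b1]  # entries of b2 at the matching positions of b1

print(common.tolist())  # [1, 5]
print(idx_b1.tolist())  # [0, 2]
print(picked.tolist())  # [1, 125]
```

Only the first occurrence of each common value is reported, so if b1 contains repeated values this selects fewer entries of b2 than a boolean mask would.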

Using the intersection is also more efficient: with a[np.in1d(a, b1)], in addition to calling in1d, Python has to perform an extra indexing step. For a better understanding, see the following benchmark:

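import numpy as np
from timeit import timeit

# Each statement is timed together with its import and array creation.
s1="""
import numpy as np
a=np.random.randint(0,200,100)
b1=np.random.randint(0,100,50)
np.intersect1d(b1,a)
"""
s2="""
import numpy as np
a=np.random.randint(0,200,100)
b1=np.random.randint(0,100,50)
a[np.in1d(a, b1)]
"""

# String formatting keeps the output identical on Python 2 and Python 3.
print(' first:  %s' % timeit(stmt=s1, number=100000))
print('second :  %s' % timeit(stmt=s2, number=100000))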

Result:

 first:  3.69082999229
second :  7.77609300613
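The timed statements above also include the import and the random array generation, so part of each measurement is setup rather than the operation itself. A variant that moves that work into timeit's setup argument might look like the sketch below. The array sizes follow the question's remark about 10,000 elements, while the value ranges, the seed and the repeat count are arbitrary assumptions, and np.isin is used in place of np.in1d. Absolute numbers will vary by machine and NumPy version, so no timings are quoted here.

```python
from timeit import timeit

# Setup runs once per timeit call and is excluded from the measured time.
setup = """
import numpy as np
rng = np.random.RandomState(0)       # fixed seed so both runs see the same data
a = rng.randint(0, 20000, 10000)
b1 = rng.randint(0, 10000, 10000)
"""

# Time only the operation of interest, 200 repetitions each (arbitrary count).
t_intersect = timeit("np.intersect1d(b1, a)", setup=setup, number=200)
t_mask = timeit("a[np.isin(a, b1)]", setup=setup, number=200)

print("intersect1d: %.4f s" % t_intersect)
print("isin + mask: %.4f s" % t_mask)
```

Keep in mind that the two statements do not return the same thing: intersect1d gives sorted unique values, while the masked indexing keeps duplicates and the original order of a.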

Tags:

performance

python

arrays

numpy

python-2.7

Thomas North

