
Return values from array based on indices of common values in two other arrays

I have a problem that I think should be easy, but I can't find a solution. I want to find the matching values in two arrays, then use the indices of those matches in one of the arrays to pick values from a third array.

import numpy as np

a = np.random.randint(0, 200, 100)  # random integer array
b1 = np.random.randint(0, 100, 50)
b2 = b1**3
c = []

for i in range(len(a)):
    for j in range(len(b1)):
        if b1[j] == a[i]:
            c.append(b2[j])

c = np.asarray(c)

Clearly the above method does work, but it's very slow, and this is just an example; in my actual work, a, b1 and b2 each have over 10,000 elements.

Any faster solutions?

Thomas North asked Apr 14 '15 15:04


2 Answers

np.in1d(b1, a) returns a boolean array indicating whether each element of b1 is found in a.

If you wanted to get the values in b2 which corresponded to the indices of common values in a and b1, you could use the boolean array to index b2:

b2[np.in1d(b1, a)]

Using this function should be a lot faster as the for loops are pushed down to the level of NumPy's internal routines.
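As a rough sketch of how the mask relates to the loop in the question: the mask yields one b2 value per matching element of b1, in b1 order, whereas the loop emits a value for every (a[i], b1[j]) pair, so order and repeats can differ when a contains duplicates. The example below uses np.isin, which the NumPy documentation recommends over np.in1d for new code. The seeded generator and the names c_loop, c_mask and c_exact are assumptions made for this illustration. The broadcasting variant is one possible way to reproduce the loop's output exactly, at the cost of a len(a) x len(b1) boolean array.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, assumed here only for reproducibility
a = rng.integers(0, 200, 100)
b1 = rng.integers(0, 100, 50)
b2 = b1**3

# Reference result: the double loop from the question, written as a comprehension.
c_loop = np.asarray(
    [b2[j] for i in range(len(a)) for j in range(len(b1)) if b1[j] == a[i]]
)

# Boolean mask: True where an element of b1 occurs anywhere in a.
mask = np.isin(b1, a)
c_mask = b2[mask]  # one value per matching b1 element, in b1 order

# The two results contain the same distinct values...
same_values = set(c_loop.tolist()) == set(c_mask.tolist())

# ...but to match the loop's order and repeats, compare all pairs at once.
# np.nonzero walks the 2-D comparison row by row (i outer, j inner), like the loop.
_, j_idx = np.nonzero(a[:, None] == b1[None, :])
c_exact = b2[j_idx]

print(same_values, np.array_equal(c_loop, c_exact))
```

For arrays of 10,000 elements each, the pairwise comparison needs about 100 million booleans, so the mask is the lighter option when order and repeats do not matter.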

Alex Riley answered Nov 14 '22 22:11


You can use numpy.intersect1d to get the intersection of two 1-D arrays. Note that once you have the intersection itself, you don't need the indices to look the common values up again.

>>> a=np.random.randint(0,200,100)
>>> b1=np.random.randint(0,100,50)
>>> 
>>> np.intersect1d(b1,a)
array([ 3,  9, 17, 19, 22, 23, 37, 53, 55, 58, 67, 85, 93, 94])
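One thing worth keeping in mind is that np.intersect1d returns the sorted, unique common values, not the corresponding entries of b2. A minimal sketch of how it could still be used for the original question is to pass return_indices=True and index b2 with the positions it reports for b1. The small arrays below are made up for this illustration and are not the question's data.

```python
import numpy as np

a = np.array([5, 3, 3, 9, 7])
b1 = np.array([3, 8, 5, 3])
b2 = b1**3  # [27, 512, 125, 27]

# common: sorted unique values present in both arrays.
# idx_b1 / idx_a: index of the first occurrence of each common value
# in b1 and in a, respectively.
common, idx_b1, idx_a = np.intersect1d(b1, a, return_indices=True)

picked = b2[idx_b1]               # one b2 value per distinct common value
mask_picked = b2[np.isin(b1, a)]  # one b2 value per matching b1 element

print(common)       # [3 5]
print(picked)       # [ 27 125]
print(mask_picked)  # [ 27 125  27]
```

The mask keeps the repeated 3 in b1, while the index-based lookup keeps only its first occurrence, so the choice depends on whether repeats in b1 should appear in the result.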

Using the intersection is also more efficient: with a[np.in1d(a, b1)], in addition to calling in1d, Python has to perform an extra indexing step. For a better understanding, see the following benchmark:

from timeit import timeit

s1 = """
import numpy as np
a = np.random.randint(0, 200, 100)
b1 = np.random.randint(0, 100, 50)
np.intersect1d(b1, a)
"""
s2 = """
import numpy as np
a = np.random.randint(0, 200, 100)
b1 = np.random.randint(0, 100, 50)
a[np.in1d(a, b1)]
"""

# Each statement is timed as a whole, including the import and array creation.
print(' first: ', timeit(stmt=s1, number=100000))
print('second : ', timeit(stmt=s2, number=100000))

Result:

 first:  3.69082999229
second :  7.77609300613
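The timings above also include the import and the random array creation in every iteration. A variant that isolates the two operations could move that work into timeit's setup argument, as sketched below. The seed, the iteration count and the use of np.isin in place of np.in1d are assumptions of this sketch. No timings are quoted here, since they depend on the machine and the NumPy version.

```python
from timeit import timeit

# Runs once per timeit call and is not included in the measured time.
setup = """
import numpy as np
rng = np.random.default_rng(0)
a = rng.integers(0, 200, 100)
b1 = rng.integers(0, 100, 50)
"""

n = 1000  # number of repetitions, kept small so the sketch runs quickly
t_intersect = timeit("np.intersect1d(b1, a)", setup=setup, number=n)
t_mask = timeit("a[np.isin(a, b1)]", setup=setup, number=n)

print("intersect1d:", t_intersect)
print("isin + index:", t_mask)
```

Note that the two statements do not compute the same thing: the first returns sorted unique common values, the second returns every element of a that occurs in b1, so the comparison is only indicative.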
Mazdak answered Nov 14 '22 23:11