
Efficient way to compute intersecting values between two numpy arrays

I have a bottleneck in my program which is caused by the following:

import numpy

A = numpy.array([10, 4, 6, 7, 1, 5, 3, 4, 24, 1, 1, 9, 10, 10, 18])
B = numpy.array([1, 4, 5, 6, 7, 8, 9])
C = numpy.array([i for i in A if i in B])

The expected outcome for C is the following:

C = [4 6 7 1 5 4 1 1 9] 

Is there a more efficient way of doing this operation?

Note that array A contains repeating values and they need to be taken into account. I wasn't able to use set intersection since taking the intersection will omit the repeating values, returning just [1,4,5,6,7,9].

Also note that this is only a simple demonstration. The actual arrays can hold anywhere from thousands to millions of elements.

user32147 asked Jan 15 '15 16:01



2 Answers

You can use np.in1d:

>>> A[np.in1d(A, B)]
array([4, 6, 7, 1, 5, 4, 1, 1, 9])

np.in1d returns a boolean array indicating whether each value of A also appears in B. This array can then be used to index A and return the common values.
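For reference, here is a self-contained sketch of that masking step using the arrays from the question. It uses np.isin, the newer equivalent of np.in1d (np.in1d is deprecated as of NumPy 2.0); the variable name mask is only illustrative.

```python
import numpy as np

A = np.array([10, 4, 6, 7, 1, 5, 3, 4, 24, 1, 1, 9, 10, 10, 18])
B = np.array([1, 4, 5, 6, 7, 8, 9])

# Boolean mask: True wherever the element of A also occurs in B
mask = np.isin(A, B)

# Boolean indexing keeps A's original order and its repeated values
C = A[mask]
print(C)  # [4 6 7 1 5 4 1 1 9]
```

Because the membership test is vectorized, it avoids the Python-level loop of the list comprehension in the question.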

It's not relevant to your example, but it's also worth mentioning that if A and B each contain unique values then np.in1d can be sped up by setting assume_unique=True:

np.in1d(A, B, assume_unique=True) 
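A small sketch of the assume_unique case, written with np.isin (which accepts the same flag). The arrays A_unique and B_unique are made up for illustration and are not from the question; the flag is only safe when neither input contains duplicates, otherwise the mask may be wrong.

```python
import numpy as np

# Hypothetical inputs with no repeated values in either array
A_unique = np.array([10, 4, 6, 7, 1, 5, 3])
B_unique = np.array([1, 4, 5, 6, 7, 8, 9])

# assume_unique=True lets NumPy skip its internal de-duplication step
mask = np.isin(A_unique, B_unique, assume_unique=True)

common = A_unique[mask]
print(common)  # [4 6 7 1 5]
```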

You might also be interested in np.intersect1d which returns an array of the unique values common to both arrays (sorted by value):

>>> np.intersect1d(A, B)
array([1, 4, 5, 6, 7, 9])

Alex Riley answered Nov 15 '22 13:11
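To make the difference concrete, this sketch runs both approaches on the question's arrays. It uses np.isin for the mask (the newer equivalent of np.in1d); the names with_repeats and unique_sorted are illustrative only.

```python
import numpy as np

A = np.array([10, 4, 6, 7, 1, 5, 3, 4, 24, 1, 1, 9, 10, 10, 18])
B = np.array([1, 4, 5, 6, 7, 8, 9])

# Mask-based selection: keeps A's order and every repeated value
with_repeats = A[np.isin(A, B)]

# intersect1d: each common value once, sorted ascending
unique_sorted = np.intersect1d(A, B)

print(with_repeats)   # [4 6 7 1 5 4 1 1 9]
print(unique_sorted)  # [1 4 5 6 7 9]
```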


Use numpy.in1d:

>>> A[np.in1d(A, B)]
array([4, 6, 7, 1, 5, 4, 1, 1, 9])

Ashwini Chaudhary answered Nov 15 '22 13:11