python: check if a NumPy array contains any element of another array

Tags:

numpy

What is the best way to check if a NumPy array contains any element of another array?

Example:

array1 = [10,5,4,13,10,1,1,22,7,3,15,9]
array2 = [3,4,9,10,13,15,16,18,19,20,21,22,23]

I want to get a True if array1 contains any value of array2, otherwise a False.

248

asked Mar 23 '16 23:03

2 Answers

Using Pandas, you can use isin:

import numpy as np
import pandas as pd

a1 = np.array([10,5,4,13,10,1,1,22,7,3,15,9])
a2 = np.array([3,4,9,10,13,15,16,18,19,20,21,22,23])

>>> pd.Series(a1).isin(a2).any()
True

And using NumPy's in1d function (per the comment from @Norman):

>>> np.any(np.in1d(a1, a2))
True
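
In newer NumPy releases, np.isin is the recommended replacement for np.in1d, which is deprecated as of NumPy 2.0. A minimal sketch of the same check, assuming the a1 and a2 arrays above; the helper name contains_any is hypothetical and not part of NumPy:

```python
import numpy as np

def contains_any(haystack, needles):
    """Return True if any element of `needles` occurs in `haystack`.

    `contains_any` is a hypothetical helper name used for illustration.
    """
    # np.isin returns a boolean mask with the same shape as `haystack`
    return bool(np.isin(haystack, needles).any())

a1 = np.array([10, 5, 4, 13, 10, 1, 1, 22, 7, 3, 15, 9])
a2 = np.array([3, 4, 9, 10, 13, 15, 16, 18, 19, 20, 21, 22, 23])

print(contains_any(a1, a2))          # True
print(contains_any(a1, [100, 200]))  # False
```

Wrapping the result in bool() gives a plain Python True/False rather than a NumPy boolean scalar.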

For small arrays such as those in this example, the set-based solution is the fastest. For larger arrays with no overlap, the pandas and NumPy solutions are faster than the set-based one, and np.intersect1d is the fastest of those tested.

Small arrays (12-13 elements)

%timeit set(array1) & set(array2)
The slowest run took 4.22 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 1.69 µs per loop

%timeit any(i in a1 for i in a2)
The slowest run took 12.29 times longer than the fastest. This could mean that an intermediate result is being cached 
100000 loops, best of 3: 1.88 µs per loop

%timeit np.intersect1d(a1, a2)
The slowest run took 10.29 times longer than the fastest. This could mean that an intermediate result is being cached 
100000 loops, best of 3: 15.6 µs per loop

%timeit np.any(np.in1d(a1, a2))
10000 loops, best of 3: 27.1 µs per loop

%timeit pd.Series(a1).isin(a2).any()
10000 loops, best of 3: 135 µs per loop

Using an array with 100k elements (no overlap):

a3 = np.random.randint(0, 100000, 100000)
a4 = a3 + 100000

%timeit np.intersect1d(a3, a4)
100 loops, best of 3: 13.8 ms per loop    

%timeit pd.Series(a3).isin(a4).any()
100 loops, best of 3: 18.3 ms per loop

%timeit np.any(np.in1d(a3, a4))
100 loops, best of 3: 18.4 ms per loop

%timeit set(a3) & set(a4)
10 loops, best of 3: 23.6 ms per loop

%timeit any(i in a3 for i in a4)
1 loops, best of 3: 34.5 s per loop
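
The %timeit lines above are IPython magics. Outside IPython, a comparable measurement could be sketched with the standard timeit module. This is only a rough harness under stated assumptions: the seed, the repeat counts, and the candidates dictionary are arbitrary choices for illustration, np.isin stands in for np.in1d, and the slow generator-based variant is left out:

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)           # seed chosen arbitrarily
a3 = rng.integers(0, 100_000, 100_000)   # values in [0, 100000)
a4 = a3 + 100_000                        # shifted range, so no overlap

# Each candidate returns a truthy value if the arrays share an element
candidates = {
    "np.intersect1d": lambda: np.intersect1d(a3, a4).size > 0,
    "np.isin": lambda: np.isin(a3, a4).any(),
    "set": lambda: bool(set(a3) & set(a4)),
}

results = {}
for name, fn in candidates.items():
    results[name] = bool(fn())           # record the answer itself
    # best of 3 repeats, 5 calls each, reported per call
    best = min(timeit.repeat(fn, number=5, repeat=3)) / 5
    print(f"{name:>15}: {best * 1e3:.2f} ms per call")
```

Absolute numbers depend on the machine and on the NumPy version, so the ranking may differ from the timings quoted above.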

200

answered Oct 24 '22 16:10

Alexander

You can try this:

>>> array1 = [10,5,4,13,10,1,1,22,7,3,15,9]
>>> array2 = [3,4,9,10,13,15,16,18,19,20,21,22,23]
>>> set(array1) & set(array2)
set([3, 4, 9, 10, 13, 15, 22])

If the result is non-empty, the two arrays have elements in common.

If the result is empty, they have no elements in common.
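
Since the question asks for True or False rather than the common elements, the intersection can be reduced to a boolean. A minimal sketch, assuming the same two lists; the variable names has_common and has_common_alt are only illustrative:

```python
array1 = [10, 5, 4, 13, 10, 1, 1, 22, 7, 3, 15, 9]
array2 = [3, 4, 9, 10, 13, 15, 16, 18, 19, 20, 21, 22, 23]

# An empty set is falsy, so bool() turns the intersection into True/False
has_common = bool(set(array1) & set(array2))

# isdisjoint accepts any iterable and does not build the intersection set
has_common_alt = not set(array1).isdisjoint(array2)

print(has_common, has_common_alt)  # True True
```

The isdisjoint form only needs to know whether a match exists, so it avoids materializing the full set of common values.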

answered Oct 24 '22 15:10

Nilesh
