Most efficient way to implement numpy.in1d for multiple arrays

What is the best way to implement a function that takes an arbitrary number of 1-D arrays and returns a tuple containing the indices of the matching values (if any)?

Here is some pseudo-code of what I want to do:

a = np.array([1, 0, 4, 3, 2])
b = np.array([1, 2, 3, 4, 5])
c = np.array([4, 2])

(ind_a, ind_b, ind_c) = return_equals(a, b, c)
# ind_a = [2, 4]
# ind_b = [1, 3]
# ind_c = [0, 1]

(ind_a, ind_b, ind_c) = return_equals(a, b, c, sorted_by=a)
# ind_a = [2, 4]
# ind_b = [3, 1]
# ind_c = [0, 1]

def return_equals(*args, sorted_by=None):
    ...
Asked by Lukas on May 06 '15 at 16:05


1 Answer

You can use numpy.intersect1d with functools.reduce for this:

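from functools import reduce  # reduce is not a builtin in Python 3
import numpy as np

def return_equals(*arrays):
    matched = reduce(np.intersect1d, arrays)  # values common to every array
    return tuple(np.where(np.in1d(array, matched))[0] for array in arrays)  # one index array per input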
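To make the two steps visible, here is a small walkthrough on the arrays from the question. It is only a sketch: the variable names `arrays`, `matched`, `masks` and `indices` are mine, and np.isin is used in place of np.in1d, which newer NumPy releases deprecate.

```python
from functools import reduce

import numpy as np

a = np.array([1, 0, 4, 3, 2])
b = np.array([1, 2, 3, 4, 5])
c = np.array([4, 2])
arrays = (a, b, c)

# Step 1: fold intersect1d over the inputs to get the sorted, unique
# values that are present in every array.
matched = reduce(np.intersect1d, arrays)
print(matched)  # [2 4]

# Step 2: build a boolean membership mask per array and turn each mask
# into the positions where it is True.
masks = [np.isin(arr, matched) for arr in arrays]
indices = tuple(np.flatnonzero(mask) for mask in masks)
for idx in indices:
    print(idx)  # [2 4], then [1 3], then [0 1]
```

The indices come out in ascending position within each array, which matches the first expected output in the question.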

reduce may be a little slow here because it creates an intermediate NumPy array at every step (with a large number of inputs it may be very slow). We can avoid this by using Python's set and its .intersection() method:

matched = np.array(list(set(arrays[0]).intersection(*arrays[1:])))
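Dropped into a complete function, the set-based line could look roughly like this. The name `return_equals_set` is hypothetical, and the `.tolist()` calls are my addition so that the sets hold plain Python values instead of NumPy scalars. The order of a set is arbitrary, but that does not matter here because np.isin only tests membership.

```python
import numpy as np


def return_equals_set(*arrays):
    """Hypothetical set-based variant; a sketch, not a tuned implementation."""
    if not arrays:
        return ()
    # Intersect the first array with all the others in one call, without
    # building an intermediate NumPy array per step.
    common = set(arrays[0].tolist()).intersection(
        *(arr.tolist() for arr in arrays[1:])
    )
    matched = np.array(list(common))
    # Membership mask per input, converted to index positions.
    return tuple(np.flatnonzero(np.isin(arr, matched)) for arr in arrays)


a = np.array([1, 0, 4, 3, 2])
b = np.array([1, 2, 3, 4, 5])
c = np.array([4, 2])
ind_a, ind_b, ind_c = return_equals_set(a, b, c)
print(ind_a, ind_b, ind_c)  # [2 4] [1 3] [0 1]
```

Whether this actually beats the reduce version depends on the number and size of the inputs, so it is worth timing both on representative data.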

Related GitHub ticket: n-array versions of set operations, especially intersect1d
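The sorted_by option from the question is not covered above. One possible sketch follows, under two assumptions of mine: values are unique within each input array, and sorted_by is one of the inputs (or at least contains every common value). It reorders the common values by their position in sorted_by and then looks each one up with np.argsort and np.searchsorted.

```python
from functools import reduce

import numpy as np


def return_equals(*arrays, sorted_by=None):
    """Sketch only: assumes values are unique within each input array."""
    matched = reduce(np.intersect1d, arrays)  # sorted, unique common values
    if sorted_by is None:
        # Default: indices in ascending position within each array.
        return tuple(np.flatnonzero(np.isin(arr, matched)) for arr in arrays)

    # Put the common values in the order they appear in sorted_by.
    ref = np.asarray(sorted_by)
    matched = ref[np.isin(ref, matched)]

    result = []
    for arr in arrays:
        order = np.argsort(arr, kind="stable")
        # Locate each common value in the sorted view of arr, then map
        # that location back to a position in the original array.
        pos = np.searchsorted(arr, matched, sorter=order)
        result.append(order[pos])
    return tuple(result)


a = np.array([1, 0, 4, 3, 2])
b = np.array([1, 2, 3, 4, 5])
c = np.array([4, 2])
ind_a, ind_b, ind_c = return_equals(a, b, c, sorted_by=a)
print(ind_a, ind_b, ind_c)  # [2 4] [3 1] [0 1]
```

If an array contains repeated values, the sorted_by branch returns only the first occurrence of each common value, so the uniqueness assumption matters.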

Answered by Ashwini Chaudhary on Oct 06 '22 at 00:10