Using numpy isin element-wise between 2D and 1D arrays

Tags: python, python-3.x, numpy, array-broadcasting

I have quite a simple scenario where I'd like to test, separately for each of the two rows of a two-dimensional array, which elements of a larger array it contains - for example:

import numpy as np

full_array = np.array(['A','B','C','D','E','F'])
sub_arrays = np.array([['A','C','F'],
                       ['B','C','E']])
np.isin(full_array, sub_arrays)

This gives me a one-dimensional output:

array([ True,  True,  True, False,  True,  True])

showing whether elements of full_array are present in either of the two sub-arrays. I'd like instead a two-dimensional array showing the same thing for each of the two rows in sub_arrays - so:

array([[ True, False,  True, False, False,  True],
       [False,  True,  True, False,  True, False]])

Hope that makes sense and any help gratefully received.

asked Dec 05 '18 11:12

Chris J Harris

1 Answer

Broadcasting-based approach

A simple approach uses broadcasting: add a trailing axis to sub_arrays, compare the result against full_array, and then apply an any-reduction along axis 1 -

In [140]: (full_array==sub_arrays[...,None]).any(axis=1)
Out[140]: 
array([[ True, False,  True, False, False,  True],
       [False,  True,  True, False,  True, False]])
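
To make the shapes in the broadcasting one-liner explicit, here is a minimal sketch using the arrays from the question. The intermediate names `expanded` and `matches` are illustrative only and are not part of the original answer.

```python
import numpy as np

full_array = np.array(['A', 'B', 'C', 'D', 'E', 'F'])
sub_arrays = np.array([['A', 'C', 'F'],
                       ['B', 'C', 'E']])

# (2, 3) -> (2, 3, 1): the new trailing axis broadcasts against full_array's (6,)
expanded = sub_arrays[..., None]

# Element-wise comparison broadcasts to shape (2, 3, 6)
matches = full_array == expanded

# Collapse the axis that runs over each row's elements -> shape (2, 6)
result = matches.any(axis=1)
```

The intermediate comparison holds one boolean per (row, row element, full_array entry) triple, so its memory use grows with the product of all three sizes.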

With `searchsorted`

Specific case #1

If full_array is sorted and every element of sub_arrays is present somewhere in full_array, we can also use np.searchsorted -

idx = np.searchsorted(full_array, sub_arrays)
out = np.zeros((sub_arrays.shape[0],len(full_array)),dtype=bool)
np.put_along_axis(out, idx, 1, axis=1)
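
As a rough illustration of specific case #1, the sketch below runs those three lines on the question's data, where full_array is already sorted and every value in sub_arrays occurs in it. It assumes NumPy 1.15 or later, which is where np.put_along_axis was introduced.

```python
import numpy as np

full_array = np.array(['A', 'B', 'C', 'D', 'E', 'F'])  # already sorted
sub_arrays = np.array([['A', 'C', 'F'],
                       ['B', 'C', 'E']])

# Each value's position in full_array, one row of positions per row of sub_arrays
idx = np.searchsorted(full_array, sub_arrays)

# Start with all False, then switch on the located positions row by row
out = np.zeros((sub_arrays.shape[0], len(full_array)), dtype=bool)
np.put_along_axis(out, idx, True, axis=1)
```

Here idx holds the column positions [[0, 2, 5], [1, 2, 4]], which are exactly the entries set to True in out.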

Specific case #2

If full_array is sorted but not every element of sub_arrays is guaranteed to be present in full_array, we need one extra step -

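idx = np.searchsorted(full_array, sub_arrays)
idx[idx==len(full_array)] = 0
out = np.zeros((sub_arrays.shape[0],len(full_array)),dtype=bool)
# Write only the confirmed matches, so that an absent value sharing an index
# with a present one can never overwrite a True with a False
mask = full_array[idx] == sub_arrays
out[np.nonzero(mask)[0], idx[mask]] = True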
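
To show why absent values need extra handling, here is a small sketch of what np.searchsorted returns for values that are not in full_array. The names `probe`, `raw_idx` and `found` are hypothetical and chosen only for this illustration.

```python
import numpy as np

full_array = np.array(['A', 'B', 'C', 'D', 'E', 'F'])  # sorted
probe = np.array(['C', 'Z', 'BB'])                      # only 'C' is present

# Insertion points: 'Z' sorts after everything, 'BB' falls between 'B' and 'C'
raw_idx = np.searchsorted(full_array, probe)

# Clamp the out-of-bounds index so it can be used for a lookup
idx = raw_idx.copy()
idx[idx == len(full_array)] = 0

# A value is present only if the entry at its insertion point equals it
found = full_array[idx] == probe
```

In this run 'Z' gets index 6, which is out of bounds, while 'C' and 'BB' both get index 2 with different match flags. That shared index is why it seems safest to write only the True flags into the output rather than every flag.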

Generic case

For the truly generic case, where full_array is not necessarily sorted, we need to pass the sorter argument to searchsorted -

def isin2D(full_array, sub_arrays):
    out = np.zeros((sub_arrays.shape[0],len(full_array)),dtype=bool)
    sidx = full_array.argsort()
    idx = np.searchsorted(full_array, sub_arrays, sorter=sidx)
    idx[idx==len(full_array)] = 0
    idx0 = sidx[idx]
    # Write only the confirmed matches, so that an absent value sharing an
    # index with a present one can never overwrite a True with a False
    mask = full_array[idx0] == sub_arrays
    out[np.nonzero(mask)[0], idx0[mask]] = True
    return out
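
The role of the sorter argument can be sketched on its own: searchsorted then returns positions in the sorted order, and indexing the argsort result maps them back to positions in the original array. The names `pos_sorted` and `pos_original` are illustrative assumptions, not part of the answer.

```python
import numpy as np

full_array = np.array(['E', 'F', 'A', 'B', 'D', 'C'])  # not sorted

# Indices that would sort full_array
sidx = full_array.argsort()

# Positions of the queried values within the sorted view of full_array
pos_sorted = np.searchsorted(full_array, ['C', 'F'], sorter=sidx)

# Map those positions back to indices into the original, unsorted array
pos_original = sidx[pos_sorted]
```

Indexing full_array with pos_original gives back 'C' and 'F', which confirms that the mapping points at the right entries.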

Sample run -

In [214]: full_array
Out[214]: array(['E', 'F', 'A', 'B', 'D', 'C'], dtype='|S1')

In [215]: sub_arrays
Out[215]: 
array([['Z', 'C', 'F'],
       ['B', 'C', 'E']], dtype='|S1')

In [216]: isin2D(full_array, sub_arrays)
Out[216]: 
array([[False,  True, False, False, False,  True],
       [ True, False, False,  True, False,  True]])
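
One way to sanity-check any of these approaches is to compare against a plain per-row np.isin loop, which is slower for many rows but easy to reason about. This is only a sketch: the helper name `isin_rows` is hypothetical, and the data is the sample input above, written as ordinary strings.

```python
import numpy as np

def isin_rows(full_array, sub_arrays):
    """Row-by-row reference: one np.isin call per row of sub_arrays."""
    return np.array([np.isin(full_array, row) for row in sub_arrays])

full_array = np.array(['E', 'F', 'A', 'B', 'D', 'C'])
sub_arrays = np.array([['Z', 'C', 'F'],
                       ['B', 'C', 'E']])

reference = isin_rows(full_array, sub_arrays)

# The broadcasting one-liner should agree with the reference
broadcasted = (full_array == sub_arrays[..., None]).any(axis=1)
agree = np.array_equal(reference, broadcasted)
```

For this input the reference matches the sample output shown for isin2D, and `agree` is True.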

answered Oct 25 '22 00:10

Divakar
