Numpy int array: Find indices of multiple target ints

Tags: python, arrays, numpy

I have a large numpy array (dtype=int) and a set of numbers which I'd like to find in that array, e.g.,

import numpy as np
values = np.array([1, 2, 3, 1, 2, 4, 5, 6, 3, 2, 1])
searchvals = [3, 1]
# result = [0, 2, 3, 8, 10]

The result array doesn't have to be sorted.

Speed is an issue, and since both values and searchvals can be large,

for searchval in searchvals:
    np.where(values == searchval)[0]

doesn't cut it.

Any hints?

asked Jul 08 '16 12:07

Nico Schlömer

2 Answers

Is this fast enough?

>>> np.where(np.in1d(values, searchvals))
(array([ 0,  2,  3,  8, 10]),)
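A minimal sketch of the same idea as a standalone script, assuming a NumPy version that provides np.isin (1.13 or later). NumPy's documentation recommends np.isin over np.in1d in new code, and np.in1d is deprecated as of NumPy 2.0. The names mask and result are only illustrative.

```python
import numpy as np

values = np.array([1, 2, 3, 1, 2, 4, 5, 6, 3, 2, 1])
searchvals = [3, 1]

# np.isin returns a boolean mask with the same shape as `values`,
# True wherever the element occurs in `searchvals`.
mask = np.isin(values, searchvals)

# np.flatnonzero turns the mask into a plain index array
# (np.where(mask) would return a 1-tuple holding the same array).
result = np.flatnonzero(mask)
print(result)  # indices 0, 2, 3, 8, 10
```

The indices come back in ascending order, which is more than the question requires.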

answered Sep 21 '22 09:09

wim

I would say np.in1d is the intuitive solution for a case like this. That said, based on this solution, here's an alternative with np.searchsorted:

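# Indices that sort searchvals, so searchsorted can work on the unsorted input.
sidx = np.argsort(searchvals)
# For every element of values, find its leftmost and rightmost insertion point.
left_idx = np.searchsorted(searchvals, values, sorter=sidx, side='left')
right_idx = np.searchsorted(searchvals, values, sorter=sidx, side='right')
# The two points differ exactly where the element occurs in searchvals.
out = np.where(left_idx != right_idx)[0]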
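To check that the np.searchsorted variant agrees with the membership-mask approach, something like the following sketch could be used. The helper name find_indices_searchsorted, the random test data, and the array sizes are assumptions made for illustration. np.isin is used here as the newer spelling of np.in1d, and np.random.default_rng requires NumPy 1.17 or later.

```python
import numpy as np

def find_indices_searchsorted(values, searchvals):
    """Return indices of `values` whose entries occur in `searchvals` (hypothetical helper)."""
    searchvals = np.asarray(searchvals)
    sidx = np.argsort(searchvals)
    left_idx = np.searchsorted(searchvals, values, sorter=sidx, side='left')
    right_idx = np.searchsorted(searchvals, values, sorter=sidx, side='right')
    # Insertion points differ only for elements that are present in searchvals.
    return np.where(left_idx != right_idx)[0]

# Made-up test data: sizes and value range are arbitrary.
rng = np.random.default_rng(0)
values = rng.integers(0, 1000, size=100_000)
searchvals = rng.integers(0, 1000, size=50)

expected = np.flatnonzero(np.isin(values, searchvals))
actual = find_indices_searchsorted(values, searchvals)
agree = np.array_equal(actual, expected)
print(agree)
```

Which variant is faster depends on the sizes of both arrays and on the NumPy version, so it is worth timing both (for example with timeit) on representative data before choosing.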

answered Sep 21 '22 09:09

Divakar
