Numpy: find first index of value fast

People also ask

How do I get the first element of a NumPy array?

You can access an array element by referring to its index number. The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.

There is a feature request for this scheduled for Numpy 2.0.0: https://github.com/numpy/numpy/issues/2269

Although it is way too late for you, but for future reference: Using numba (1) is the easiest way until numpy implements it. If you use anaconda python distribution it should already be installed. The code will be compiled so it will be fast.

@jit(nopython=True)
def find_first(item, vec):
    """return the index of the first occurence of item in vec"""
    for i in xrange(len(vec)):
        if item == vec[i]:
            return i
    return -1

and then:

>>> a = array([1,7,8,32])
>>> find_first(8,a)
2

I've made a benchmark for several methods:

argwhere
nonzero as in the question
.tostring() as in @Rob Reilink's answer
python loop
Fortran loop

The Python and Fortran code are available. I skipped the unpromising ones like converting to a list.

The results on log scale. X-axis is the position of the needle (it takes longer to find if it's further down the array); last value is a needle that's not in the array. Y-axis is the time to find it.

benchmark results

The array had 1 million elements and tests were run 100 times. Results still fluctuate a bit, but the qualitative trend is clear: Python and f2py quit at the first element so they scale differently. Python gets too slow if the needle is not in the first 1%, whereas f2py is fast (but you need to compile it).

To summarize, f2py is the fastest solution, especially if the needle appears fairly early.

It's not built in which is annoying, but it's really just 2 minutes of work. Add this to a file called search.f90:

subroutine find_first(needle, haystack, haystack_length, index)
    implicit none
    integer, intent(in) :: needle
    integer, intent(in) :: haystack_length
    integer, intent(in), dimension(haystack_length) :: haystack
!f2py intent(inplace) haystack
    integer, intent(out) :: index
    integer :: k
    index = -1
    do k = 1, haystack_length
        if (haystack(k)==needle) then
            index = k - 1
            exit
        endif
    enddo
end

If you're looking for something other than integer, just change the type. Then compile using:

f2py -c -m search search.f90

after which you can do (from Python):

import search
print(search.find_first.__doc__)
a = search.find_first(your_int_needle, your_int_array)

You can convert a boolean array to a Python string using array.tostring() and then using the find() method:

(array==item).tostring().find('\x01')

This does involve copying the data, though, since Python strings need to be immutable. An advantage is that you can also search for e.g. a rising edge by finding \x00\x01

In case of sorted arrays np.searchsorted works.

Related questions
                            
                                How do I pass extra arguments to a Python decorator?
                            
                                Matplotlib discrete colorbar
                            
                                How to set environment variables in PyCharm?
                            
                                How can I use UUIDs in SQLAlchemy?
                            
                                month name to month number and vice versa in python
                            
                                Plot a bar using matplotlib using a dictionary
                            
                                Login credentials not working with Gmail SMTP
                            
                                Check to see if python script is running
                            
                                How to Install pip for python 3.7 on Ubuntu 18?
                            
                                How do I calculate r-squared using Python and Numpy?
                            
                                How do I send a POST request as a JSON?
                            
                                Python: How to determine the language?
                            
                                How to clamp an integer to some range?
                            
                                Cannot display HTML string
                            
                                What is the difference between using loc and using just square brackets to filter for columns in Pandas/Python?
                            
                                Project structure for Google App Engine
                            
                                Flake8: Ignore specific warning for entire file
                            
                                while (1) vs. while(True) -- Why is there a difference (in python 2 bytecode)?
                            
                                What does 'killed' mean when a processing of a huge CSV with Python, which suddenly stops?
                            
                                Nested classes' scope?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Numpy: find first index of value fast

Tags:

python

find

numpy

People also ask

Recent Activity

Donate For Us