Find the index of the n'th item in a list

Tags: performance, python, arrays, indexing, numpy

I want to find the index of the n'th occurrence of an item in a list. e.g.,

x=[False,True,True,False,True,False,True,False,False,False,True,False,True]

What is the index of the n'th True? If I wanted the fifth occurrence (4th if zero-indexed), the answer is 10.

I've come up with:

indargs = [ i for i,a in enumerate(x) if a ]
indargs[n]

Note that x.index returns the first occurrence or the first occurrence after some point, and therefore as far as I can tell is not a solution.

There is also a solution in numpy for cases similar to the above, e.g. using cumsum and where, but I'd like to know if there's a numpy-free way to solve the problem.

I'm concerned about performance since I first encountered this while implementing a Sieve of Eratosthenes for a Project Euler problem, but this is a more general question that I have encountered in other situations.

EDIT: I've gotten a lot of great answers, so I decided to do some performance tests. Below are timeit execution times in seconds for lists of length nelements; the heading of each table gives the occurrence being searched for. The lists are random True/False values (digits 0-9 in the number test). Source code linked below; it's a touch messy. I used short / modified versions of the posters' names to describe the functions, except listcomp, which is the simple list comprehension above.

True Test (100'th True in a list containing True/False)
         nelements      eyquem_occur eyquem_occurrence            graddy            taymon          listcomp       hettinger26         hettinger
             3000:          0.007824          0.031117          0.002144          0.007694          0.026908          0.003563          0.003563
            10000:          0.018424          0.103049          0.002233          0.018063          0.088245          0.003610          0.003769
            50000:          0.078383          0.515265          0.002140          0.078074          0.442630          0.003719          0.003608
           100000:          0.152804          1.054196          0.002129          0.152691          0.903827          0.003741          0.003769
           200000:          0.303084          2.123534          0.002212          0.301918          1.837870          0.003522          0.003601
True Test (1000'th True in a list containing True/False)
         nelements      eyquem_occur eyquem_occurrence            graddy            taymon          listcomp       hettinger26         hettinger
             3000:          0.038461          0.031358          0.024167          0.039277          0.026640          0.035283          0.034482
            10000:          0.049063          0.103241          0.024120          0.049383          0.088688          0.035515          0.034700
            50000:          0.108860          0.516037          0.023956          0.109546          0.442078          0.035269          0.035373
           100000:          0.183568          1.049817          0.024228          0.184406          0.906709          0.035135          0.036027
           200000:          0.333501          2.141629          0.024239          0.333908          1.826397          0.034879          0.036551
True Test (20000'th True in a list containing True/False)
         nelements      eyquem_occur eyquem_occurrence            graddy            taymon          listcomp       hettinger26         hettinger
             3000:          0.004520          0.004439          0.036853          0.004458          0.026900          0.053460          0.053734
            10000:          0.014925          0.014715          0.126084          0.014864          0.088470          0.177792          0.177716
            50000:          0.766154          0.515107          0.499068          0.781289          0.443654          0.707134          0.711072
           100000:          0.837363          1.051426          0.501842          0.862350          0.903189          0.707552          0.706808
           200000:          0.991740          2.124445          0.498408          1.008187          1.839797          0.715844          0.709063
Number Test (750'th 0 in a list containing 0-9)
         nelements      eyquem_occur eyquem_occurrence            graddy            taymon          listcomp       hettinger26         hettinger
             3000:          0.026996          0.026887          0.015494          0.030343          0.022417          0.026557          0.026236
            10000:          0.037887          0.089267          0.015839          0.040519          0.074941          0.026525          0.027057
            50000:          0.097777          0.445236          0.015396          0.101242          0.371496          0.025945          0.026156
           100000:          0.173794          0.905993          0.015409          0.176317          0.762155          0.026215          0.026871
           200000:          0.324930          1.847375          0.015506          0.327957          1.536012          0.027390          0.026657

Hettinger's itertools solution is almost always the best. taymon's and graddy's solutions are next best for most situations, though the list comprehension approach can be better for short arrays when you want the n'th instance such that n is high or lists in which there are fewer than n occurrences. If there's a chance that there are fewer than n occurrences, the initial count check saves time. Also, graddy's is more efficient when searching for numbers instead of True/False... not clear why that is. eyquem's solutions are essentially equivalent to others with slightly more or less overhead; eyquem_occur is approximately the same as taymon's solution, while eyquem_occurrence is similar to listcomp.

307

asked Dec 01 '11 05:12

keflavich

6 Answers

The answer from @Taymon using list.index was great.

FWIW, here's a functional approach using the itertools module. It works with any iterable input, not just lists:

>>> from itertools import compress, count, imap, islice
>>> from functools import partial
>>> from operator import eq

>>> def nth_item(n, item, iterable):
...     indices = compress(count(), imap(partial(eq, item), iterable))
...     return next(islice(indices, n, None), -1)

The example is nice because it shows off how to effectively combine Python's functional toolset. Note that once the pipeline is set up, there are no trips around Python's eval loop: everything gets done at C speed, with a tiny memory footprint, with lazy evaluation, with no variable assignments, and with separately testable components. IOW, it is everything functional programmers dream about :-)

Sample run:

>>> x = [False,True,True,False,True,False,True,False,False,False,True,False,True]
>>> nth_item(50, True, x)
-1
>>> nth_item(0, True, x)
1
>>> nth_item(1, True, x)
2
>>> nth_item(2, True, x)
4
>>> nth_item(3, True, x)
6
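
For readers on Python 3, where itertools.imap no longer exists and the built-in map is already lazy, the same pipeline might be written as below. This is a sketch of a port, not the answerer's own code; the only intended change is swapping imap for map.

```python
from functools import partial
from itertools import compress, count, islice
from operator import eq

def nth_item(n, item, iterable):
    """Return the index of the n'th (zero-based) occurrence of item, or -1."""
    # map() yields True/False lazily; compress() keeps the matching counters.
    indices = compress(count(), map(partial(eq, item), iterable))
    # Skip the first n matches and take the next one, defaulting to -1.
    return next(islice(indices, n, None), -1)

x = [False, True, True, False, True, False, True,
     False, False, False, True, False, True]
print(nth_item(4, True, x))   # 10
print(nth_item(50, True, x))  # -1
```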

176

answered Oct 23 '22 18:10

Raymond Hettinger

I can't say for certain that this is the fastest way, but I imagine it'd be pretty good:

i = -1
for j in xrange(n):
    i = x.index(True, i + 1)

The answer is i.
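
As a sketch of how the loop above could be packaged for Python 3 (range instead of xrange), the hypothetical helper nth_index below wraps it in a function. The name, the argument order, and the choice to return -1 when there are fewer than n occurrences are assumptions made for illustration; the original loop simply lets list.index raise ValueError in that case.

```python
def nth_index(x, item, n):
    """Index of the n'th occurrence of item in list x (n counts from 1), or -1."""
    i = -1
    try:
        for _ in range(n):
            # Resume the search just past the previous hit.
            i = x.index(item, i + 1)
    except ValueError:
        # list.index raises ValueError once no further match exists.
        return -1
    return i

x = [False, True, True, False, True, False, True,
     False, False, False, True, False, True]
print(nth_index(x, True, 5))   # 10
print(nth_index(x, True, 50))  # -1
```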

answered Oct 23 '22 16:10

Taymon

[y for y in enumerate(x) if y[1]==True][z][0]

Note: here z is the zero-based index of the occurrence you want (z = 0 for the first one).

answered Oct 23 '22 18:10

avasal

If you're concerned with performance, you are best off seeing if there are algorithmic optimizations you can make. For example, if you are calling this function many times on the same values, you may wish to cache previous computations (e.g. once you find the 50th occurrence of an element, you can find any previous occurrence in O(1) time).

Otherwise, you want to make sure your technique works on (lazy) iterators.

The most *in*elegant and performance-happy way I can think of implementing it is as:

def indexOfNthOccurrence(N, element, stream):
    """for N>0, returns index or None"""
    seen = 0
    for i,x in enumerate(stream):
        if x==element:
            seen += 1
            if seen==N:
                return i

(if you really care about the performance difference between enumerate and other techniques, you will need to resort to profiling, especially with the numpy functions, which may resort to C)

To preprocess the entire stream and support O(1) queries:

from collections import defaultdict

# Map each element to the list of indices at which it occurs.
cache = defaultdict(list)
for i,elem in enumerate(YOUR_LIST):
    cache[elem].append(i)

# e.g. [3,2,3,2,5,5,1]
#       0 1 2 3 4 5 6
# cache: {3:[0,2], 1:[6], 2:[1,3], 5:[4,5]}
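
To show how such a cache might then be queried, here is a minimal sketch. The helper names build_index and nth_from_cache are hypothetical, as is the convention that n counts from 1 and that a missing occurrence yields None.

```python
from collections import defaultdict

def build_index(items):
    """Map each distinct element to the list of positions where it occurs."""
    positions = defaultdict(list)
    for i, elem in enumerate(items):
        positions[elem].append(i)
    return positions

def nth_from_cache(positions, elem, n):
    """n counts from 1; returns None if elem has fewer than n occurrences."""
    # .get() avoids creating an empty entry for elements never seen.
    hits = positions.get(elem, [])
    return hits[n - 1] if 1 <= n <= len(hits) else None

cache = build_index([3, 2, 3, 2, 5, 5, 1])
print(nth_from_cache(cache, 3, 2))  # 2
print(nth_from_cache(cache, 5, 1))  # 4
print(nth_from_cache(cache, 1, 2))  # None
```

Building the cache is a single pass over the list; each lookup afterwards is one dictionary access plus one list index.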

answered Oct 23 '22 16:10

ninjagecko

The first solution, function occurence(), builds a list of all matching indexes and returns the element at position nth-1 of that list.

The second, function occur(), uses generators, because I love them, and I think it fulfills functional programmers' dreams too.

S = 'stackoverflow.com is a fantastic amazing site'
print 'object S is string %r' % S
print "indexes of 'a' in S :",[indx for indx,elem in enumerate(S) if elem=='a']

def occurence(itrbl,x,nth):
    return [indx for indx,elem in enumerate(itrbl)
            if elem==x ][nth-1] if x in itrbl \
           else None

def occur(itrbl,x,nth):
    return (i for pos,i in enumerate(indx for indx,elem in enumerate(itrbl)
                                     if elem==x)
            if pos==nth-1).next() if x in itrbl\
            else   None

print "\noccurence(S,'a',4th) ==",occurence(S,'a',4)
print "\noccur(S,'a',4th) ==",occur(S,'a',4)

result

object S is string 'stackoverflow.com is a fantastic amazing site'
indexes of 'a' in S : [2, 21, 24, 27, 33, 35]

occurence(S,'a',4th) == 27

occur(S,'a',4th) == 27

The second solution seems complex but it isn't really. It doesn't need to run completely through the iterable: the process stops as soon as the wanted occurrence is found.
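
The code above is Python 2 (print statements and the generator's .next() method). A rough Python 3 variation of the generator idea is sketched below; it is not the answer's exact code. It assumes it is acceptable to drop the "x in itrbl" pre-check, which would consume a one-shot iterator, and to return None when there are fewer than nth occurrences.

```python
from itertools import count, islice

def occur(itrbl, x, nth):
    """Index of the nth (1-based) occurrence of x, or None if there is none."""
    matches = (indx for indx, elem in enumerate(itrbl) if elem == x)
    # islice skips nth-1 matches lazily; next() stops at the first one left.
    return next(islice(matches, nth - 1, None), None)

S = 'stackoverflow.com is a fantastic amazing site'
print(occur(S, 'a', 4))      # 27
print(occur(S, 'a', 10))     # None
print(occur(count(), 7, 1))  # 7, even though count() never ends
```

The last call illustrates the early stop: the input is an endless counter, yet the function returns as soon as the wanted match is reached.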

answered Oct 23 '22 18:10

eyquem

Here is another way to find the nth occurrence of x in a list itrbl:

def nthoccur(nth,x,itrbl):
    count,index = 0,0
    while count < nth:
        if index > len(itrbl) - 1:
            return None
        elif itrbl[index] == x:
            count += 1
            index += 1
        else:
            index += 1
    return index - 1

answered Oct 23 '22 18:10

apolune
