 

Python idiom for counting loop execution

If looping over a list/tuple/sequence, you can use len(...) to infer how many times the loop was executed. But when looping over an iterator, you cannot.

[Update for clarity: I am thinking about single-use finite iterators where I want to do computation on the items AND count them at the same time.]

I currently use an explicit counter variable as in the following example:

def some_function(some_argument):
    pass


some_iterator = iter("Hello world")

count = 0
for value in some_iterator:
    some_function(value)
    count += 1

print("Looped %i times" % count)

Given there are 11 characters in "Hello world", the expected output here is:

Looped 11 times

I have also considered this shorter alternative using enumerate(...) but I do not find this as clear:

def some_function(some_argument):
    pass


some_iterator = iter("Hello world")

count = 0  # Added for special case, see note below
for count, value in enumerate(some_iterator, start=1):
    some_function(value)

print("Looped %i times" % count)

[Update for reference: @mata spotted that as originally written this second example would fail if the iterator is empty. Inserting count = 0 solves this, or we can use the for ... else ... structure to handle this corner case.]

The loop body never uses the index from enumerate(...); leaving count set to the number of iterations afterwards is almost a side effect. To me this is quite unclear, so I prefer the first version with the explicit increment.

Is there an accepted Pythonic way to do this (ideally for both Python 3 and Python 2 code)?

peterjc asked Feb 13 '17 15:02




2 Answers

You can keep the convenience of enumerate and still have the counter defined when the loop never runs by adding one line:

count = 0  # Ensures count is defined even if data is empty.
for count, item in enumerate(data, start=1):
    doSomethingTo(item)
print("Did it %d times" % count)  # Works in both Python 2 and Python 3.

If all you need is to count the items in an iterator, without doing anything with them and without building a list, you can do it simply:

count = sum(1 for ignored_item in data)  # count a 1 for each item
9000 answered Sep 20 '22 10:09


There are several ways to count the items in a generator, but every one of them consumes it: once counted, the original generator is exhausted.

length = sum(1 for x in gen)
length = max(c for c, _ in enumerate(gen, 1))
length = len(list(gen))
  1. The first version handles an empty generator well and returns zero, as expected.
  2. The second version raises a ValueError when the generator is empty or already exhausted, because max() has nothing to take a maximum of. That can be useful when the generator should never be empty: execution stops and you can investigate what went wrong.
  3. The third version builds the whole list in memory, which can be wasteful if gen produces a lot of data, but it is the easiest of the three to read.

All of these will work only for finite generators.
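To make the differences between the three one-liners concrete, here is a minimal sketch that runs each of them on a non-empty and an empty generator. The helper make_gen is a hypothetical name introduced only for this illustration; it is not part of the original answer.

```python
def make_gen(text):
    """Hypothetical helper: return a fresh single-use generator over text."""
    return (ch for ch in text)


# Approach 1: sum() over ones; yields 0 for an empty generator.
count_sum = sum(1 for _ in make_gen("Hello world"))
count_sum_empty = sum(1 for _ in make_gen(""))

# Approach 2: max() over the enumerate indices; raises ValueError when empty.
count_max = max(c for c, _ in enumerate(make_gen("Hello world"), 1))
try:
    max(c for c, _ in enumerate(make_gen(""), 1))
    max_raised = False
except ValueError:
    max_raised = True

# Approach 3: materialise a list; simple, but holds every item in memory.
count_list = len(list(make_gen("Hello world")))

# Counting consumes the generator: a second pass sees nothing.
gen = make_gen("Hello world")
first_pass = sum(1 for _ in gen)
second_pass = sum(1 for _ in gen)

print(count_sum, count_sum_empty, count_max, max_raised, count_list)
print(first_pass, second_pass)
```

The last two lines of the sketch show the exhaustion point made above: the same generator object reports 11 items on the first pass and 0 on the second.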

If you want to calculate the 'length' of the iterator while looping over it, you can do this:

length = 0
for length, data in enumerate(gen, 1):
    pass  # do stuff with data here

Now length equals the number of elements the generator produced. You don't have to increment length manually, because enumerate rebinds it on every iteration and the name stays valid after the loop. Note that data is only defined afterwards if the loop ran at least once.
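The enumerate pattern above can be wrapped in a small function so the pre-initialised counter is not forgotten. This is only a sketch: process_and_count and its parameters are names assumed for this example, not something from the original answer.

```python
def process_and_count(iterable, func):
    """Apply func to every item and return how many items were seen.

    Hypothetical helper for illustration only.
    """
    count = 0  # Stays 0 if the iterable yields nothing.
    for count, item in enumerate(iterable, 1):
        func(item)
    return count


seen = []
n = process_and_count(iter("Hello world"), seen.append)
n_empty = process_and_count(iter(""), seen.append)

print("Looped %i times" % n)
print("Looped %i times" % n_empty)
```

Because the counter is initialised before the loop, the empty-iterator case returns 0 instead of raising a NameError.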


EDIT: if you want to call a function on each value and discard its return value (the function can still record results by appending to a list passed as one of its arguments), you can try this:

length = sum(1 | bool(function(x)) for x in gen)

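This calculates the length while applying function to each element of the generator: 1 | bool(...) evaluates to 1 whatever the function returns, so each element contributes exactly one to the sum. Still, using enumerate looks like a better idea.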
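As a quick check of the expression above, the following sketch uses a stand-in function that records its argument and returns a falsy value. The name calls and the body of function are assumptions made for this example.

```python
calls = []


def function(x):
    """Stand-in for real work: record the argument, return a falsy value."""
    calls.append(x)
    return None


gen = iter([0, 1, 2])

# 1 | bool(...) is 1 for both True and False, so every item adds one.
length = sum(1 | bool(function(x)) for x in gen)

print(length, calls)
```

The function is called once per element even though its return value never changes the count, which is why the trick works and also why it is hard to read.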

ForceBru answered Sep 19 '22 10:09