
How to count the items in a generator consumed by other code

I'm creating a generator that gets consumed by another function, but I'd still like to know how many items were generated:

import sys

lines = (line.rstrip('\n') for line in sys.stdin)
process(lines)  # the generator is consumed by other code
print("Processed {} lines.".format( ? ))

The best I can come up with is to wrap the generator with a class that keeps a count, or maybe turn it inside out and send() things in. Is there an elegant and efficient way to see how many items a generator produced when you're not the one consuming it in Python 2?

Edit: Here's what I ended up with:

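import itertools
import operator
from collections import Iterable  # Python 2 location of the ABC


class Count(Iterable):
    """Wrap an iterable (typically a generator) and provide a ``count``
    field counting the number of items.

    Accessing the ``count`` field before iteration is finished will
    invalidate the count.
    """
    def __init__(self, iterable):
        self._iterable = iterable
        self._counter = itertools.count()

    def __iter__(self):
        # Pair each item with the next counter value, then drop the counter
        # value; izip advances the counter once per item it yields.
        pairs = itertools.izip(self._iterable, self._counter)
        return itertools.imap(operator.itemgetter(0), pairs)

    @property
    def count(self):
        # The counter's next value equals the number of items yielded so far.
        # Freeze it with repeat() so later reads return the same number.
        self._counter = itertools.repeat(self._counter.next())
        return self._counter.next()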
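
For comparison, the "wrap the generator with a class that keeps a count" idea can also be sketched without itertools. This is only an illustration, not the code above: the names `CountingIterator` and the stand-in `process` are hypothetical, and the sample input replaces `sys.stdin`. It is written to run on Python 3, with a `next` alias so it should also work on Python 2.

```python
class CountingIterator(object):
    """Hypothetical helper: wrap an iterable and count the items handed out."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._it)  # StopIteration propagates when exhausted
        self.count += 1        # only count items that were actually produced
        return item

    next = __next__  # Python 2 spelling of the iterator protocol


def process(lines):
    # Stand-in for the consuming code, which is not shown in the question.
    for _ in lines:
        pass


sample = ["a\n", "b\n", "c\n"]  # illustrative input instead of sys.stdin
lines = CountingIterator(line.rstrip('\n') for line in sample)
process(lines)
message = "Processed {} lines.".format(lines.count)
print(message)
```

Unlike the `Count` class, reading `count` partway through iteration does not invalidate it here; it simply reports how many items have been handed out so far.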
Jay Hacker asked Jun 10 '11 16:06


2 Answers

If you don't care that you are consuming the generator, you can just do:

sum(1 for x in gen)
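
To make the trade-off concrete, here is a small sketch in which `gen` is an illustrative stand-in generator. Counting this way exhausts the generator, so nothing is left for any other code to consume afterwards.

```python
gen = (n * n for n in range(5))  # illustrative generator, not from the question

total = sum(1 for _ in gen)  # walks the whole generator, discarding each item
leftover = list(gen)         # the generator is now exhausted

print(total)
print(leftover)
```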
PaulMcG answered Oct 27 '22 02:10



If you don't need to return the count and just want to log it, you can use a finally block:

def generator():
    i = 0  # number of items yielded so far
    try:
        for x in range(10):
            i += 1
            yield x
    finally:
        # Runs when the generator is exhausted (or closed early).
        print('{} iterations'.format(i))

print([n for n in generator()])

Which produces:

10 iterations
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
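
The same pattern can hand the count back to the caller instead of printing it. The following is a minimal sketch, assuming a hypothetical helper `counted` and a caller-supplied `report` callback, neither of which appears in the answer above. It also shows that the finally block runs when the consumer stops early and the generator is closed.

```python
def counted(iterable, report):
    """Yield items from iterable; call report(n) once when iteration ends."""
    n = 0
    try:
        for item in iterable:
            n += 1
            yield item
    finally:
        # Reached on normal exhaustion and also when close() is called.
        report(n)


counts = []

# Full consumption: the callback receives the total number of items.
full = list(counted(range(10), counts.append))

# Early stop: take three items, then close the generator explicitly.
gen = counted(range(10), counts.append)
first_three = [next(gen) for _ in range(3)]
gen.close()

print(counts)
```

If the generator is abandoned without being closed, the callback may only fire when the generator object is garbage collected, so an explicit `close()` is the safer choice when the count is needed promptly.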
Bracken answered Oct 27 '22 02:10
