Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Length of generator output [duplicate]

People also ask

How many times can you iterate through a generator?

This is because generators, like all iterators, can be exhausted. Unless your generator is infinite, you can iterate through it one time only. Once all values have been evaluated, iteration will stop and the for loop will exit.


The easiest way is probably just sum(1 for _ in gen) where gen is your generator.


So, for those who would like to know the summary of that discussion. The final top scores for counting a 50 million-lengthed generator expression using:

  • len(list(gen)),
  • len([_ for _ in gen]),
  • sum(1 for _ in gen),
  • ilen(gen) (from more_itertool),
  • reduce(lambda c, i: c + 1, gen, 0),

sorted by performance of execution (including memory consumption), will make you surprised:

```

1: test_list.py:8: 0.492 KiB

gen = (i for i in data*1000); t0 = monotonic(); len(list(gen))

('list, sec', 1.9684218849870376)

2: test_list_compr.py:8: 0.867 KiB

gen = (i for i in data*1000); t0 = monotonic(); len([i for i in gen])

('list_compr, sec', 2.5885991149989422)

3: test_sum.py:8: 0.859 KiB

gen = (i for i in data*1000); t0 = monotonic(); sum(1 for i in gen); t1 = monotonic()

('sum, sec', 3.441088170016883)

4: more_itertools/more.py:413: 1.266 KiB

d = deque(enumerate(iterable, 1), maxlen=1)

test_ilen.py:10: 0.875 KiB
gen = (i for i in data*1000); t0 = monotonic(); ilen(gen)

('ilen, sec', 9.812256851990242)

5: test_reduce.py:8: 0.859 KiB

gen = (i for i in data*1000); t0 = monotonic(); reduce(lambda counter, i: counter + 1, gen, 0)

('reduce, sec', 13.436614598002052) ```

So, len(list(gen)) is the most frequent and less memory consumable


There isn't one because you can't do it in the general case - what if you have a lazy infinite generator? For example:

def fib():
    a, b = 0, 1
    while True:
        a, b = b, a + b
        yield a

This never terminates but will generate the Fibonacci numbers. You can get as many Fibonacci numbers as you want by calling next().

If you really need to know the number of items there are, then you can't iterate through them linearly one time anyway, so just use a different data structure such as a regular list.


def count(iter):
    return sum(1 for _ in iter)

Or better yet:

def count(iter):
    try:
        return len(iter)
    except TypeError:
        return sum(1 for _ in iter)

If it's not iterable, it will throw a TypeError.

Or, if you want to count something specific in the generator:

def count(iter, key=None):
    if key:
        if callable(key):
            return sum(bool(key(x)) for x in iter)
        return sum(x == key for x in iter)
    try:
        return len(iter)
    except TypeError:
        return sum(1 for _ in iter)