This is because generators, like all iterators, can be exhausted. Unless your generator is infinite, you can iterate through it one time only. Once all values have been evaluated, iteration will stop and the for loop will exit.
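A minimal sketch of what exhaustion looks like in practice (the name `squares` is only illustrative):

```python
# A generator expression can be consumed exactly once.
squares = (n * n for n in range(3))

first_pass = list(squares)   # consumes every value
second_pass = list(squares)  # nothing is left to yield

print(first_pass)   # [0, 1, 4]
print(second_pass)  # []
```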
The easiest way is probably just sum(1 for _ in gen), where gen is your generator.
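As a quick illustration, assuming a small made-up generator named `gen`, the count is computed one item at a time without building an intermediate list:

```python
# Hypothetical sample data; any iterable works the same way.
gen = (word.upper() for word in ["a", "b", "c", "d"])

# Add 1 for every item the generator yields.
total = sum(1 for _ in gen)

print(total)  # 4
```

Keep in mind that counting this way consumes the generator, so the items themselves are gone afterwards.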
For those who would like a summary of that discussion, here are the final timings for counting a 50-million-item generator expression using len(list(gen)), len([_ for _ in gen]), sum(1 for _ in gen), ilen(gen) (from more_itertools), and reduce(lambda c, i: c + 1, gen, 0), sorted by execution performance (including memory consumption). The results may surprise you:
```
gen = (i for i in data*1000); t0 = monotonic(); len(list(gen))
('list, sec', 1.9684218849870376)
gen = (i for i in data*1000); t0 = monotonic(); len([i for i in gen])
('list_compr, sec', 2.5885991149989422)
gen = (i for i in data*1000); t0 = monotonic(); sum(1 for i in gen); t1 = monotonic()
('sum, sec', 3.441088170016883)
d = deque(enumerate(iterable, 1), maxlen=1)
test_ilen.py:10: 0.875 KiB
gen = (i for i in data*1000); t0 = monotonic(); ilen(gen)
('ilen, sec', 9.812256851990242)
gen = (i for i in data*1000); t0 = monotonic(); reduce(lambda counter, i: counter + 1, gen, 0)
('reduce, sec', 13.436614598002052)
```
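So len(list(gen)) is the most commonly used approach and was also the fastest in this test. The benchmark's author additionally reports it as the least memory-consuming, even though it builds the entire list in memory before taking its length.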
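The `d = deque(enumerate(iterable, 1), maxlen=1)` line in the output above hints at how a constant-memory counter can work: number the items from 1 and keep only the last pair. A sketch of that idea follows; the name `count_with_deque` is hypothetical, and this is not presented as the actual more_itertools code.

```python
from collections import deque


def count_with_deque(iterable):
    """Count items while holding only the last (index, item) pair."""
    # enumerate numbers the items from 1; maxlen=1 discards all but the last.
    last = deque(enumerate(iterable, 1), maxlen=1)
    # An empty iterable leaves the deque empty, so the count is 0.
    return last[0][0] if last else 0


print(count_with_deque(i for i in range(1000)))  # 1000
print(count_with_deque(iter([])))                # 0
```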
There isn't one because you can't do it in the general case - what if you have a lazy infinite generator? For example:
def fib():
    a, b = 0, 1
    while True:
        a, b = b, a + b
        yield a
This never terminates but will generate the Fibonacci numbers. You can get as many Fibonacci numbers as you want by calling next().
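A short sketch of pulling a bounded number of values from the infinite generator, using next() for single values and itertools.islice for a batch (the use of islice is an addition here, not something the answer itself mentions):

```python
from itertools import islice


def fib():
    a, b = 0, 1
    while True:
        a, b = b, a + b
        yield a


gen = fib()
print(next(gen))             # 1
print(next(gen))             # 1
print(list(islice(gen, 5)))  # [2, 3, 5, 8, 13]
```

Calling len() or sum(1 for _ in fib()) on this generator would never return, which is why a general-purpose length cannot exist.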
If you really need to know the number of items there are, then you can't iterate through them linearly one time anyway, so just use a different data structure such as a regular list.
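For instance, materializing the values once gives you both the count and the ability to iterate again. The generator `read_records` below is a hypothetical stand-in for any finite lazy source:

```python
def read_records():
    # Hypothetical finite generator used only for illustration.
    for i in range(5):
        yield {"id": i}


# Build the list once; afterwards len() and repeated iteration both work.
records = list(read_records())

print(len(records))                  # 5
print([r["id"] for r in records])    # [0, 1, 2, 3, 4]
print(len(records))                  # still 5, the list is not consumed
```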
def count(iter):
    return sum(1 for _ in iter)
Or better yet:
def count(iter):
    try:
        return len(iter)
    except TypeError:
        return sum(1 for _ in iter)
If the argument is not iterable, this raises a TypeError.
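A small usage sketch of the len()-with-fallback version, showing the three cases. The parameter is renamed `iterable` here only to avoid shadowing the built-in iter; that rename is an assumption of this sketch.

```python
def count(iterable):
    try:
        # Sized containers (list, tuple, str, ...) answer immediately.
        return len(iterable)
    except TypeError:
        # Generators have no len(); fall back to consuming them.
        return sum(1 for _ in iterable)


print(count([10, 20, 30]))             # 3, via len()
print(count(x for x in range(10)))     # 10, via the sum() fallback

try:
    count(42)  # an int has no len() and cannot be iterated either
except TypeError as exc:
    print("TypeError:", exc)
```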
Or, if you want to count something specific in the generator:
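def count(iter, key=None):
    # Test against None so that falsy keys such as 0 or "" still work.
    if key is not None:
        if callable(key):
            # Count the items for which the predicate is truthy.
            return sum(bool(key(x)) for x in iter)
        # Count the items equal to the given value.
        return sum(x == key for x in iter)
    try:
        return len(iter)
    except TypeError:
        return sum(1 for _ in iter)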
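A usage sketch of the keyed counter. Two things here are assumptions of this sketch rather than part of the answer: the parameter is named `iterable`, and the key is tested with `is not None` so that a falsy key such as 0 is still counted.

```python
def count(iterable, key=None):
    if key is not None:
        if callable(key):
            # Predicate: count items for which key(x) is truthy.
            return sum(bool(key(x)) for x in iterable)
        # Plain value: count items equal to key.
        return sum(x == key for x in iterable)
    try:
        return len(iterable)
    except TypeError:
        return sum(1 for _ in iterable)


print(count([1, 2, 2, 3], key=2))                              # 2
print(count((x for x in range(10)), key=lambda x: x % 2 == 0)) # 5
print(count([0, 1, 0], key=0))                                 # 2
print(count(x for x in "hello"))                               # 5
```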