I came across this code snippet for round robin in https://more-itertools.readthedocs.io/en/latest/api.html, but I couldn't understand how the last line nexts = cycle(islice(nexts, pending))
removes the exhausted generator from nexts iterator. To my understanding, this removes the last one from input but shouldn't we remove the first one, i.e. the current one that throws this exception?
def roundrobin(*iterables):
"""Yields an item from each iterable, alternating between them.
>>> list(roundrobin('ABC', 'D', 'EF'))
['A', 'D', 'E', 'B', 'F', 'C']
This function produces the same output as :func:`interleave_longest`, but
may perform better for some inputs (in particular when the number of
iterables is small).
"""
# Recipe credited to George Sakkis
pending = len(iterables)
if PY2:
nexts = cycle(iter(it).next for it in iterables)
else:
nexts = cycle(iter(it).__next__ for it in iterables)
while pending:
try:
for next in nexts:
yield next()
except StopIteration:
pending -= 1
nexts = cycle(islice(nexts, pending))
I'll break it up for you, step by step:
>>> list(roundrobin('ABC', 'D', 'EF'))
pending = 3 # len(('ABC', 'D', 'EF'))
# nexts roughly equivalent to:
['ABC', 'D', 'EF', 'ABC', 'D', 'EF', 'ABC', ...]
***
# the following get yielded
'A' from 'ABC' # nexts: ['D' , 'EF', 'BC', 'D' , 'EF', ...]
'D' from 'D' # nexts: ['EF', 'BC', '' , 'EF', 'BC', ...]
'E' from 'EF' # nexts: ['BC', '' , 'F' , 'BC', '' , ...]
'B' from 'BC' # nexts: ['' , 'F' , 'C' , '' , 'F' , ...]
# StopIteration was raised by what used to be "B"
SI from '' # nexts: ['F', 'C', '', 'F', 'C', '', 'F', ...]
# ^ index 2 (`pending`[see the following line])
# pending -= 1
pending = 2
# when islice() is used with one argument, it defaults to the "stop" index
# so islice() returns ['F', 'C']
# and cycle() converts it to ['F', 'C', 'F', 'C', ...]
pending = 2
nexts: ['F', 'C', 'F', 'C', ...]
# Go back to *** and continue until pending = 0
And that's how the last line removes the exhausted iterable.
Main idea:
for next in nexts:
yield next()
Even if StopIteration
is raised, the exhausted iterable had already been used from nexts
.
So instead of the exhausted iterable remaining as the first item in nexts
:
nexts = ['', 'F', 'C', '', 'F', 'C', '', 'F', ...]
The first item of nexts
is actually the next item:
nexts = ['F', 'C', '', 'F', 'C', '', 'F', ...]
Because cycle()
creates an infinite iterator, you are not always slicing from the "beginning" of iterables
when islice()
is used in the final line. To simplify a bit, nexts
as it is originally defined will "loop back" to the beginning of iterables
:
try:
for next in nexts:
print(next())
except StopIteration:
print('decrementing...')
A
D
E
B
decrementing...
The goal at this point is to remove the exhausted iterator that was formed from 'D'
. The key is that you're now "at" 'EF'
, with the iterator that was formed from 'ABC'
being next in line, so islice(nexts, pending)
will cycle from 'EF'
to 'ABC'
.
Here is a related example:
x = cycle('abcd')
# ^ |
# |__|
for _ in range(3):
x.__next__()
list(islice(x, 2))
# ['d', 'a']
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With