Suppose I have an iterator that, as elements are consumed from it, performs some side effect, such as modifying a list. If I define a list l and call l.extend(iterator), is it guaranteed that extend will push elements onto l one by one, as elements from the iterator are consumed, as opposed to being kept in a buffer and then pushed on all at once?
I did a quick test in Python 3.7 on my computer, and list.extend seems to be lazy based on that test. (See code below.) Is this guaranteed by the spec, and if so, where in the spec is that mentioned?
(Also, feel free to criticize me and say "this is not Pythonic, you fool!", though if you do, I'd appreciate it if you also answered the question. Part of why I'm asking is my own curiosity.)
Say I define an iterator that, as it runs, prints which of its values have already been pushed onto a list:
l = []

def iterator(k):
    for i in range(5):
        print([j in k for j in range(5)])
        yield i

l.extend(iterator(l))
Here are examples of possible extend implementations, non-lazy (i.e. buffered) vs. lazy:
def extend_nonlazy(l, iterator):
    l += list(iterator)

def extend_lazy(l, iterator):
    for i in iterator:
        l.append(i)
Here's what happens when I run both of these implementations of extend.
Non-lazy:
l = []
extend_nonlazy(l, iterator(l))
# output
[False, False, False, False, False]
[False, False, False, False, False]
[False, False, False, False, False]
[False, False, False, False, False]
[False, False, False, False, False]
# l = [0, 1, 2, 3, 4]
Lazy:
l = []
extend_lazy(l, iterator(l))
[False, False, False, False, False]
[True, False, False, False, False]
[True, True, False, False, False]
[True, True, True, False, False]
[True, True, True, True, False]
# l = [0, 1, 2, 3, 4]
My own experimentation shows that native list.extend seems to work like the lazy version, but my question is: does the Python spec guarantee that?
I don't think the issue is lazy vs. non-lazy: in both slice assignment and list.extend, you need all the elements of the iterator, and in the normal case these elements are consumed in one go. The issue you raised is more important: are these operations atomic or not? See one definition of "atomicity" in Wikipedia:
Atomicity guarantees that each transaction is treated as a single "unit", which either succeeds completely, or fails completely.
Have a look at this example (CPython 3.6.8):
>>> def new_iterator(): return (1/(i-2) for i in range(5))
>>> L = []
>>> L[:] = new_iterator()
Traceback (most recent call last):
...
ZeroDivisionError: division by zero
>>> L
[]
The slice assignment failed because of the exception (i == 2 makes 1/(i - 2) raise) and the list was left unchanged. Hence, the slice assignment operation is atomic.
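The starting list doesn't even have to be empty to see this; assuming the same new_iterator as above, pre-existing contents survive the failed assignment:

>>> L = ['a', 'b']
>>> L[:] = new_iterator()
Traceback (most recent call last):
...
ZeroDivisionError: division by zero
>>> L
['a', 'b']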
Now, the same example with extend:
>>> L.extend(new_iterator())
Traceback (most recent call last):
...
ZeroDivisionError: division by zero
>>> L
[-0.5, -1.0]
When the exception was raised, the first two elements had already been appended to the list. The extend operation is not atomic, since a failure does not leave the list unchanged.
Should the extend operation be atomic or not? Frankly, I have no idea, but as written in @wim's answer, the real issue is that the behavior is not clearly stated in the documentation (and worse, the documentation asserts that extend is equivalent to the slice assignment, which is not true in the reference implementation).
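If atomicity matters for your code, one way to get it is to buffer the iterator into a temporary list before touching the target. A minimal sketch (the name extend_atomic is my own, not a standard API):

def extend_atomic(lst, iterable):
    # Drain the iterable into a temporary buffer first; if this raises,
    # lst has not been mutated yet.
    buffered = list(iterable)
    # Mutate lst only once the whole iterable was consumed successfully.
    lst.extend(buffered)

With this wrapper, the ZeroDivisionError example above leaves L empty, just like slice assignment.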
Is Python list.extend(iterator) guaranteed to be lazy?
No. On the contrary, it's documented that
l.extend(iterable)
is equivalent to
l[len(l):] = iterable
In CPython, such a slice assignment will first convert a generator on the right-hand side into a list anyway (see here), i.e. it consumes the iterable all at once.
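You can see that with the self-referential iterator from the question. Assuming the l and iterator definitions from above, every membership check runs while the buffered list is still being built, so every row prints as all-False:

l = []
l[len(l):] = iterator(l)   # the generator is drained into a list first
# prints [False, False, False, False, False] five times
# l is now [0, 1, 2, 3, 4]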
The example shown in your question, strictly speaking, contradicts the documentation. I've filed a documentation bug, but it was promptly closed by Raymond Hettinger.
As an aside, there are less convoluted ways to demonstrate the discrepancy. Just define a failing generator:
def gen():
    yield 1
    yield 2
    yield 3
    uh-oh  # undefined names: raises NameError when the generator resumes past the third yield
Now L.extend(gen()) will modify L, but L[:] = gen() will not.
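For completeness, a sketch of that difference in a session (the NameError comes from the undefined name uh in the generator body):

>>> L = []
>>> L.extend(gen())
Traceback (most recent call last):
...
NameError: name 'uh' is not defined
>>> L
[1, 2, 3]
>>> L = []
>>> L[:] = gen()
Traceback (most recent call last):
...
NameError: name 'uh' is not defined
>>> L
[]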