 

Is Python `list.extend(iterator)` guaranteed to be lazy?

Summary

Suppose I have an iterator that, as elements are consumed from it, performs some side effect, such as modifying a list. If I define a list l and call l.extend(iterator), is it guaranteed that extend will push elements onto l one-by-one, as elements from the iterator are consumed, as opposed to kept in a buffer and then pushed on all at once?

My experiments

I did a quick test in Python 3.7 on my computer, and list.extend seems to be lazy based on that test. (See code below.) Is this guaranteed by the spec, and if so, where in the spec is that mentioned?

(Also, feel free to criticize me and say "this is not Pythonic, you fool!"--though I would appreciate it if you also answer the question if you want to criticize me. Part of why I'm asking is for my own curiosity.)

Say I define an iterator that pushes onto a list as it runs:

l = []

def iterator(k):
  for i in range(5):
    print([j in k for j in range(5)])
    yield i

l.extend(iterator(l))

Here are examples of non-lazy (i.e. buffered) vs. lazy possible extend implementations:

def extend_nonlazy(l, iterator):
  l += list(iterator)

def extend_lazy(l, iterator):
  for i in iterator:
    l.append(i)

Results

Here's what happens when I run each of the two candidate implementations of extend.


Non-lazy:

l = []
extend_nonlazy(l, iterator(l))
# output
[False, False, False, False, False]
[False, False, False, False, False]
[False, False, False, False, False]
[False, False, False, False, False]
[False, False, False, False, False]

# l = [0, 1, 2, 3, 4]

Lazy:

l = []
extend_lazy(l, iterator(l))
[False, False, False, False, False]
[True, False, False, False, False]
[True, True, False, False, False]
[True, True, True, False, False]
[True, True, True, True, False]

My own experimentation shows that native list.extend seems to work like the lazy version, but my question is: does the Python spec guarantee that?
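One compact way to probe this (a sketch of my test above; it only observes CPython's behavior, which is exactly what I'm asking the spec to guarantee) is to have the generator record the target list's length at every yield. If extend appends one-by-one, the recorded lengths grow; if it buffers, they stay at zero:

```python
# Sketch: snapshot the target list's length from inside the generator to see
# whether list.extend appends elements as they are produced (observed CPython
# behavior, not a documented guarantee).

l = []
lengths_seen = []

def spying_iterator():
    for i in range(5):
        lengths_seen.append(len(l))  # how many items has extend pushed so far?
        yield i

l.extend(spying_iterator())
print(lengths_seen)  # [0, 1, 2, 3, 4] under CPython: one append per yield
print(l)             # [0, 1, 2, 3, 4]
```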

Asked Jul 25 '19 by Kye W Shi




2 Answers

I don't think the issue is lazy vs. non-lazy: in both slice assignment and list.extend, all the elements of the iterator are needed, and in the normal case they are consumed in a single pass. The issue you raised is more important: are these operations atomic or not? See the definition of "atomicity" in Wikipedia:

Atomicity guarantees that each transaction is treated as a single "unit", which either succeeds completely, or fails completely.

Have a look at this example (CPython 3.6.8):

>>> def new_iterator(): return (1/(i-2) for i in range(5))
>>> L = []
>>> L[:] = new_iterator()
Traceback (most recent call last):
...
ZeroDivisionError: division by zero
>>> L
[]

The slice assignment failed because of the exception (i == 2 => 1/(i - 2) raises an exception) and the list was left unchanged. Hence, the slice assignment operation is atomic.

Now, the same example with extend:

>>> L.extend(new_iterator())
Traceback (most recent call last):
...
ZeroDivisionError: division by zero
>>> L
[-0.5, -1.0]

When the exception was raised, the first two elements had already been appended to the list. The extend operation is not atomic, since a failure does not leave the list unchanged.

Should the extend operation be atomic or not? Frankly I have no idea about that, but as written in @wim's answer, the real issue is that it's not clearly stated in the documentation (and worse, the documentation asserts that extend is equivalent to the slice assignment, which is not true in the reference implementation).
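If you do want atomic behavior from extend, one workaround (a sketch; `atomic_extend` is a name I'm introducing here, not a standard function) is to materialize the iterator into a temporary list first, so a failure never touches the target:

```python
# Sketch: get atomic extend behavior by consuming the iterator into a
# temporary list first. If the iterator raises, the target list is untouched.

def atomic_extend(target, iterable):
    items = list(iterable)   # may raise here; target is still unmodified
    target.extend(items)     # extending from a plain list cannot fail midway

failing = (1 / (i - 2) for i in range(5))
L = []
try:
    atomic_extend(L, failing)
except ZeroDivisionError:
    pass
print(L)  # [] -- the failure left the list unchanged
```

This trades atomicity for an extra pass and extra memory, which is presumably why the built-in extend doesn't do it.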

Answered Sep 26 '22 by jferard


Is Python list.extend(iterator) guaranteed to be lazy?

No. On the contrary, it's documented that

l.extend(iterable)

is equivalent to

l[len(l):] = iterable

In CPython, such a slice assignment will first convert a generator on the right hand side into a list anyway (see here), i.e. it's consuming the iterable all at once.
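You can observe this all-at-once consumption directly (a sketch against CPython; the spec makes no promise either way): have a generator snapshot the target list at each yield, then assign it via a slice. The target stays empty the whole time the generator runs:

```python
# Sketch: slice assignment consumes the entire iterable before the target
# list changes (observed CPython behavior).

L = []
snapshots = []

def spy():
    for i in range(3):
        snapshots.append(list(L))  # copy of L at the moment of each yield
        yield i

L[:] = spy()
print(snapshots)  # [[], [], []] -- L was untouched while the generator ran
print(L)          # [0, 1, 2]
```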

The example shown in your question is, strictly speaking, contradicting the documentation. I've filed a documentation bug, but it was promptly closed by Raymond Hettinger.

As an aside, there are less convoluted ways to demonstrate the discrepancy. Just define a failing generator:

def gen():
    yield 1
    yield 2
    yield 3
    uh-oh  # deliberate NameError once the first three items are consumed

Now L.extend(gen()) will modify L, but L[:] = gen() will not.
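Here is a runnable version of that check (I've swapped the bare NameError for an explicit RuntimeError so the failure can be caught cleanly; the effect on the list is the same):

```python
# Sketch: demonstrate that extend and slice assignment disagree on what
# happens to the list when the iterable fails partway through (CPython).

def gen():
    yield 1
    yield 2
    yield 3
    raise RuntimeError("uh-oh")  # fails after producing three items

L = []
try:
    L.extend(gen())
except RuntimeError:
    pass
after_extend = list(L)
print(after_extend)  # [1, 2, 3] -- extend kept the partial results

L = []
try:
    L[:] = gen()
except RuntimeError:
    pass
print(L)  # [] -- slice assignment left the list unchanged
```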

Answered Sep 22 '22 by wim