Unexpected results when comparing list comprehension with generator expression [duplicate]

I think I'm overlooking something simple, but I can't seem to figure out what exactly. Please consider the following code:

a = [2, 3, 4, 5]

lc = [ x for x in a if x >= 4 ] # List comprehension
lg = ( x for x in a if x >= 4 ) # Generator expression

a.extend([6,7,8,9])

for i in lc:
    print("{} ".format(i), end="")

for i in lg:
    print("{} ".format(i), end="")

I expected both for-loops to produce the same result, 4 5. However, the for-loop over the generator expression prints 4 5 6 7 8 9. I think it has something to do with when the list comprehension is evaluated (it is declared before the extend). But why is the result of the generator different, given that it is also declared before the list is extended? In other words, what is going on internally?

Psychotechnopath asked Oct 19 '19 18:10


3 Answers

Generators aren't evaluated until you call next() on them, which is what makes them useful, while list comprehensions are evaluated immediately.

So lc is already [4, 5] before the extend, and it is finished at that point.

lg, on the other hand, has not produced anything yet. It holds a reference to the list a and only walks through it when you ask for values. Since a is extended before you start printing, the generator sees the longer list and prints the extra numbers as well.

Check it out like this:

>>> a = [2, 3, 4, 5]
>>> lg = ( x for x in a if x >= 4 )
>>> next(lg)
4
>>> next(lg)
5
>>> a.extend([6,7,8,9])
>>> next(lg)
6

However, if you call next() one more time before the extend, you get StopIteration because the generator is exhausted at that point, and it stays exhausted even after the list grows.

>>> a = [2, 3, 4, 5]
>>> lg = ( x for x in a if x >= 4 )
>>> next(lg)
4
>>> next(lg)
5
>>> next(lg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> a.extend([6,7,8,9])
>>> next(lg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
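
One detail that may help here: the generator expression looks up a once, at the moment it is created, and keeps hold of that list object. In-place changes such as extend are therefore visible to it, while rebinding the name to a new list is not. A minimal sketch of the difference (the variable names below are made up for illustration):

```python
a = [2, 3, 4, 5]
lg_mutated = (x for x in a if x >= 4)
a.extend([6, 7, 8, 9])        # in-place change to the same list object
mutated_result = list(lg_mutated)

b = [2, 3, 4, 5]
lg_rebound = (x for x in b if x >= 4)
b = b + [6, 7, 8, 9]          # builds a NEW list and rebinds the name b
rebound_result = list(lg_rebound)

print(mutated_result)   # [4, 5, 6, 7, 8, 9]
print(rebound_result)   # [4, 5]
```

So it is the mutation of the original list, not the reassignment of a name, that the generator picks up.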
MyNameIsCaleb answered Oct 20 '22 22:10


what is going on internally?

Generators are inherently lazy.

[ x for x in a if x >= 4 ] is evaluated as soon as it is executed.

( x for x in a if x >= 4 ) only creates the generator when it executes. The loop itself is only evaluated when the generator is consumed in one of the many possible ways: calling next() manually, converting it to another iterable type (list, tuple, set, etc.), or iterating over it with a for loop.

The main advantage of generators being lazy is memory consumption. They do not need to store all the elements in memory, but only the current (or next, I should say) element.
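
A rough way to see the memory difference is to compare the size of the two objects with sys.getsizeof. This is only a sketch: the names and the value of n are arbitrary, and getsizeof reports the size of the container object itself, not of the integers it refers to.

```python
import sys

n = 100_000
squares_list = [x * x for x in range(n)]   # all n values are computed and stored now
squares_gen = (x * x for x in range(n))    # nothing has been computed yet

list_size = sys.getsizeof(squares_list)    # grows with n
gen_size = sys.getsizeof(squares_gen)      # small and independent of n

print(list_size > gen_size)                 # True

# The generator still yields the same values when it is finally consumed.
total_from_gen = sum(squares_gen)
print(total_from_gen == sum(squares_list))  # True
```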

DeepSpace answered Oct 20 '22 23:10


The generator expression is lazily evaluated, so when you get back the generator object, the loop in x for x in a if x >= 4 has not run yet.

The for-in loop internally calls the built-in next() function on the generator object for each iteration. Each next() call actually runs the code, and that code refers to the same list object, which by then contains the values you added after the generator was created.

>>> a = [2, 3, 4, 5]
>>> lg = ( x for x in a if x >= 4)
# evaluates the code and returns the first value
>>> next(lg)
4
>>> next(lg)
5
# if new values are added to the list here,
# the generator will return them

A list comprehension, in contrast, runs its loop to completion right away and stores every value in a new list, using the values that were in a at that moment. No generator object is involved.

The built-in list() behaves the same way when you hand it an iterable: it consumes the iterable immediately and builds a list from the values it returns. So list(x for x in a if x >= 4) gives the same result as the list comprehension.
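
If the goal is the result the question expected (4 5), one option is to freeze the values before the list changes, either by consuming the generator right away or by letting it iterate over a copy. A small sketch, with variable names invented for the example:

```python
a = [2, 3, 4, 5]

snapshot = list(x for x in a if x >= 4)            # generator is consumed immediately
lazy_over_copy = (x for x in tuple(a) if x >= 4)   # still lazy, but over a private copy

a.extend([6, 7, 8, 9])

copy_result = list(lazy_over_copy)
print(snapshot)      # [4, 5]
print(copy_result)   # [4, 5]
```

The second variant keeps the filtering lazy but pays for a full copy of a up front, so it only makes sense when the later mutation is the actual concern.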

On the other hand, if you simply execute the generator expression, you just get back the generator object, which is both an iterable and an iterator. So you either need to call next() on it to run the code and get a value, or use it in a for ... in loop, which does that implicitly.

But remember: once you have exhausted the generator object (it has raised StopIteration), adding a new value to the list won't make next() return it, because a generator can be consumed only once.
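
To make the "implicitly" part concrete, here is roughly what a for loop over the generator does, written out by hand. It is a simplified sketch rather than the interpreter's actual implementation, and the names it, value and seen are only illustrative.

```python
a = [2, 3, 4, 5]
lg = (x for x in a if x >= 4)
a.extend([6, 7, 8, 9])

seen = []
it = iter(lg)              # for a generator, iter() returns the generator itself
while True:
    try:
        value = next(it)   # runs the generator until it produces its next value
    except StopIteration:
        break              # a for loop catches this exception and simply ends
    seen.append(value)

print(seen)  # [4, 5, 6, 7, 8, 9]
```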

>>> a = [2, 3, 4, 5]
>>> lg = ( x for x in a if x >= 4)
>>> next(lg)
4
>>> next(lg)
5
>>> a.append(9)
>>> next(lg)
9
>>> next(lg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
# lg is consumed
>>> a.append(10)
>>> next(lg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
Fullstack Guy answered Oct 20 '22 21:10