Generator Comprehension different output from list comprehension?

Tags:

python

I get different output when using a list comprehension versus a generator comprehension. Is this expected behavior or a bug?

Consider the following setup:

all_configs = [
    {'a': 1, 'b':3},
    {'a': 2, 'b':2}
]
unique_keys = ['a','b']

If I then run the following code, I get:

print(list(zip(*( [c[k] for k in unique_keys] for c in all_configs))))
# prints: [(1, 2), (3, 2)]
# note the ( vs [
print(list(zip(*( (c[k] for k in unique_keys) for c in all_configs))))
# prints: [(2, 2), (2, 2)]

This is on Python 3.6.0:

Python 3.6.0 (default, Dec 24 2016, 08:01:42)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin

asked Mar 15 '17 09:03

Bas

3 Answers

In a list comprehension, expressions are evaluated eagerly. In a generator expression, they are only looked up as needed.

Thus, as the outer generator expression iterates over for c in all_configs, each inner generator expression refers to c[k] but only looks up c when it is consumed. Here that happens after zip has already unpacked the whole outer generator, so both inner generators see the last value of c. By contrast, the list comprehension is evaluated immediately, so it builds one list from the first value of c and another list from the second value of c.

Consider this small example:

>>> r = range(3)
>>> i = 0
>>> a = [i for _ in r]
>>> b = (i for _ in r)
>>> i = 3
>>> print(*a)
0 0 0
>>> print(*b)
3 3 3

When creating a, the interpreter built the list immediately, looking up the value of i as each element was evaluated. When creating b, the interpreter only set up the generator; it did not iterate over it or look up i. The print calls forced both objects to be consumed. a already existed as a full list in memory holding the old value of i, whereas b was only evaluated at that point, and when it looked up i, it found the new value.
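
One detail worth separating out is which names a generator expression resolves late. As a minimal sketch (the names data, offset and gen are illustrative, not from the question): the iterable of the leftmost for clause is evaluated when the generator expression is created, while every other name is looked up only when the generator is consumed.

```python
data = [1, 2, 3]
offset = 10

# The iterable of the leftmost `for` (data) is evaluated right here;
# `offset` is only looked up once the generator is actually consumed.
gen = (x + offset for x in data)

data = [100, 200]   # rebinding the name does not affect gen
offset = 0          # this new value is what gen will see

result = list(gen)
print(result)       # [1, 2, 3]
```

The generator still walks the original three-element list, but adds the rebound offset of 0 rather than 10.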

answered Oct 07 '22 19:10

TigerhawkT3

To see what's going on, replace c[k] with a function with a side effect:

def f(c,k):
    print(c,k)
    return c[k]
print("listcomp")
print(list(zip(*( [f(c,k) for k in unique_keys] for c in all_configs))))
print("gencomp")
print(list(zip(*( (f(c,k) for k in unique_keys) for c in all_configs))))

output:

listcomp
{'a': 1, 'b': 3} a
{'a': 1, 'b': 3} b
{'a': 2, 'b': 2} a
{'a': 2, 'b': 2} b
[(1, 2), (3, 2)]
gencomp
{'a': 2, 'b': 2} a
{'a': 2, 'b': 2} a
{'a': 2, 'b': 2} b
{'a': 2, 'b': 2} b
[(2, 2), (2, 2)]

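In the generator expression case, c is only looked up after the outer loop has completed, so it holds the last value it took in that loop.

In the list comprehension case, c is looked up immediately, while the outer loop is still on the current dict.

(Note also the order of the calls: a b a b for the list comprehension, because each inner list is built in full before the next one starts, versus a a b b for the generator expression, because zip pulls one item from each inner generator in turn.)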

Note that you can keep the lazy "generator" approach (without creating the temporary list) by passing c.get to map, which binds the current value of c at the moment the outer generator yields:

print(list(zip(*( map(c.get,unique_keys) for c in all_configs))))

In Python 3, map returns a lazy iterator instead of a list, but the result is still correct: [(1, 2), (3, 2)]
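
The same idea can be sketched with a small generator function instead of map. The helper name values_of is hypothetical. Because the current dict is passed as an argument, it is bound when the helper is called, even though the body only runs when zip consumes it.

```python
all_configs = [{'a': 1, 'b': 3}, {'a': 2, 'b': 2}]
unique_keys = ['a', 'b']

def values_of(config, keys):
    """Lazily yield config[key] for each key (hypothetical helper)."""
    for key in keys:
        yield config[key]

# values_of(c, unique_keys) is called while c is still the current dict,
# so each inner generator keeps its own config.
rows = (values_of(c, unique_keys) for c in all_configs)
columns = list(zip(*rows))
print(columns)  # [(1, 2), (3, 2)]
```

Unlike c.get, which returns None for a missing key, this variant raises KeyError, matching the behaviour of c[k] in the question.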

answered Oct 07 '22 19:10

Jean-François Fabre

This happens because the zip(*...) call evaluates the outer generator in full, and the outer generator returns two more generators, shown here with a print added to expose the value of c:

((c[k], print(c)) for k in unique_keys)

Evaluating the outer generator moved c to the second dict: {'a': 2, 'b':2}.

When these inner generators are then evaluated individually, they look up c, and since its value is now {'a': 2, 'b':2}, you get [(2, 2), (2, 2)] as the output.

Demo:

>>> def my_zip(*args):
...     print(args)
...     for arg in args:
...         print(list(arg))
...
>>> my_zip(*((c[k] for k in unique_keys) for c in all_configs))

Output:

# We have two generators now, which means the outer generator has already looped through all of `all_configs`.
(<generator object <genexpr>.<genexpr> at 0x104415c50>, <generator object <genexpr>.<genexpr> at 0x10416b1a8>)
[2, 2]
[2, 2]

The list comprehension, on the other hand, is evaluated right away, so it fetches the current value of c rather than its last value.
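
To make the timing concrete, here is a small sketch (make_outer, fresh and stale are illustrative names) that consumes the same nested generator expression in two ways: once draining each inner generator before the outer one advances, and once after the outer one has been exhausted, as zip(*...) does.

```python
all_configs = [{'a': 1, 'b': 3}, {'a': 2, 'b': 2}]
unique_keys = ['a', 'b']

def make_outer():
    # Same nested generator expression as in the question.
    return ((c[k] for k in unique_keys) for c in all_configs)

# Each inner generator is drained before the outer one moves on,
# so c still refers to the dict it was created for.
fresh = [list(inner) for inner in make_outer()]

# list(...) exhausts the outer generator first, as the * unpacking does;
# by then c refers to the last dict for every inner generator.
stale = [list(inner) for inner in list(make_outer())]

print(fresh)  # [[1, 3], [2, 2]]
print(stale)  # [[2, 2], [2, 2]]
```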

How to force it to use the correct value of `c`?

Use an inner function inside a generator function. The inner function remembers the value of c through a default argument, which is evaluated when the function is defined.

>>> def solve():
...     for c in all_configs:
...         def func(c=c):
...             return (c[k] for k in unique_keys)
...         yield func()
...
>>> list(zip(*solve()))
[(1, 2), (3, 2)]
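
A more compact variant of the same binding trick, offered only as a sketch: since the iterable of the leftmost for clause of a generator expression is evaluated when the expression is created, wrapping the current dict in a one-element tuple captures it at that moment. The name cfg is illustrative.

```python
all_configs = [{'a': 1, 'b': 3}, {'a': 2, 'b': 2}]
unique_keys = ['a', 'b']

columns = list(zip(*(
    # (c,) is evaluated as soon as the inner generator is created,
    # so cfg is bound to the dict that was current at that point.
    (cfg[k] for cfg in (c,) for k in unique_keys)
    for c in all_configs
)))
print(columns)  # [(1, 2), (3, 2)]
```

This stays fully lazy, but it is arguably harder to read than the explicit inner function, so it may be best reserved for short expressions.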

answered Oct 07 '22 18:10

Ashwini Chaudhary
