Square braces not required in list comprehensions when used in a function

I submitted a pull request with this code:

my_sum = sum([x for x in range(10)])

One of the reviewers suggested this instead:

my_sum = sum(x for x in range(10))

(the difference is just that the square braces are missing).

I was surprised that the second form seems to be identical. But when I tried to use it in other contexts where the first one works, it fails:

y = x for x in range(10)
        ^ SyntaxError !!!

Are the two forms identical? Is there any important reason for why the square braces aren't necessary in the function? Or is this just something that I have to know?

Matt Fenwick, asked Jun 12 '12 14:06


1 Answer

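This is a generator expression. When a generator expression is the only argument of a function call, the call's own parentheses are enough, which is why sum(x for x in range(10)) works. Everywhere else, including the standalone case, you have to wrap it in parentheses: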

y = (x for x in range(10))

and y becomes a generator. You can iterate over generators, so it works where an iterable is expected, such as the sum function.
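The parentheses rule can be checked directly. The following is a minimal sketch, assuming Python 3; the names total_a, total_b and bad_source are made up for illustration. It shows that the call's parentheses suffice only when the generator expression is the sole argument, and that adding a second argument (here the start value of sum) requires an extra pair.

```python
# Sole argument: the call's parentheses double as the generator's.
total_a = sum(x for x in range(10))

# With a second argument (the start value), the generator expression
# needs its own parentheses.
total_b = sum((x for x in range(10)), 100)

# Without them the source is rejected at compile time.
bad_source = "sum(x for x in range(10), 100)"
try:
    compile(bad_source, "<example>", "eval")
    compiled = True
except SyntaxError:
    compiled = False

print(total_a, total_b, compiled)  # 45 145 False
```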

Usage examples and pitfalls:

>>> y = (x for x in range(10))
>>> y
<generator object <genexpr> at 0x0000000001E15A20>
>>> sum(y)
45

Be careful when keeping generators around: you can only iterate over them once. After the session above, y is exhausted, so calling sum again gives this:

>>> sum(y)
0

So if you pass a generator where actually a list or a set or something similar is expected, you have to be careful. If the function or class stores the argument and tries to iterate over it multiple times, you will run into problems. For example consider this:

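from functools import reduce  # reduce is not a builtin in Python 3

def foo(numbers):
    s = sum(numbers)  # first pass over numbers
    p = reduce(lambda x, y: x * y, numbers, 1)  # second pass over numbers
    print("The sum is:", s, "and the product:", p)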

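It does not raise an error if you hand it a generator, but the product is silently wrong, because sum has already exhausted the generator by the time reduce runs: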

>>> foo(x for x in range(1, 10))
The sum is: 45 and the product: 1

You can easily get a list from the values a generator produces:

>>> y = (x for x in range(10))
>>> list(y)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

You can use this to fix the previous example:

>>> foo(list(x for x in range(1, 10)))
The sum is: 45 and the product: 362880

However, keep in mind that if you build a list from a generator, every value has to be stored at once. This can use a lot more memory when there are many items.
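If storing every value is a concern, another option is to read the iterable only once. The sketch below is a hypothetical variant of foo (the name sum_and_product is not from the answer above) that computes both results in a single pass, so it works for lists and generators alike without building a list.

```python
def sum_and_product(numbers):
    """Return (sum, product) of numbers, reading the iterable only once."""
    total = 0
    product = 1
    for n in numbers:
        total += n
        product *= n
    return total, product


# A generator is fine here because it is consumed exactly once.
s, p = sum_and_product(x for x in range(1, 10))
print("The sum is:", s, "and the product:", p)
# The sum is: 45 and the product: 362880
```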

Why use a generator in your situation?

The much lower memory consumption is the reason why sum(generator expression) is preferable to sum(list): the generator version only has to hold one value at a time, while the list variant has to store all N values before sum even starts. So prefer a generator expression wherever the values are consumed exactly once and none of the pitfalls above apply.
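A rough way to see the difference is sketched below, assuming CPython. sys.getsizeof reports only the size of the container object itself (for the list, its array of references, not the integers it points to), so treat the numbers as an indication rather than an exact measurement. The names n, as_list and as_gen are illustrative.

```python
import sys

n = 100_000
as_list = [x for x in range(n)]   # all n values exist at once
as_gen = (x for x in range(n))    # values are produced on demand

list_bytes = sys.getsizeof(as_list)  # grows with n
gen_bytes = sys.getsizeof(as_gen)    # small and independent of n

print("list:", list_bytes, "bytes; generator:", gen_bytes, "bytes")

# Both forms give the same total; the generator is exhausted afterwards.
same_total = sum(as_list) == sum(as_gen)
print(same_total)  # True
```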

mensi, answered Oct 24 '22 03:10