
Can I alias expressions inside Python list comprehensions to prevent them being evaluated multiple times?

I find myself often wanting to write Python list comprehensions like this:

nearbyPoints = [(n, delta(n,x)) for n in allPoints if delta(n,x)<=radius]

That hopefully gives some context as to why I would want to do this, but there are also cases where multiple values need to be computed/compared per element:

newlist = [(x,f(x),g(f(x))) for x in bigList if f(x)<p and g(f(x))<q]

So I have two questions:

  1. Will all those functions be evaluated multiple times, or is the result cached? Does the language specify this, or is it implementation-specific? I'm using 2.6 now, but would 3.x be different?
  2. Is there a neater way to write it? Sometimes f and g are long expressions, and duplication is error-prone and looks messy. I would really like to be able to write this:
newList = [(x,a=f(x),b=g(a)) for x in bigList if a<p and b<q]

but that doesn't work. Is there a good reason for not supporting this syntax? Can it be done via something like this? Or would I just have to use multiple listcomps or a for-loop?

krashalot asked Jan 30 '11 00:01



2 Answers

Update: The walrus operator := was introduced in Python 3.8. It assigns to a variable and also evaluates to the assigned value, as described in @MartijnVanAttekum's answer. I'd recommend waiting a year or so before using it in projects, because Python 3.6 and 3.7 are still quite mainstream, but it's a nicer solution than my alias suggestion below.
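As a rough sketch of what the walrus version could look like (assuming Python 3.8 or later; `costly` and the `calls` list are hypothetical names used only to count invocations):

```python
# Requires Python 3.8+ for the := operator.
calls = []  # hypothetical helper: records each argument so calls can be counted

def costly(x):
    # Stand-in for an expensive computation.
    calls.append(x)
    return x * x + x

data = [4, 7, 3, 1]

# The filter clause runs first and binds y; the output expression reuses it,
# so costly() is called only once per element.
result = [(x, y) for x in data if (y := costly(x)) > 3]

print(result)      # [(4, 20), (7, 56), (3, 12)]
print(len(calls))  # 4
```

Note that the name bound by := lives in the scope containing the comprehension, not inside it, so `y` remains visible afterwards.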

I have a hack to create aliases inside list/dict comprehensions: the for alias_name in [alias_value] trick. For example, suppose you have this expensive function:

def expensive_function(x):
    print("called the very expensive function, that will be $2")
    return x*x + x

And some data:

data = [4, 7, 3, 7, 2, 3, 4, 7, 3, 1, 1, 1]

And then you want to apply the expensive function over each element, and also filter based on it. What you do is:

result = [
    (x, expensive)
    for x in data
    for expensive in [expensive_function(x)] #alias
    if expensive > 3
]

print(result)

The second for iterates over a list of size 1, which effectively makes it an alias. The output shows that the expensive function is called 12 times, exactly once for each data element. Nevertheless, the result of the function is used (at most) twice: once for the filter and possibly once for the output.

Please always lay out such comprehensions over multiple lines as I did, and append #alias to the line where the alias is. With an alias, the comprehension gets quite complicated, and you should help future readers of your code understand what you're doing. This is not Perl, you know ;).
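The same trick can be chained for the two-value case from the question. The following is only a sketch: `f`, `g`, `p`, `q` and `bigList` are given made-up definitions here, and the `f_calls`/`g_calls` lists exist purely to count invocations.

```python
f_calls, g_calls = [], []  # hypothetical counters, not part of the question

def f(x):
    f_calls.append(x)
    return x + 1

def g(y):
    g_calls.append(y)
    return y * 2

bigList = [1, 2, 3, 4, 5]
p, q = 5, 7

newList = [
    (x, a, b)
    for x in bigList
    for a in [f(x)]  # alias
    if a < p
    for b in [g(a)]  # alias
    if b < q
]

print(newList)                      # [(1, 2, 4), (2, 3, 6)]
print(len(f_calls), len(g_calls))   # 5 3
```

Placing `if a < p` before the second alias means g is skipped for elements that already fail the first test.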

For completeness, the output:

called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
[(4, 20), (7, 56), (3, 12), (7, 56), (2, 6), (3, 12), (4, 20), (7, 56), (3, 12)]

Code: http://ideone.com/7mUQUt

Herbert answered Oct 13 '22 00:10


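Regarding #1: yes, they will be evaluated multiple times. Python does not cache function results automatically, in 2.x or 3.x, so every f(x) or g(f(x)) written in the comprehension is a separate call.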
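One way to check this yourself is to count calls. This is a minimal sketch with a made-up `f` and a hypothetical `calls` list used as a counter:

```python
calls = []  # hypothetical counter: one entry per call to f

def f(x):
    calls.append(x)
    return x * 2

bigList = [1, 2, 3, 4]
p = 5

# f(x) appears twice: once in the filter and once in the output expression.
newlist = [(x, f(x)) for x in bigList if f(x) < p]

print(newlist)     # [(1, 2), (2, 4)]
print(len(calls))  # 6: four filter calls plus two output calls
```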

Regarding #2, the way to do it is to calculate and filter in separate comprehensions:

Condensed version:

[(x,fx,gx) for (x,fx,gx) in ((x,fx,g(fx)) for (x,fx) in ((x,f(x)) for x in bigList) if fx < p) if gx<q]

Longer version expanded to make it easier to follow:

[(x,fx,gx) for (x,fx,gx) in
  ((x,fx,g(fx)) for (x,fx) in
     ((x,f(x)) for x in bigList)
  if fx < p)
if gx<q]

This calls f and g as few times as possible: f is called only once for each value in bigList, and g is never called for values where f(x) is not < p.

If you prefer, you can also get neater code by using intermediate variables:

a = ( (x,f(x)) for x in bigList )
b = ( (x,fx,g(fx)) for (x,fx) in a if fx<p )
results = [ c for c in b if c[2] < q ] # faster than writing out full tuples

a and b use generator expressions so that they don't have to actually instantiate lists, and are simply evaluated when necessary.
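To see the laziness and the call counts concretely, here is a small sketch of the same pipeline. The definitions of `f`, `g`, `p`, `q` and `bigList` are invented for illustration, and `f_calls`/`g_calls` are hypothetical counters:

```python
f_calls, g_calls = [], []  # hypothetical counters

def f(x):
    f_calls.append(x)
    return x * x

def g(y):
    g_calls.append(y)
    return y - 1

bigList = [1, 2, 3, 4]
p, q = 10, 5

a = ((x, f(x)) for x in bigList)
b = ((x, fx, g(fx)) for (x, fx) in a if fx < p)

# Nothing has run yet: generator expressions do no work until consumed.
calls_before = len(f_calls) + len(g_calls)

results = [c for c in b if c[2] < q]

print(calls_before)                # 0
print(results)                     # [(1, 1, 0), (2, 4, 3)]
print(len(f_calls), len(g_calls))  # 4 3
```

g runs only three times because the element with f(x) = 16 is dropped before g is reached.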

Amber answered Oct 13 '22 01:10