
Nesting generator expression calling a dynamically referenced function

I'm seeing some odd behavior that I can't explain when nesting generator expressions in a loop in Python 3, where each generator expression calls a function that is referenced dynamically through the loop variable.

Here is a very simplified case reproducing the problem:

double = lambda x: x * 2
triple = lambda x: x * 3
processors = [double, triple]

data = range(3)
for proc in processors:
    data = (proc(i) for i in data)

result = list(data)
print(result)
assert result == [0, 6, 12]  # fails: result is actually [0, 9, 18]

In this case, I expected each number to be multiplied by 6 (triple(double(x))), but in reality triple(triple(x)) is called. It's more or less clear to me that proc points to triple when the generator expression is run, regardless of what it pointed to when the generator expression was created.

So, (1) is this expected and can someone point to some relevant info in the Python docs or elsewhere explaining this?

and (2) Can you recommend another method of nesting generator expressions, where each level calls a dynamically provided callable?

EDIT: I am seeing it on Python 3.8.x, haven't tested with other versions

shevron asked Jan 29 '26 06:01



2 Answers

This is a result of two things:

  • Generators are lazily evaluated, so the functions are only called when the generator is consumed,
  • Names are resolved at evaluation time, not when the generator is created.

So at the time you consume the generator with list(data), the name proc refers to the function triple. Both generators call whatever function is bound to the name proc at that moment, so you get triple twice.
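The late lookup described above can be seen in isolation with a single generator expression. This is only a minimal sketch; scale is a hypothetical name introduced for the illustration and does not appear in the question.

```python
# `scale` is a hypothetical name used only for this illustration.
scale = lambda x: x * 2

# Creating the generator calls nothing yet; only range(3) is evaluated here.
gen = (scale(i) for i in range(3))

# Rebind the name before the generator is consumed.
scale = lambda x: x * 10

# `scale` is looked up now, once per item, so the second lambda is used.
late_bound = list(gen)
print(late_bound)  # [0, 10, 20]
```

The same thing happens in the loop from the question: by the time list(data) runs, proc has already been rebound to triple.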

Replacing the generator expression with data = map(proc, data) works because map is a function: when you pass proc as an argument, map receives the value of proc at the time it is called, which is inside the loop, while proc still refers to the double function.

kaya3 answered Jan 31 '26 22:01



Yes, it's expected, and you got the reason right.

As generators are lazy, proc(i) gets evaluated only when a value is requested, and that is also when the names proc and i are looked up. By the time you finally request values, proc is already triple, so that's what gets used.

In this particular case, data = map(proc, data) does the job. It works because map captures and remembers the proc as it was when you called map.
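For reference, here is the question's loop with the map-based line swapped in. It is a sketch that reuses the names from the question; mapped is a hypothetical variable name added for the illustration.

```python
double = lambda x: x * 2
triple = lambda x: x * 3
processors = [double, triple]

data = range(3)
for proc in processors:
    # map() receives the current value of proc as an argument,
    # so rebinding proc on the next iteration does not affect it.
    data = map(proc, data)

mapped = list(data)  # still lazy until this point
print(mapped)  # [0, 6, 12]
```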

You could do the same with a generator function. I tried with a generator expression like

data = (p(i) for p in [proc] for i in data)

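but it failed with ValueError: generator already executing. Only the outermost iterable, [proc], is evaluated when the expression is created; the inner data is looked up lazily, and by then it refers to the last generator itself. This worked, though: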

data = (lambda proc: (proc(i) for i in data))(proc)
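The generator-function variant mentioned in this answer could look roughly like the sketch below. The names apply_lazily and chained are hypothetical, chosen only for the illustration; the key point is that the arguments are bound when the generator function is called, even though its body runs later.

```python
def apply_lazily(func, items):
    """Yield func(item) for each item; func and items are bound at call time."""
    for item in items:
        yield func(item)


double = lambda x: x * 2
triple = lambda x: x * 3
processors = [double, triple]

data = range(3)
for proc in processors:
    # The current proc and data are passed in as arguments here,
    # so later rebinding of either name has no effect on this generator.
    data = apply_lazily(proc, data)

chained = list(data)
print(chained)  # [0, 6, 12]
```

This does the same job as the immediately invoked lambda above, with the captured values made explicit as parameters.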
Manuel answered Jan 31 '26 20:01
