I'm trying to set up a "processing pipeline" that reads data from a data source and applies a sequence of operators (using generators) to each item as it is read.
Here is some sample code that demonstrates the issue:
def reader():
    yield 1
    yield 2
    yield 3

def add_1(val):
    return val + 1

def add_5(val):
    return val + 5

def add_10(val):
    return val + 10

operators = [add_1, add_5, add_10]

def main():
    vals = reader()
    for op in operators:
        vals = (op(val) for val in vals)
    return vals

print(list(main()))
Desired: [17, 18, 19]
Actual: [31, 32, 33]
Python does not seem to save the value of op each time through the for loop, so it ends up applying the third function at every stage. Is there a way to "bind" the actual operator function to the generator expression each time through the for loop?
I could get around this trivially by changing the generator expression in the for loop to a list comprehension, but since the actual data is much larger, I don't want to be storing it all in memory at any one point.
You can force the variable to be bound by creating the generator inside a new function, e.g.:
def map_operator(operator, iterable):
    # closure value of operator is now separate for each generator created
    return (operator(item) for item in iterable)

def main():
    vals = reader()
    for op in operators:
        vals = map_operator(op, vals)
    return vals
However, map_operator is pretty much identical to the map builtin (in Python 3.x, where map returns a lazy iterator rather than a list). So just use that instead.
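For completeness, here is a minimal sketch of the map-based version, reusing reader and the add_* functions from the question as stand-ins for the real data source and operators. The reason it works is that op is passed to map as an argument on each iteration, so every stage holds its own operator. In the original generator expression, op is only looked up when the generator is finally consumed, and by then the loop has finished and op refers to add_10.

```python
def reader():
    # Stand-in for the real data source from the question.
    yield 1
    yield 2
    yield 3

def add_1(val):
    return val + 1

def add_5(val):
    return val + 5

def add_10(val):
    return val + 10

operators = [add_1, add_5, add_10]

def main():
    vals = reader()
    for op in operators:
        # op is evaluated right here and handed to map(), so each stage
        # keeps its own operator. In Python 3, map() is lazy, so no
        # intermediate list is built.
        vals = map(op, vals)
    return vals

result = list(main())
print(result)  # [17, 18, 19]
```

Nothing is computed until the final iterator is consumed (here by list), so items still flow through the pipeline one at a time.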