Python Generator "chain" in a for loop

I'm trying to set up a "processing pipeline" for data that I'm reading in from a data source, applying a sequence of operators (using generators) to each item as it is read.

Here is some sample code that demonstrates the issue:

def reader():
    yield 1
    yield 2
    yield 3

def add_1(val):
    return val + 1

def add_5(val):
    return val + 5

def add_10(val):
    return val + 10

operators = [add_1, add_5, add_10]

def main():
    vals = reader()

    for op in operators:
        vals = (op(val) for val in vals)

    return vals

print(list(main()))

Desired: [17, 18, 19]
Actual: [31, 32, 33]

Python seems to not be saving the value of op each time through the for loop, so it instead applies the third function each time. Is there a way to "bind" the actual operator function to the generator expression each time through the for loop?
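The behaviour described above can be reproduced without a loop. The following is a minimal sketch (the names make_gen and offset are hypothetical, chosen only for illustration) showing that a generator expression reads a free variable when it is consumed, not when it is created:

```python
def make_gen():
    offset = 1
    # The body "x + offset" is not evaluated here; only the iterable [1, 2, 3] is.
    gen = (x + offset for x in [1, 2, 3])
    # Rebinding offset before the generator is consumed changes what it will see.
    offset = 100
    return gen

result = list(make_gen())
print(result)  # [101, 102, 103]
```

In the loop from the question, op plays the role of offset: by the time list() consumes the chained generators, op refers to add_10 for all three of them.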

I could get around this trivially by changing the generator expression in the for loop to a list comprehension, but since the actual data is much larger, I don't want to be storing it all in memory at any one point.

asked Jan 25 '16 at 14:01 by gtback


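1 Answer

The body of a generator expression is evaluated lazily, so op is only looked up when the generator is consumed. By then the loop has finished and op refers to the last operator. You can force the variable to be bound by creating the generator in a new function, so that each generator gets its own binding through the function argument. For example: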

def map_operator(operator, iterable):
    # closure value of operator is now separate for each generator created
    return (operator(item) for item in iterable)

def main():
    vals = reader()
    for op in operators:
        vals = map_operator(op, vals)   
    return vals

However, map_operator is pretty much identical to the map builtin in Python 3.x, which is also lazy, so just use that instead. (In Python 2, map builds a full list; itertools.imap is the lazy equivalent.)
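As a rough sketch of what that looks like with the code from the question (assuming Python 3, where map returns a lazy iterator), the loop could be written as:

```python
def reader():
    yield 1
    yield 2
    yield 3

def add_1(val):
    return val + 1

def add_5(val):
    return val + 5

def add_10(val):
    return val + 10

operators = [add_1, add_5, add_10]

def main():
    vals = reader()
    for op in operators:
        # op is passed to map as an argument, so the current function is
        # captured immediately; no items are read from vals yet.
        vals = map(op, vals)
    return vals

result = list(main())
print(result)  # [17, 18, 19]
```

Nothing is pulled from reader() until list() consumes the final iterator, so only one item is in flight at a time.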

answered Sep 20 '22 at 09:09 by Dunes