
Two functions, One generator

Tags: python

I have two functions which both take iterators as inputs. Is there a way to write a generator that I can supply to both functions, without a reset or a second pass? I want to do one pass over the data but supply the output to two functions. Example:

def my_generator(data):
    for row in data:
        yield row

gen = my_generator(data)
func1(gen)
func2(gen)

I know I could have two different generator instances, or reset in between functions, but I was wondering if there is a way to avoid doing two passes over the data. Note that func1/func2 themselves are NOT generators; it would be nice if they were, because I could then build a pipeline.

The point here is to try and avoid a second pass over the data.

asked by bcollins on Feb 14 '16 at 18:02




1 Answer

You can either cache the generator's results in a list, or reset the generator before passing data to func2. The problem is that two separate loops mean iterating over the data twice, so you either load the data again and create a new generator, or you cache the entire result.
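As a minimal sketch of the caching option, assuming the data fits in memory: the bodies of func1 and func2 below are hypothetical stand-ins (a sum and a maximum), since the question does not say what they do.

```python
def func1(rows):
    # Hypothetical consumer: adds up all rows.
    return sum(rows)

def func2(rows):
    # Hypothetical consumer: finds the largest row.
    return max(rows)

def my_generator(data):
    for row in data:
        yield row

data = [3, 1, 4, 1, 5]

# Passing the same generator twice does not work: the first consumer drains it.
gen = my_generator(data)
func1(gen)
leftover = list(gen)  # nothing remains for a second consumer

# Caching: a single pass over the source, stored in a list.
cached = list(my_generator(data))
total = func1(cached)
largest = func2(cached)
```

The list can be iterated as many times as needed, at the cost of holding every row in memory at once.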

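Solutions like itertools.tee do give you two iterators from a single pass over the source, but tee has to buffer every item that one iterator has consumed and the other has not. Because func1 runs to completion before func2 starts, the whole dataset ends up in that buffer, which is basically the same as caching it in a list. It is syntactic sugar, but it won't change the situation in the background.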
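A small sketch of how itertools.tee behaves when one consumer finishes before the other starts. The `pulled` list is only there for illustration, to record how often the source is read, and `list(...)` stands in for func1 and func2.

```python
from itertools import tee

pulled = []  # illustration only: records each row drawn from the source

def my_generator(data):
    for row in data:
        pulled.append(row)
        yield row

gen_a, gen_b = tee(my_generator(range(5)))

# Stand-in for func1: drains the source. tee keeps a copy of every row
# in memory because gen_b has not consumed any of them yet.
first = list(gen_a)

# Stand-in for func2: served entirely from tee's internal buffer.
second = list(gen_b)
```

The source is read only once, but all rows are held in memory between the two calls, so for sequential consumers this offers no memory advantage over a list.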

If the data is too big to cache, you have to merge func1 and func2 into a single loop:

for a in gen:
    f1(a)  # per-item version of func1
    f2(a)  # per-item version of func2

In practice it can be a good idea to design code like this, so that you have full control over the iteration process and can compose maps and filters over a single iterator.
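One way to structure the merged loop, as a rough sketch: each function is rewritten as an object that is updated one row at a time. The names RunningSum, RunningMax and feed_all are hypothetical and not part of the question, and the even-number filter is just an example of composing a filter in front of the loop.

```python
class RunningSum:
    """Hypothetical per-row replacement for a func1 that sums rows."""
    def __init__(self):
        self.total = 0

    def update(self, row):
        self.total += row


class RunningMax:
    """Hypothetical per-row replacement for a func2 that finds the maximum."""
    def __init__(self):
        self.largest = None

    def update(self, row):
        if self.largest is None or row > self.largest:
            self.largest = row


def feed_all(rows, *consumers):
    # Single pass: every row is handed to each consumer, then discarded.
    for row in rows:
        for consumer in consumers:
            consumer.update(row)


def my_generator(data):
    for row in data:
        yield row


summer, maxer = RunningSum(), RunningMax()
# A lazy filter composed on top of the generator; nothing is cached.
evens = (row for row in my_generator(range(10)) if row % 2 == 0)
feed_all(evens, summer, maxer)
```

Only one row is in flight at any time, so memory use stays constant regardless of how large the data is.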

answered by Nicolas Heimann on Sep 22 '22 at 08:09
