
Yielding a value from a coroutine in Python, a.k.a. convert callback to generator

I'm new to Python and functional programming. I'm using Python 2.7.6.

I'm using the Tornado framework to make async network requests. From what I learned about functional programming, I want my data to stream through my code by using generators. I have done most of what I need using generators and transforming the data as they stream through my function calls.

At the very end of my stream, I want to make a REST request for some data. I have one for-loop just before I submit my data to Tornado, to initiate the pull, and then send the HTTP request. The HTTP client provided by Tornado takes a callback function as an option, and always returns a Future, which is actually a Tornado Future object and not the official Python Future.

My problem is that since I'm now using generators to pull my data through my code, I no longer want to use the callback function. My reasoning for this is that after I get my data back from the callback, my data is now being pushed through my code, and I can no longer make use of generators.

My goal is to create an interface that appears like so:

urls = (...generated urls...)
responses = fetch(urls)

Where responses is a generator over the completed urls.

What I attempted, among many things, was to convert the results from the callback into a generator. I'm far from a working implementation, for reasons I explain below, but I wanted my fetch function to look something like this:

def fetch(urls):
    def url_generator():
        while True:
            val = yield
            yield val

    @curry
    def handler(gen, response):
        gen.send(response)

    gen = url_generator()

    for u in urls:
        http.fetch(u, callback=handler(gen))

    return gen

I simplified the code and syntax to focus on the problem, but I figured this was going to work fine. My strategy was to define a coroutine/generator and then send the responses to it as I receive them.

What I'm having the most trouble with is the coroutine/generator itself. Even when I define a generator in the above manner and run the following, I get an infinite loop, which is one of my main problems.

def gen():
    while True:
        val = yield
        print 'val', val
        yield val
        print 'after', val
        break

g = gen()
g.send(None)
g.send(10)

for e in g:
    print e

This prints val 10 and then after 10 inside the coroutine, as expected with the break in place, but the for-loop never receives the value 10 and prints nothing. If I remove the break, I get an infinite loop:

val None
None
after None
None
val None
None
after None
None
...

If I remove the for-loop, the coroutine only prints val 10 and then waits on the second yield, which is what I expect. However, iterating over it with the for-loop never produces that value.

Similarly, if I remove the for-loop and replace it with print next(g), then I get a StopIteration error, which I assume means I called next on a generator that had no more values.

Anywho, I am at a complete loss while I plunge into more depth on Python. I figure this is such a common situation in Python that somebody knows a great approach. I searched for 'convert callback into generator' and such, but didn't have much luck.

On another note, I could possibly yield each future from the http request, but I didn't have much luck "waiting" on the yield for the future to complete. I read a lot about 'yield from', but it seems to be Python 3 specific and Tornado doesn't seem to work on Python 3 yet.

Thanks for viewing, and thanks for any help you can provide.

Joe asked Aug 07 '15 04:08


1 Answer

Tornado works great on Python 3.

The problem with your simplified code above is that this isn't doing what you expect:

val = yield

You expect the generator to pause there (blocking your for-loop) until some other function calls g.send(value), but that's not what happens. Instead, the code behaves like:

val = yield None

So the for-loop receives None values as fast as it can process them. After it receives each None, it implicitly calls g.next(), which is the same as g.send(None). So, your code is equivalent to this:

def gen():
    while True:
        val = yield None
        print 'val', val
        yield val
        print 'after', val

g = gen()
g.send(None)
g.send(10)

while True:
    try:
        e = g.send(None)
        print e
    except StopIteration:
        break

Reading this version of the code, where the implicit behaviors are made explicit, I hope it's clear why it's just generating None in an infinite loop.
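To see the same effect without the prints, here is a minimal sketch written so that it also runs on Python 3. The name echo is only illustrative, and the loop is cut off after four items because the generator never finishes by itself:

```python
def echo():
    """Coroutine-style generator mirroring the question's gen(), minus the prints."""
    while True:
        val = yield        # identical to `val = yield None`
        yield val          # hands back whatever was sent in

g = echo()
primed = next(g)           # runs to the first bare yield, which produces None
echoed = g.send(10)        # val becomes 10, then the second yield produces 10

# A for-loop only ever calls next(g), which behaves like g.send(None), so from
# here on each bare yield produces None and each `yield val` echoes that None.
seen = []
for item in g:
    seen.append(item)
    if len(seen) == 4:     # stop manually; the generator would go on forever
        break

print(primed, echoed, seen)
```

The only value the generator ever hands out other than None is the 10 returned by g.send(10) itself; nothing in the for-loop can wait for a later send.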

What you need is some way for one function to add items to the tail of a queue, while another function blocks waiting for items and pulls them off the head of the queue when they're ready. Starting in Tornado 4.2 we have exactly that:

http://www.tornadoweb.org/en/stable/queues.html

The web spider example is close to what you want to do; I'm sure you can adapt it:

http://www.tornadoweb.org/en/stable/guide/queues.html
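As a rough illustration of the queue idea only, the sketch below uses the standard library's queue.Queue and threads rather than tornado.queues.Queue, so it is not the Tornado API. The fake_fetch function is a hypothetical stand-in for a callback-style HTTP call, and the URLs are made up:

```python
import queue
import threading

def fake_fetch(url, callback):
    """Hypothetical stand-in for a callback-style HTTP client call."""
    worker = threading.Thread(target=lambda: callback("response for " + url))
    worker.start()

def fetch(urls):
    """Yield responses in completion order as the callbacks deliver them."""
    results = queue.Queue()
    pending = 0
    for url in urls:
        fake_fetch(url, callback=results.put)   # producer: the callback enqueues
        pending += 1
    for _ in range(pending):
        yield results.get()                     # consumer: block until one is ready

# Sorted only to make the printed order deterministic.
responses = sorted(fetch(["/a", "/b", "/c"]))
print(responses)
```

This gives the responses = fetch(urls) shape from the question, but the blocking get() is only safe here because the callbacks run on other threads. Inside a single-threaded Tornado IOLoop a blocking call would stall the loop, which is why the Tornado queue's get() returns a Future that a coroutine yields on instead.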

A. Jesse Jiryu Davis answered Oct 02 '22 13:10