
Tornado blocking asynchronous requests

Using Tornado, I have a GET request that takes a long time: it makes many requests to another web service and processes the data, and it can take minutes to complete. I don't want this to block the entire web server from responding to other requests, which it currently does.

As I understand it, Tornado is single-threaded and executes each request synchronously, even though it handles them asynchronously (I'm still confused about that part). There are points in the long process that could serve as pause points to let the server handle other requests (a possible solution?). I'm running it on Heroku with a single worker, so I'm not sure how that translates into spawning a new thread or using multiprocessing, neither of which I have experience with in Python.

Here is what I'm trying to do: the client makes the GET call to start the process, then I issue another GET call every 5 seconds to check the status and update the page with new information (long polling would also work, but I run into the same issue). The problem is that starting the long process blocks all new GET requests (or new long-polling sessions) until it completes.

Is there an easy way to kick off this long GET call without blocking the entire web server? Is there anything I can put in the code to say, "pause, go handle pending requests, then continue"?

I need to initiate a GET request on ProcessHandler, and I then need to be able to keep querying StatusHandler while ProcessHandler is running.

Example:

import tornado.web

class StatusHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        self.render("status.html")

class ProcessHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        # Each functionN call below runs on the IOLoop thread and blocks
        # it until the call returns.
        self.updateStatus("0")
        result1 = self.function1()
        self.updateStatus("1")
        result2 = self.function2(result1)
        self.updateStatus("2")
        result3 = self.function3(result2)
        self.updateStatus("3")
        self.finish()
ElJeffe asked Oct 24 '12 14:10


1 Answer

Here's a complete sample Tornado app that uses the async HTTP client and gen.Task (from the tornado.gen module) to keep things simple.

If you read more about gen.Task in the docs, you'll see that you can actually dispatch multiple requests at the same time. This uses the core idea of Tornado: everything is non-blocking while still running in a single process.
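As a rough illustration of dispatching several requests at once on a single thread, here is a minimal sketch in plain asyncio, which Tornado 5 and later run on. `fake_fetch` is a hypothetical stand-in for an asynchronous HTTP fetch (it only sleeps), and the URL strings are placeholders, so nothing here touches the network.

```python
import asyncio
import time


async def fake_fetch(url, delay=0.2):
    # Hypothetical stand-in for an asynchronous HTTP fetch: awaiting the
    # sleep hands control back to the event loop, as a real fetch would.
    await asyncio.sleep(delay)
    return "response from %s" % url


async def fetch_all(urls):
    # gather() starts every coroutine before waiting on any of them, so
    # the waits overlap instead of running back to back.
    return await asyncio.gather(*(fake_fetch(u) for u in urls))


def timed_fetch(urls):
    # Run the concurrent fetches and report how long they took in total.
    start = time.monotonic()
    results = asyncio.run(fetch_all(urls))
    return results, time.monotonic() - start
```

Calling `timed_fetch(["a", "b", "c"])` should take roughly as long as one fetch rather than three, because the waits overlap on the same thread.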

Update: I've added a ThreadHandler to demonstrate how you could dispatch work to a second thread and receive the callback() when it's done.

import os
import threading
import tornado.options
import tornado.ioloop
import tornado.httpserver
import tornado.httpclient
import tornado.web
from tornado import gen
from tornado.web import asynchronous

tornado.options.define('port', type=int, default=9000, help='server port number (default: 9000)')
tornado.options.define('debug', type=bool, default=False, help='run in debug mode with autoreload (default: False)')

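class Worker(threading.Thread):
    def __init__(self, callback=None, *args, **kwargs):
        super(Worker, self).__init__(*args, **kwargs)
        self.callback = callback
        # Capture the IOLoop while still on the IOLoop thread.
        self.io_loop = tornado.ioloop.IOLoop.instance()

    def run(self):
        import time
        time.sleep(10)  # stand-in for the long-running work
        # RequestHandler methods are not thread-safe, so schedule the
        # callback on the IOLoop thread instead of calling it from here.
        self.io_loop.add_callback(lambda: self.callback('DONE'))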

class Application(tornado.web.Application):
    def __init__(self):
        handlers = [
            (r"/", IndexHandler),
            (r"/thread", ThreadHandler),
        ]
        settings = dict(
            static_path = os.path.join(os.path.dirname(__file__), "static"),
            template_path = os.path.join(os.path.dirname(__file__), "templates"),
            debug = tornado.options.options.debug,
        )
        tornado.web.Application.__init__(self, handlers, **settings)

class IndexHandler(tornado.web.RequestHandler):
    client = tornado.httpclient.AsyncHTTPClient()

    @asynchronous
    @gen.engine
    def get(self):
        response = yield gen.Task(self.client.fetch, "http://google.com")

        self.finish("Google's homepage is %d bytes long" % len(response.body))

class ThreadHandler(tornado.web.RequestHandler):
    @asynchronous
    def get(self):
        Worker(self.worker_done).start()

    def worker_done(self, value):
        self.finish(value)

def main():
    tornado.options.parse_command_line()
    http_server = tornado.httpserver.HTTPServer(Application())
    http_server.listen(tornado.options.options.port)
    tornado.ioloop.IOLoop.instance().start()

if __name__ == "__main__":
    main()
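The `@asynchronous` decorator, `gen.engine` and `gen.Task` used above were removed in Tornado 6.0, so on current versions the same idea is usually written with `async def` and a thread pool. The following is a minimal sketch of that pattern in plain asyncio, not a drop-in replacement for the app above. `blocking_step`, `STATUS`, `process` and `poll_status` are hypothetical names standing in for the question's function1/function2/function3, updateStatus, ProcessHandler and StatusHandler. In Tornado 5 and later, where the IOLoop wraps the asyncio event loop, the same `await loop.run_in_executor(...)` call should work inside a handler declared as `async def get(self)`.

```python
import asyncio
import time

STATUS = {"step": "not started"}  # shared progress marker (hypothetical)


def blocking_step(name, seconds=0.1):
    # Stand-in for function1/function2/function3 in the question:
    # time.sleep blocks whichever thread runs it.
    time.sleep(seconds)
    return name + " done"


async def process():
    # Plays the role of ProcessHandler.get: each blocking step runs in the
    # default thread pool, and the await yields to the event loop meanwhile.
    loop = asyncio.get_running_loop()
    results = []
    for i, name in enumerate(["function1", "function2", "function3"], start=1):
        results.append(await loop.run_in_executor(None, blocking_step, name))
        STATUS["step"] = str(i)
    return results


async def poll_status(samples, interval=0.05):
    # Plays the role of StatusHandler.get being hit repeatedly; it only
    # makes progress if the event loop is not blocked.
    seen = []
    for _ in range(samples):
        seen.append(STATUS["step"])
        await asyncio.sleep(interval)
    return seen


async def demo():
    # Run the long process and the status polling side by side.
    return await asyncio.gather(process(), poll_status(8))


def run_demo():
    STATUS["step"] = "0"
    return asyncio.run(demo())
```

`run_demo()` returns the step results together with the status values observed while the steps were still running, which shows that the polling side keeps getting served. Threads suit work that mostly waits on I/O; CPU-heavy steps would still compete for the interpreter lock, in which case a process pool or a separate worker is the usual alternative.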
koblas answered Oct 04 '22 01:10