Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

python http server, multiple simultaneous requests

I have developed a rather extensive http server written in python utilizing tornado. Without setting anything special, the server blocks on requests and can only handle one at a time. The requests basically access data (mysql/redis) and print it out in json. These requests can take upwards of a second at the worst case. The problem is that a request comes in that takes a long time (3s), then an easy request comes in immediately after that would take 5ms to handle. Well since that first request is going to take 3s, the second one doesn't start until the first one is done. So the second request takes >3s to be handled.

How can I make this situation better? I need that second simple request to begin executing regardless of other requests. I'm new to python, and more experienced with apache/php where there is no notion of two separate requests blocking each other. I've looked into mod_python to emulate the php example, but that seems to block as well. Can I change my tornado server to get the functionality that I want? Everywhere I read, it says that tornado is great at handling multiple simultaneous requests.

Here is the demo code I'm working with. I have a sleep command which I'm using to test if the concurrency works. Is sleep a fair way to test concurrency?

import tornado.httpserver
import tornado.ioloop
import tornado.web
import tornado.gen
import time

class MainHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    @tornado.gen.engine

    def handlePing1(self):
        time.sleep(4)#simulating an expensive mysql call
        self.write("response to browser ....")
        self.finish()

    def get(self):
        start = time.time()
        self.handlePing1()
        #response = yield gen.Task(handlePing1)#i see tutorials around that suggest using something like this ....

        print "done with request ...", self.request.path, round((time.time()-start),3)



application = tornado.web.Application([
        (r"/.*", MainHandler),
])

if __name__ == "__main__":
    http_server = tornado.httpserver.HTTPServer(application)
    port=8833;
    http_server.listen(port)
    print "listening on "+str(port);
    tornado.ioloop.IOLoop.instance().start()

Thanks for any help!

like image 408
Landon Avatar asked Aug 13 '13 22:08

Landon


1 Answers

Edit: remember that Redis is also single threaded, so even if you have concurrent requests, your bottleneck will be Redis. You won't be able to process more requests because Redis won't be able to process them.

Tornado is single-threaded, event-loop based server.

From the documentation:

By using non-blocking network I/O, Tornado can scale to tens of thousands of open connections, making it ideal for long polling, WebSockets, and other applications that require a long-lived connection to each user.

Concurrency in tornado is achieved through asynchronous callbacks. The idea is to do as little as possible in the main event loop (single-threaded) to avoid blocking and defer i/o operations through callbacks.

If using asynchronous operations doesn't work for you (ex: no async driver for MySQL, or Redis), your only way of handling more concurrent requests is to run multiple processes.

The easiest way is to front your tornado processes with a reverse-proxy like HAProxy or Nginx. The tornado doc recommends Nginx: http://www.tornadoweb.org/en/stable/overview.html#running-tornado-in-production

Your basically run multiple versions of your app on different ports. Ex:

python app.py --port=8000
python app.py --port=8001
python app.py --port=8002
python app.py --port=8003 

A good rule of thumb is to run 1 process for each core on your server.

Nginx will take care of balancing each incoming requests to the different backends. So if one of the request is slow (~ 3s) you have n-1 other processes listening for incoming requests. It is possible – and very likely – that all processes will be busy processing a slow-ish request, in which case requests will be queued and processed when any process becomes free, eg. finished processing the request.

I strongly recommend you start with Nginx before trying HAProxy as the latter is a little bit more advanced and thus a bit more complex to setup properly (lots of switches to tweak).

Hope this helps. Key take-away: Tornado is great for async I/O, less so for CPU heavy workloads.

like image 95
Tim Avatar answered Sep 19 '22 12:09

Tim