I am trying to serve long-running requests using gunicorn and its async workers, but I can't find any examples that I can get to work. I used the example here, tweaked to add a fake delay (a 5-second sleep) before returning the response:
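import time

def app(environ, start_response):
    # The response body must be bytes under Python 3 (PEP 3333).
    data = b"Hello, World!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(data)))
    ])
    time.sleep(5)  # fake delay before returning the response
    return iter([data])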
Then I run gunicorn like this:
gunicorn -w 4 myapp:app -k gevent
When I open two browser tabs, enter http://127.0.0.1:8000/ in both of them, and send the requests at almost the same time, the requests appear to be processed sequentially: one returns after 5 seconds and the other returns after a further 5 seconds.
Q. I am guessing the sleep isn't gevent-friendly? But there are 4 workers, so even if the worker type were 'sync', shouldn't two workers handle two requests simultaneously?
Gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second. Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with. The threads setting is the number of worker threads for handling requests.
Gunicorn also allows for each of the workers to have multiple threads. In this case, the Python application is loaded once per worker, and each of the threads spawned by the same worker shares the same memory space.
Gunicorn relies on the operating system to provide all of the load balancing when handling requests.
The threads setting defaults to 1. It sets the number of threads in each worker process, so by default each gunicorn worker is single-threaded.
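The worker and thread settings described above could be written down as a gunicorn config file. This is only a sketch: the file name gunicorn.conf.py and the chosen values are assumptions for illustration, not something the original poster used.

```python
# gunicorn.conf.py (hypothetical file name for this sketch)
import multiprocessing

# Starting point suggested above: (2 x num_cores) + 1 worker processes.
workers = multiprocessing.cpu_count() * 2 + 1

# Threads per worker process; 1 is the default, so each worker is single-threaded.
threads = 1

# Address used in the question.
bind = "127.0.0.1:8000"
```

It could then be loaded with gunicorn -c gunicorn.conf.py myapp:app, with -k gevent added on the command line if the gevent worker is wanted.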
I just ran into the same thing and opened a question here: Requests not being distributed across gunicorn workers. The upshot is that the browser appears to serialize access to the same page. I'm guessing this has something to do with cacheability: the browser thinks the page is likely cacheable, waits until it loads, finds out it isn't, and only then makes the next request, and so on.
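To take the browser out of the picture, the two requests can be sent from a small script instead. The sketch below uses only the standard library; the helper names fetch and time_concurrent_requests are made up for this example, and it assumes the server is listening on 127.0.0.1:8000 as in the question.

```python
import http.client
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(host, port, path="/"):
    """Send one GET request and return how long it took, in seconds."""
    start = time.monotonic()
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        conn.request("GET", path)
        conn.getresponse().read()
    finally:
        conn.close()
    return time.monotonic() - start


def time_concurrent_requests(host, port, count=2):
    """Send `count` requests at the same time from separate threads.

    Returns (per_request_seconds, total_seconds).
    """
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=count) as pool:
        futures = [pool.submit(fetch, host, port) for _ in range(count)]
        per_request = [f.result() for f in futures]
    return per_request, time.monotonic() - start


# Example call against the server from the question (assumed address):
#   per_request, total = time_concurrent_requests("127.0.0.1", 8000)
#   print(per_request, total)
```

If the total is close to 5 seconds, the server handled both requests in parallel and the delay seen in the tabs comes from the browser. A total close to 10 seconds would point back at the server.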
Give gevent.sleep a shot instead of time.sleep.
It's weird that this is happening with -w 4, but -k gevent is an async worker type, so it's possible gunicorn is feeding both requests to the same worker. Assuming that's what's happening, time.sleep will block your process unless you use gevent.monkey.patch_all().