Gunicorn Workers and Threads

Tags:

gunicorn

In terms of Gunicorn, I am aware there are various worker classes but for this conversation I am just looking at the sync and async types.

From my understanding ...

sync workers = (2 * cpu) + 1
worker_class = sync

async (gevent) workers = 1
worker_class = gevent
worker_connections = a value (let's say 2000)

So (based on a 4 core system) using sync workers I can have a maximum of 9 connections processing in parallel. With Async I can have up to 2000, with the caveats that come with async.
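The arithmetic behind those two numbers can be sketched as follows. This is only an illustration of the reasoning in the question, not anything Gunicorn itself exposes: the function names `sync_worker_count` and `max_in_flight` are hypothetical, and the 4-core machine and the value 2000 are the assumptions stated above.

```python
def sync_worker_count(cpu_cores: int) -> int:
    """Rule of thumb quoted in the question: (2 * cpu) + 1."""
    return 2 * cpu_cores + 1


def max_in_flight(workers: int, per_worker: int = 1) -> int:
    """Upper bound on requests being handled at once.

    per_worker is 1 for a sync worker (one request at a time) and
    worker_connections for a gevent worker.
    """
    return workers * per_worker


# 4-core machine, sync workers: 9 workers, one request each.
sync_capacity = max_in_flight(sync_worker_count(4))

# One gevent worker with worker_connections = 2000 (value assumed in the question).
gevent_capacity = max_in_flight(1, per_worker=2000)
```

These are ceilings on concurrency, not throughput predictions; how many requests actually complete per second depends on what each request does.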

Questions

1. So where do threads fit in? Can I add threads to both the sync and async worker types?
2. What is the best worker option if I want to place gunicorn in front of a Django API that must process hundreds of requests in parallel?
3. Are the gevent and sync worker classes thread safe?

asked Jul 17 '16 20:07

1 Answer

Let me attempt an answer. Let us assume that at the beginning my deployment only has a single gunicorn worker. This allows me to handle only one request at a time. My worker's work is just to make a call to google.com and get the search results for a query. Now I want to increase my throughput. I have the below options:

Keep one worker only and increase number of threads in that worker

This is the easiest. Since threads are more lightweight (less memory consumption) than processes, I keep only one worker and add several threads to it. That worker can then accept more than one request at a time. With, say, 4 threads, it is able to handle 4 requests concurrently. Fantastic. Now why would I ever need more workers?

To answer that, assume that I need to do some work on the search results that google returned. For instance, I might also want to calculate a prime number for each result. Now my workload becomes compute bound and I hit the problem with Python's global interpreter lock (GIL). Even though I have 4 threads, only one thread can actually process the results at a time. This means that to get true parallel performance I need more than one worker.
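The I/O-bound half of this argument can be seen in a small standalone sketch. It does not use Gunicorn at all; `fake_search` is a hypothetical stand-in for the call to google.com that sleeps instead of doing network I/O, and the pool of 4 threads mirrors the 4 threads assumed above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_search(query: str) -> str:
    """Hypothetical stand-in for the google.com call."""
    # While a thread sleeps (or waits on a socket) it releases the GIL,
    # so the other threads can make progress.
    time.sleep(0.2)
    return f"results for {query}"


def run_in_threads(queries, num_threads=4):
    """Run all queries on a thread pool and report the wall-clock time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(fake_search, queries))
    return results, time.perf_counter() - start


queries = ["a", "b", "c", "d"]
results, elapsed = run_in_threads(queries)
print(f"{len(results)} queries in {elapsed:.2f}s")
```

Four waits of 0.2 s overlap, so the total stays well below the 0.8 s a single thread would need. If `fake_search` were replaced with pure-Python number crunching, the threads would take turns holding the GIL and the overlap would largely disappear, which is the case that calls for more workers.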

Increase Number of workers but all workers are single threaded

I need this when I want true parallel processing. Each worker can make its own call to google.com, get results and do any processing, all in parallel with the other workers. Fantastic. But the downside is that processes are heavier, and my system might not keep up if I keep adding workers to achieve parallelism. So the best solution is to increase the number of workers and also add more threads to each worker.

Increase Number of workers and each worker is multithreaded

I guess this needs no further explanation.
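For completeness, here is roughly what that combination could look like in a Gunicorn config file, which is plain Python. The file name `gunicorn.conf.py`, the bind address and the specific numbers are assumptions chosen for illustration; `workers`, `threads`, `worker_class` and `bind` are standard Gunicorn settings.

```python
# gunicorn.conf.py (illustrative values, tune for your own machine)
import multiprocessing

# One process per core for true parallelism (an assumption; the question's
# (2 * cpu) + 1 rule of thumb is another common starting point).
workers = multiprocessing.cpu_count()

# Several threads inside each worker for overlapping I/O waits.
threads = 4

# The threads setting applies to the threaded worker class.
worker_class = "gthread"

# Hypothetical local address.
bind = "127.0.0.1:8000"
```

With this layout the number of requests in flight is at most `workers * threads`. It would typically be started with something like `gunicorn -c gunicorn.conf.py myproject.wsgi`, where `myproject.wsgi` is a placeholder for your own WSGI module.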

Change worker type to Async

Now why would I ever want to do this? To answer, remember that even threads consume memory. The gevent library implements coroutines (a radical construct that you can look up), which give you thread-like concurrency without having to create threads. So if you configure gunicorn with a worker_class of gevent, you get the benefit of NOT having to create threads in your workers. Think of it as getting threads without having to create them explicitly.
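To make "threads without creating threads" concrete, here is a small sketch. Note that it uses the standard library's asyncio rather than gevent, so it is an analogy for the idea and not how the gevent worker is implemented; `fake_request` and `serve_many` are hypothetical names.

```python
import asyncio
import threading


async def fake_request(i: int, seen_threads: set) -> int:
    """Hypothetical request handler that waits on simulated I/O."""
    seen_threads.add(threading.get_ident())  # record which OS thread ran us
    await asyncio.sleep(0.01)                # yield control to other coroutines
    return i


async def serve_many(n: int):
    """Run n handlers concurrently and report which threads were used."""
    seen_threads = set()
    results = await asyncio.gather(
        *(fake_request(i, seen_threads) for i in range(n))
    )
    return results, seen_threads


results, seen_threads = asyncio.run(serve_many(2000))
print(len(results), "requests handled on", len(seen_threads), "thread(s)")
```

All 2000 handlers are in flight at once, yet only one OS thread is involved. The same property is what lets a single async worker hold many connections, and it is also why one slow compute-bound handler would stall everything else in that worker.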

So, to answer your question, if you are using a worker_class of anything other than sync, you do not need to increase the number of threads in your gunicorn configuration. You can do it, by all means, but it kind of defeats the purpose.
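A config along those lines might look like the sketch below. The values are the ones assumed in the question, not recommendations, and the gevent worker needs the gevent package installed alongside Gunicorn.

```python
# gunicorn.conf.py (illustrative values taken from the question)

# Use the gevent-based async worker instead of the default sync worker.
worker_class = "gevent"

# A single worker process, as assumed in the question.
workers = 1

# Maximum number of simultaneous clients handled by each worker.
worker_connections = 2000

# No threads setting here: concurrency comes from gevent, not from threads.
```

Running more than one gevent worker is also possible if you want to use several cores; the single worker here just mirrors the question.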

Hope this helped.

I will also attempt to answer the specific questions.

1. No, the threads option is not available for the async worker classes. This really needs to be made clearer in the documentation. I wonder why that has not happened.
2. This is a question that needs more knowledge of your specific application. If processing these hundreds of parallel requests only involves I/O operations, like fetching from a DB, saving, or collecting data from some other application, then you can make use of the threaded worker. But if that is not the case and you want to execute on an n-core CPU because the tasks are extremely compute bound, like calculating primes, you need to make use of the sync worker. The reasoning for async is slightly different. To use async, you need to be sure that your processing is not compute bound, since you will not be able to make use of multiple cores. The advantage is that you avoid the memory that multiple threads would take. But you have other issues, like libraries that are not monkey patched. Move to async only if the threaded worker does not meet your requirements.
3. Sync, non-threaded workers are the best option if you want absolute thread safety among your libraries.

answered Dec 04 '22 15:12

abhayAndPoorvisDad
