
Sharing static global data among processes in a Gunicorn / Flask app

I have a Flask app running under Gunicorn, using the sync worker type with 20 worker processes. The app reads a lot of data on startup, which takes time and uses memory. Worse, each process loads its own copy, which causes it to take even longer and take 20X the memory. The data is static and doesn't change. I'd like to load it once and have all 20 workers share it.

If I use the preload_app setting, the data is loaded only once and initially takes only 1X memory, but memory use then seems to balloon to 20X once requests start coming in. I need fast random access to the data, so I'd rather not do IPC.

Is there any way to share static data among Gunicorn processes?

Doctor J asked Nov 10 '14 22:11


People also ask

Are global variables thread-safe in Flask? How do I share data between requests?

You can't use global variables to hold this sort of data. Not only is it not thread safe, it's not process safe, and WSGI servers in production spawn multiple processes. Not only would your counts be wrong if you were using threads to handle requests, they would also vary depending on which process handled the request.

Do Gunicorn workers share memory?

Gunicorn also allows each of the workers to have multiple threads. In this case, the Python application is loaded once per worker, and each of the threads spawned by the same worker shares the same memory space. This is enabled with the threads setting, which uses the gthread worker class.

Are Gunicorn workers processes or threads?

Gunicorn is based on the pre-fork worker model. This means that there is a central master process that manages a set of worker processes. The master never knows anything about individual clients. All requests and responses are handled completely by worker processes.

Does Gunicorn use multiprocessing?

When a Gunicorn server starts, multiple processes, a.k.a. 'workers', are spawned to handle the individual requests that the application receives.


2 Answers

Memory mapped files will allow you to share pages between processes.

https://docs.python.org/3/library/mmap.html

Note that per-process memory consumption statistics are often misleading, because pages shared between processes are counted once for each process. It is usually better to look at the output of vmstat and check whether you are swapping a lot.
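As a rough illustration of the memory-mapped approach, here is a minimal sketch using only the standard library. It assumes the static data can be stored as fixed-width binary records (here one float64 per record); the record layout, the file name and the helper names build_data_file, open_shared and get_value are hypothetical and not from the question. The file would be written once before the server starts, and each worker would map it read-only.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: one little-endian float64 per record.
RECORD = struct.Struct("<d")


def build_data_file(path, values):
    """Write the static data once, e.g. in a build step before starting Gunicorn."""
    with open(path, "wb") as f:
        for value in values:
            f.write(RECORD.pack(value))


def open_shared(path):
    """Map the file read-only. Every process that maps the same file
    reads from the same pages in the OS page cache."""
    with open(path, "rb") as f:
        # mmap keeps its own duplicate of the descriptor, so the file
        # object can be closed once the mapping exists.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def get_value(mapped, index):
    """Random access to one record without copying the whole file into Python objects."""
    return RECORD.unpack_from(mapped, index * RECORD.size)[0]


if __name__ == "__main__":
    # Hypothetical location for the data file.
    data_path = os.path.join(tempfile.gettempdir(), "static_data.bin")
    build_data_file(data_path, [i * 0.5 for i in range(1000)])
    shared = open_shared(data_path)
    print(get_value(shared, 10))
    shared.close()
```

Because the mapping is read-only and the records are decoded on access rather than held as Python objects, the workers do not write to the shared pages, so they should stay shared. The trade-off is that the data has to live in a flat binary layout instead of an ordinary Python data structure.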

aaa90210 answered Sep 26 '22 00:09


Assuming your priority is to keep the data as a Python data structure instead of moving it to a database such as Redis, then you'll have to change things so that you can use a single process for your server.

Gunicorn can work with gevent to create a server that supports multiple clients within a single worker process using coroutines. That could be a good option for your needs.
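A single gevent worker could be configured along these lines. This is only a sketch: the file name gunicorn.conf.py and the module path app:app in the command are assumptions, the connection limit is Gunicorn's default written out explicitly, and the gevent package has to be installed separately.

```python
# gunicorn.conf.py (assumed file name, passed with `gunicorn -c gunicorn.conf.py app:app`)

# One worker process, so the data is loaded exactly once.
workers = 1

# Serve many clients concurrently inside that process using gevent coroutines.
worker_class = "gevent"

# Maximum number of simultaneous clients per worker (1000 is Gunicorn's default).
worker_connections = 1000
```

With a single process, all request handling runs on one CPU core, so this suits I/O-bound request handling better than CPU-heavy work.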

Miguel answered Sep 25 '22 00:09