I want to share small pieces of information (for example cached authorization tokens, statistics, ...) between my worker nodes in Celery.
If I create a global variable inside my tasks file, it is unique per worker (my workers are processes and have a lifetime of one task/execution).
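Roughly what I mean, as a hypothetical tasks.py (the app name, broker URL and token logic are just placeholders):

from celery import Celery

app = Celery('myapp', broker='redis://localhost:6379/0')

# Module-level global: every worker process gets its own copy,
# so nothing stored here is visible to the other workers.
token_cache = {}

@app.task
def call_backend(user_id):
    token = token_cache.get(user_id)      # almost always None in a fresh process
    if token is None:
        token = 'freshly-fetched-token'   # placeholder for a real auth call
        token_cache[user_id] = token      # lost once this process is recycled
    return token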
What is the best practice? Should I save the state externally (in a DB), or set up old-fashioned shared memory (which could be difficult because of the different pool implementations in Celery)?
Thanks in advance!
Celery itself uses billiard (a multiprocessing fork) to run your tasks in separate processes.
The "shared_task" decorator allows creation of Celery tasks for reusable apps as it doesn't need the instance of the Celery app. It is also easier way to define a task as you don't need to import the Celery app instance.
Once you integrate Celery into your app, you can send time-intensive tasks to Celery's task queue. That way, your web app can continue to respond quickly to users while Celery completes expensive operations asynchronously in the background.
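For example, a minimal task defined with shared_task and dispatched to the queue (the task name and arguments here are only illustrative):

from celery import shared_task

@shared_task
def add(x, y):
    # No Celery app instance is imported here; the task attaches itself
    # to whichever app is configured when the worker starts.
    return x + y

# In the web app: queue the work instead of running it inline.
result = add.delay(2, 3)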
It looks like Celery does not release memory after a task finishes. Every time a task finishes, it leaks roughly 5-10 MB of memory, so after thousands of tasks it will soon use up all available memory.
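The one-task-per-process setup mentioned in the question is the usual workaround for this; Celery can be told explicitly to recycle each pool process (and return its memory to the OS) after a fixed number of tasks, for example:

# celeryconfig.py (sketch): restart each pool process after one task,
# equivalent to the --max-tasks-per-child=1 command-line option.
worker_max_tasks_per_child = 1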
I finally found a decent solution - the multiprocessing Manager from the Python standard library:
from multiprocessing import Manager
manag = Manager()
serviceLock = manag.Lock()
serviceStatusDict = manag.dict()
This dict can be accessed from every process and is kept synchronized, but you have to take the lock when doing compound updates on it concurrently (just as with any other shared-memory implementation).
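A sketch of how your tasks could use it, reusing the names from the snippet above (the status counting itself is just a placeholder):

def report_status(service_name):
    # Read-modify-write on the shared dict is not atomic, so hold the
    # Manager lock for the whole update.
    with serviceLock:
        count = serviceStatusDict.get(service_name, 0)
        serviceStatusDict[service_name] = count + 1
        return serviceStatusDict[service_name]

Note that the Manager and its proxies should be created in the parent process before the worker pool forks (e.g. at module import time), so that every child ends up talking to the same Manager.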