It looks like Celery does not release memory after a task finishes. Every time a task finishes, there is a 5-10 MB memory leak, so with thousands of tasks it will soon use up all the memory.
BROKER_URL = 'amqp://user@localhost:5672/vhost'
# CELERY_RESULT_BACKEND = 'amqp://user@localhost:5672/vhost'
CELERY_IMPORTS = (
    'tasks.tasks',
)
CELERY_IGNORE_RESULT = True
CELERY_DISABLE_RATE_LIMITS = True
# CELERY_ACKS_LATE = True
CELERY_TASK_RESULT_EXPIRES = 3600
# maximum time for a task to execute
CELERYD_TASK_TIME_LIMIT = 600
CELERY_DEFAULT_ROUTING_KEY = "default"
CELERY_DEFAULT_QUEUE = 'default'
CELERY_DEFAULT_EXCHANGE = "default"
CELERY_DEFAULT_EXCHANGE_TYPE = "direct"
# CELERYD_MAX_TASKS_PER_CHILD = 50
CELERYD_CONCURRENCY = 2
This might be the same as this question, but it does not have an answer: RabbitMQ/Celery/Django Memory Leak?
I am not using Django, and my packages are:
Chameleon==2.11
Fabric==1.6.0
Mako==0.8.0
MarkupSafe==0.15
MySQL-python==1.2.4
Paste==1.7.5.1
PasteDeploy==1.5.0
SQLAlchemy==0.8.1
WebOb==1.2.3
altgraph==0.10.2
amqp==1.0.11
anyjson==0.3.3
argparse==1.2.1
billiard==2.7.3.28
biplist==0.5
celery==3.0.19
chaussette==0.9
distribute==0.6.34
flower==0.5.1
gevent==0.13.8
greenlet==0.4.1
kombu==2.5.10
macholib==1.5.1
objgraph==1.7.2
paramiko==1.10.1
pycrypto==2.6
pyes==0.20.0
pyramid==1.4.1
python-dateutil==2.1
redis==2.7.6
repoze.lru==0.6
requests==1.2.3
six==1.3.0
tornado==3.1
translationstring==1.1
urllib3==1.6
venusian==1.0a8
wsgiref==0.1.2
zope.deprecation==4.0.2
zope.interface==4.0.5
I just added a test task like the one below, where test_string is a big string, and it still leaks memory:
@celery.task(ignore_result=True)
def process_crash_xml(test_string, client_ip, request_timestamp):
    logger.info("%s %s" % (client_ip, request_timestamp))
    test = [test_string] * 5
There is a memory leak in the parent process of Celery's worker, not in the child processes executing tasks. It happens suddenly every few days, and unless you stop Celery, it consumes the server's memory within tens of hours. This problem happens at least in Celery 4.1, and it also occurs in Celery 4.2.
celery purge offers to erase tasks from one of the broadcast queues, and I don't see an option to pick a different named queue.
As for --concurrency: by default Celery uses multiprocessing to perform concurrent execution of tasks. The number of worker processes/threads can be changed with the --concurrency argument, and defaults to the number of available CPUs if not set.
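For example (a minimal sketch; the app module name tasks is only an assumption based on the CELERY_IMPORTS above):

celery worker -A tasks --concurrency=2 --loglevel=info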
To trace most memory blocks allocated by Python, the tracemalloc module should be started as early as possible: set the PYTHONTRACEMALLOC environment variable to 1, or use the -X tracemalloc command-line option. The tracemalloc.start() function can also be called at runtime to start tracing Python memory allocations.
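A minimal sketch of how that could be used to see where the memory goes (the snapshot point and the top-10 limit are just illustrative):

import tracemalloc

tracemalloc.start()  # or start the worker with PYTHONTRACEMALLOC=1 / -X tracemalloc

# ... let a few tasks run, then inspect the allocations ...
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:10]:
    print(stat)  # top 10 allocation sites by total size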
It was this config option that made my worker not release memory:
CELERYD_TASK_TIME_LIMIT = 600
refer to: https://github.com/celery/celery/issues/1427
There are two settings that can help you mitigate the growing memory consumption of Celery workers:
Max tasks per child setting (v2.0+):
With this option you can configure the maximum number of tasks a worker can execute before it's replaced by a new process. This is useful if you have memory leaks you have no control over, for example from closed-source C extensions.
Max memory per child setting (v4.0+):
With this option you can configure the maximum amount of resident memory a worker may consume before it's replaced by a new process. This is useful if you have memory leaks you have no control over, for example from closed-source C extensions.
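For example, a minimal sketch in the new lowercase settings style (Celery 4+; the values are illustrative, and the old-style equivalents are CELERYD_MAX_TASKS_PER_CHILD / CELERYD_MAX_MEMORY_PER_CHILD):

# recycle each worker child after 50 tasks or ~200 MB of resident memory
app.conf.worker_max_tasks_per_child = 50
app.conf.worker_max_memory_per_child = 200000  # in kilobytes, Celery 4.0+

Both can also be set from the worker command line via --max-tasks-per-child and --max-memory-per-child.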