Celery does not release memory

It looks like Celery does not release memory after a task finishes. Every time a task completes, roughly 5-10 MB of memory is leaked, so with thousands of tasks the worker will soon use up all available memory.

BROKER_URL = 'amqp://user@localhost:5672/vhost'
# CELERY_RESULT_BACKEND = 'amqp://user@localhost:5672/vhost'

CELERY_IMPORTS = (
    'tasks.tasks',
)

CELERY_IGNORE_RESULT = True
CELERY_DISABLE_RATE_LIMITS = True
# CELERY_ACKS_LATE = True
CELERY_TASK_RESULT_EXPIRES = 3600
# maximum time for a task to execute
CELERYD_TASK_TIME_LIMIT = 600
CELERY_DEFAULT_ROUTING_KEY = "default"
CELERY_DEFAULT_QUEUE = 'default'
CELERY_DEFAULT_EXCHANGE = "default"
CELERY_DEFAULT_EXCHANGE_TYPE = "direct"
# CELERYD_MAX_TASKS_PER_CHILD = 50
CELERY_DISABLE_RATE_LIMITS = True
CELERYD_CONCURRENCY = 2

This might be the same issue, but that question has no answer: RabbitMQ/Celery/Django Memory Leak?

I am not using Django, and my packages are:

Chameleon==2.11
Fabric==1.6.0
Mako==0.8.0
MarkupSafe==0.15
MySQL-python==1.2.4
Paste==1.7.5.1
PasteDeploy==1.5.0
SQLAlchemy==0.8.1
WebOb==1.2.3
altgraph==0.10.2
amqp==1.0.11
anyjson==0.3.3
argparse==1.2.1
billiard==2.7.3.28
biplist==0.5
celery==3.0.19
chaussette==0.9
distribute==0.6.34
flower==0.5.1
gevent==0.13.8
greenlet==0.4.1
kombu==2.5.10
macholib==1.5.1
objgraph==1.7.2
paramiko==1.10.1
pycrypto==2.6
pyes==0.20.0
pyramid==1.4.1
python-dateutil==2.1
redis==2.7.6
repoze.lru==0.6
requests==1.2.3
six==1.3.0
tornado==3.1
translationstring==1.1
urllib3==1.6
venusian==1.0a8
wsgiref==0.1.2
zope.deprecation==4.0.2
zope.interface==4.0.5

I just added a test task like the following, where test_string is a large string, and it still leaks memory:

@celery.task(ignore_result=True)
def process_crash_xml(test_string, client_ip, request_timestamp):
    logger.info("%s %s" % (client_ip, request_timestamp))
    test = [test_string] * 5
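To check whether the task body itself retains memory, the same allocation pattern can be measured outside of Celery; a minimal sketch using the stdlib resource module (the function name mirrors the question's task and is purely illustrative, not any Celery API):

```python
import gc
import resource

def rss_kb():
    # Peak resident set size of this process (kilobytes on Linux).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def fake_process_crash_xml(test_string):
    before = rss_kb()
    test = [test_string] * 5     # same allocation pattern as the task above
    del test
    gc.collect()                 # collect now, so only genuinely retained memory shows up
    return rss_kb() - before     # peak growth; should stabilize across repeated calls
```

If repeated calls show the peak stabilizing after the first one, Python itself is releasing the memory, which points the blame at the worker configuration rather than the task.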
asked Jul 09 '13 by Dechao Qiu

People also ask

What is celery memory?

There is a memory leak in the parent process of Celery's worker, not in a child process executing a task. It happens suddenly every few days and, unless you stop Celery, it consumes the server's memory within tens of hours. This problem occurs at least in Celery 4.1, and it also occurs in Celery 4.2.

What does celery purge do?

celery purge offers to erase tasks from one of the broadcast queues, and I don't see an option to pick a different named queue.

What is concurrency in celery?

As for --concurrency, Celery by default uses multiprocessing to perform concurrent execution of tasks. The number of worker processes/threads can be changed using the --concurrency argument and defaults to the number of available CPUs if not set.
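The flag described above goes on the worker command line; a sketch, assuming an app module named proj (the module name is illustrative):

```shell
# Start a worker with exactly two pool processes instead of one per CPU core
celery worker -A proj --concurrency=2
```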

How do I use Tracemalloc in Python 3?

To trace most memory blocks allocated by Python, the module should be started as early as possible by setting the PYTHONTRACEMALLOC environment variable to 1, or by using the -X tracemalloc command line option. The tracemalloc.start() function can be called at runtime to start tracing Python memory allocations.
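The API described above can be exercised directly; a minimal sketch (the list comprehension is just a stand-in for a leaky task body):

```python
import tracemalloc

tracemalloc.start()                        # begin tracing Python allocations

data = ["x" * 1000 for _ in range(1000)]   # stand-in for a task's allocations

current, peak = tracemalloc.get_traced_memory()            # bytes held now / high-water mark
top = tracemalloc.take_snapshot().statistics("lineno")[:3] # biggest allocation sites
tracemalloc.stop()

print(f"current={current} peak={peak}")
```

Inspecting `top` between tasks shows which source lines are accumulating memory, which is often enough to distinguish a Python-level leak from one inside a C extension.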


2 Answers

It was this config option that caused my worker not to release memory:

CELERYD_TASK_TIME_LIMIT = 600

refer to: https://github.com/celery/celery/issues/1427
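Since the linked issue ties the leak to the hard time-limit machinery, one workaround (a sketch under that assumption, not a confirmed fix for every setup) is to rely on a soft limit and keep the hard limit only as a backstop, using the Celery 3.x setting names from the question's config:

```python
# Soft limit raises SoftTimeLimitExceeded inside the task so it can clean up;
# the hard limit remains only as a last-resort kill.
CELERYD_TASK_SOFT_TIME_LIMIT = 540
CELERYD_TASK_TIME_LIMIT = 600
```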

answered Oct 02 '22 by Dechao Qiu


There are two settings which can help you mitigate the growing memory consumption of Celery workers:

  • Max tasks per child setting (v2.0+):

    With this option you can configure the maximum number of tasks a worker can execute before it's replaced by a new process. This is useful if you have memory leaks you have no control over, for example from closed-source C extensions.

  • Max memory per child setting (v4.0+):

    With this option you can configure the maximum amount of resident memory a worker may consume before it's replaced by a new process. This is likewise useful for memory leaks you have no control over.
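The two options above can be sketched as worker settings; the names below are the Celery 4+ lowercase forms (in the 3.x style used in the question, the first is spelled CELERYD_MAX_TASKS_PER_CHILD), and the thresholds are illustrative:

```python
# Recycle each pool process after 50 tasks, or sooner if its resident
# memory exceeds ~200 MB (the value is in kilobytes).
worker_max_tasks_per_child = 50
worker_max_memory_per_child = 200_000
```

Recycling a pool process returns all of its memory to the OS, so even a leak you cannot fix stays bounded.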

answered Oct 02 '22 by Erik Kalkoken