Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Celery Tasks are executing multiple times in Django Application

I have a Django application where I defined a few @task functions under task.py to execute at given periodic task. I'm 100% sure that the issue is not caused by task.py or any code related but due to some configuration may be in settings.py or my celery worker.

Task does execute at periodic task but at multiple times.

Here are the celery worker logs:

celery -A cimexmonitor worker --loglevel=info -B -c 4

[2019-09-19 21:22:16,360: INFO/ForkPoolWorker-5] Project Monitor Started : APPProject1
[2019-09-19 21:22:16,361: INFO/ForkPoolWorker-4] Project Monitor Started : APPProject1
[2019-09-19 21:25:22,108: INFO/ForkPoolWorker-4] Project Monitor DONE : APPProject1
[2019-09-19 21:25:45,255: INFO/ForkPoolWorker-5] Project Monitor DONE : APPProject1
[2019-09-20 00:22:16,395: INFO/ForkPoolWorker-4] Project Monitor Started : APPProject2
[2019-09-20 00:22:16,398: INFO/ForkPoolWorker-5] Project Monitor Started : APPProject2
[2019-09-20 01:22:11,554: INFO/ForkPoolWorker-5] Project Monitor DONE : APPProject2
[2019-09-20 01:22:12,047: INFO/ForkPoolWorker-4] Project Monitor DONE : APPProject2
  • If you check above time interval, tasks.py executes one task but 2 workers of celery takes the task & executes the same task at the same interval. I'm not sure why 2 workers took for one task?

  • settings.py

..
..
# Internationalization
# https://docs.djangoproject.com/en/2.1/topics/i18n/

LANGUAGE_CODE = 'en-us'

TIME_ZONE = 'Asia/Kolkata'

USE_I18N = True

USE_L10N = True

USE_TZ = True
..
..
..
######## CELERY : CONFIG
CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ENABLE_UTC = True
CELERYBEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
  • celery.py
from __future__ import absolute_import, unicode_literals
from celery import Celery 
import os
from django.conf import settings

os.environ.setdefault('DJANGO_SETTINGS_MODULE','cimexmonitor.settings')
## set the default Django settings module for the 'celery' program.

# Using a string here means the worker don't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.

app = Celery('cimexmonitor')
#app.config_from_object('django.conf:settings', namespace='CELERY') 
app.config_from_object('django.conf:settings')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks(settings.INSTALLED_APPS)

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
  • Other information:
→ celery --version
4.3.0 (rhubarb)

→ redis-server --version
Redis server v=3.0.6 sha=00000000:0 malloc=jemalloc-3.6.0 bits=64 build=7785291a3d2152db

django-admin-interface==0.9.2
django-celery-beat==1.5.0
  • Please help me the ways to debug the problem:

Thanks

like image 602
Arbazz Hussain Avatar asked Sep 20 '19 13:09

Arbazz Hussain


People also ask

How do you stop the execution of a Celery task?

revoke cancels the task execution. If a task is revoked, the workers ignore the task and do not execute it. If you don't use persistent revokes your task can be executed after worker's restart. revoke has an terminate option which is False by default.

How many tasks can Celery handle?

celery beats only trigger those 1000 tasks (by the crontab schedule), not run them. If you want to run 1000 tasks in parallel, you should have enough celery workers available to run those tasks.

Does Celery use multiprocessing?

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operations but supports scheduling as well. The execution units, called tasks, are executed concurrently on one or more worker servers using multiprocessing, Eventlet, or gevent.


2 Answers

Both the worker and beat services need to be running at the same time to execute periodically task as per https://github.com/celery/django-celery-beat

  • WORKER:
 $ celery -A [project-name] worker --loglevel=info -B -c 5
  • Django scheduler:
celery -A [project-name] beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler
  • I was running both worker,database scheduler at same time, which was said as per the documentation, Which was causing the issues to be executed at the same time, I'm really not sure how celery worker started working as a DB scheduler at the same time.
  • just running celery worker solved my problem.
like image 180
Arbazz Hussain Avatar answered Oct 12 '22 09:10

Arbazz Hussain


From the official documentation: Ensuring a task is only executed one at a time.

Also, I hope you are not running multiple workers the same way (celery -A cimexmonitor worker --loglevel=info -B -c 4) as that would mean you have multiple celery beats scheduling tasks to run... In short - make sure you only have one Celery beat running!

like image 33
DejanLekic Avatar answered Oct 12 '22 08:10

DejanLekic