Celery tasks are re-queued into the broker on graceful shutdown when run locally, but get lost in Kubernetes despite the same configs

I have Celery running in a k8s pod. This is my manifest for Celery:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: celery
  labels:
    deployment: celery
spec:
  replicas: 2
  selector:
    matchLabels:
      pod: celery
  template:
    metadata:
      labels:
        pod: celery
    spec:
      containers:
        - name: celery
          image: local_celery:latest
          imagePullPolicy: Never
          command: ['celery', '-A', 'proj', 'worker', '-E', '-l', 'info',]
          resources:
            limits:
              cpu: 50m
            requests:
              cpu: 50m

      terminationGracePeriodSeconds: 25

My Celery configs in the Django settings.py are:


CELERY_TASK_ACKS_LATE = True
CELERY_WORKER_PREFETCH_MULTIPLIER = 1
CELERY_BROKER_URL = 'redis://redis:6379'
CELERY_RESULT_BACKEND = 'django-db'
CELERY_WORKER_CONCURRENCY=1
CEELERY_TASK_REJECT_ON_WORKER_LOST=True

When I run a simple Django app locally with Celery and Redis as the message broker, my task gets re-queued into the broker when I press Ctrl-C to initiate a warm shutdown of the worker. But when the same application is deployed to Kubernetes, with Celery, Django and Redis running in 3 different pods, my tasks aren't re-queued back to Redis when the Celery pod is gracefully terminated. I am unable to understand why. My Celery settings are unchanged in both cases.

sap asked Sep 01 '25 03:09


1 Answer

Are you sure your Celery worker is being shut down gracefully on Kubernetes?

People often say that k8s sends SIGTERM to a pod's containers when the pod is terminating, and SIGKILL after terminationGracePeriodSeconds if the pod is still there. What is not mentioned enough is that k8s sends that SIGTERM to process ID 1 (PID 1) of the container. So if your main Celery worker process is not PID 1, the graceful shutdown will not happen. This is the case if you are not running the actual celery command inline but launch it from a script file with sh/bash: the shell is then PID 1, and it does not forward the signal to the worker it started.
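To illustrate the PID point, here is a minimal Python sketch, assuming a POSIX system. It is not taken from the question's setup: the launcher strings and the helper name `pids_of_launcher_and_command` are hypothetical. It compares a launcher that replaces itself with the final command via exec (the command keeps the launcher's PID, which is what you want from PID 1 in a container) with a launcher that starts the command as a child (the command gets a different PID, so a signal sent to the launcher never reaches it unless the launcher forwards it).

```python
import subprocess
import sys

# Tiny command that only reports its own PID (stands in for the worker).
REPORT_PID = "import os; print(os.getpid())"

# Hypothetical launcher 1: replaces itself with the command via exec,
# so the command inherits the launcher's PID.
EXEC_LAUNCHER = (
    "import os, sys\n"
    "print(os.getpid(), flush=True)\n"
    f"os.execv(sys.executable, [sys.executable, '-c', {REPORT_PID!r}])\n"
)

# Hypothetical launcher 2: starts the command as a child process,
# so the command runs under a different PID than the launcher.
SPAWN_LAUNCHER = (
    "import os, subprocess, sys\n"
    "print(os.getpid(), flush=True)\n"
    f"subprocess.run([sys.executable, '-c', {REPORT_PID!r}], check=True)\n"
)


def pids_of_launcher_and_command(launcher_source):
    """Run a launcher and return (launcher PID, command PID)."""
    result = subprocess.run(
        [sys.executable, "-c", launcher_source],
        capture_output=True,
        text=True,
        check=True,
    )
    launcher_pid, command_pid = result.stdout.split()
    return int(launcher_pid), int(command_pid)


if __name__ == "__main__":
    print("exec :", pids_of_launcher_and_command(EXEC_LAUNCHER))
    print("spawn:", pids_of_launcher_and_command(SPAWN_LAUNCHER))
```

In a shell entrypoint script, the equivalent of the first launcher is starting the worker with the shell's `exec` builtin, so that the celery process takes over the script's PID.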

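Also, a process running as PID 1 must implement its own signal handlers, because Linux does not apply the default action for a signal such as SIGTERM to PID 1 if no handler is installed. Celery, in our case, obviously has that.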
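For a process of your own that ends up as PID 1, a handler could look roughly like the sketch below. This is not Celery's actual implementation, and the names `handle_sigterm` and `run_until_terminated` are made up for illustration. It only shows the general pattern: install a SIGTERM handler that sets a flag, and let the main loop finish its current work and exit once the flag is set.

```python
import signal
import time

# Set to True by the handler once SIGTERM has been received.
shutdown_requested = False


def handle_sigterm(signum, frame):
    """Record the shutdown request instead of dying immediately."""
    global shutdown_requested
    shutdown_requested = True


# Must be called from the main thread of the process.
signal.signal(signal.SIGTERM, handle_sigterm)


def run_until_terminated(poll_interval=0.01, max_wait=2.0):
    """Loop until SIGTERM arrives or max_wait seconds pass.

    Returns True if a shutdown was requested. A real worker would
    finish or re-queue its current task before returning.
    """
    deadline = time.monotonic() + max_wait
    while not shutdown_requested and time.monotonic() < deadline:
        time.sleep(poll_interval)
    return shutdown_requested
```

The `max_wait` bound exists only so the sketch cannot hang when no signal arrives; a long-running worker would loop until the flag is set.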

Suraj Shrestha answered Sep 02 '25 18:09