
How to configure Celery Worker and Beat for Email Reporting in Apache Superset running on Docker?

I am running Superset via Docker. I enabled the Email Report feature and tried it.

However, I only receive the test email report; none of the scheduled reports arrive afterwards.

This is my CeleryConfig in superset_config.py:

class CeleryConfig(object):
    BROKER_URL = 'sqla+postgresql://superset:superset@db:5432/superset'
    CELERY_IMPORTS = (
        'superset.sql_lab',
        'superset.tasks',
    )
    CELERY_RESULT_BACKEND = 'db+postgresql://superset:superset@db:5432/superset'
    CELERYD_LOG_LEVEL = 'DEBUG'
    CELERYD_PREFETCH_MULTIPLIER = 10
    CELERY_ACKS_LATE = True
    CELERY_ANNOTATIONS = {
        'sql_lab.get_sql_results': {
            'rate_limit': '100/s',
        },
        'email_reports.send': {
            'rate_limit': '1/s',
            'time_limit': 120,
            'soft_time_limit': 150,
            'ignore_result': True,
        },
    }
    CELERYBEAT_SCHEDULE = {
        'email_reports.schedule_hourly': {
            'task': 'email_reports.schedule_hourly',
            'schedule': crontab(minute=1, hour='*'),
        },
    }

The documentation says I need to run the celery worker and beat.

celery worker --app=superset.tasks.celery_app:app --pool=prefork -O fair -c 4
celery beat --app=superset.tasks.celery_app:app

I added them to the 'docker-compose.yml':

superset-worker:
    build: *superset-build
    command: >
      sh -c "celery worker --app=superset.tasks.celery_app:app -Ofair -f /app/celery_worker.log &&
             celery beat --app=superset.tasks.celery_app:app -f /app/celery_beat.log"
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes

The Celery worker does work: it sends the first (test) email, and its log file is created. Celery beat, however, does not seem to be running, and no 'celery_beat.log' is ever created.

If you'd like a deeper insight, here's the commit with the full implementation of the functionality.

How do I correctly configure celery beat? How can I debug this?

asked Apr 20 '20 at 20:04 by Snow



1 Answer

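I managed to solve it by altering the CeleryConfig implementation and adding a separate beat service to 'docker-compose.yml'. In the original compose file the worker and beat commands were chained with '&&', so beat would only start once the worker had exited, which never happens while the worker is running. That is also why no 'celery_beat.log' was created.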

New CeleryConfig class in 'superset_config.py':

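from celery.schedules import crontab  # needed for CELERYBEAT_SCHEDULE below

# get_env_variable is assumed to be a helper defined earlier in superset_config.py
REDIS_HOST = get_env_variable("REDIS_HOST")
REDIS_PORT = get_env_variable("REDIS_PORT")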

class CeleryConfig(object):
    BROKER_URL = "redis://%s:%s/0" % (REDIS_HOST, REDIS_PORT)
    CELERY_IMPORTS = (
        'superset.sql_lab',
        'superset.tasks',
    )
    CELERY_RESULT_BACKEND = "redis://%s:%s/1" % (REDIS_HOST, REDIS_PORT)
    CELERY_ANNOTATIONS = {
        'sql_lab.get_sql_results': {
            'rate_limit': '100/s',
        },
        'email_reports.send': {
            'rate_limit': '1/s',
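            # Note: the soft limit is usually set below the hard limit;
            # with these values the hard limit (120 s) is reached first.
            'time_limit': 120,
            'soft_time_limit': 150,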
            'ignore_result': True,
        },
    }
    CELERY_TASK_PROTOCOL = 1
    CELERYBEAT_SCHEDULE = {
        'email_reports.schedule_hourly': {
            'task': 'email_reports.schedule_hourly',
            'schedule': crontab(minute='1', hour='*'),
        },
    }
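
To make the schedule above concrete: crontab(minute='1', hour='*') asks beat to dispatch the task once an hour, at one minute past the hour. The following is a minimal, standard-library-only sketch of that timing rule. The function name next_hourly_run is hypothetical, it is not part of Celery or Superset, and it ignores time zones (Celery evaluates crontab schedules in its configured time zone).

```python
from datetime import datetime, timedelta


def next_hourly_run(now: datetime, minute: int = 1) -> datetime:
    """Return the first time strictly after `now` that falls on `minute` past the hour.

    Hypothetical helper that mimics the timing of crontab(minute='1', hour='*');
    it does not use Celery and ignores time zones.
    """
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # This hour's slot has already passed, so move to the next hour.
        candidate += timedelta(hours=1)
    return candidate


if __name__ == "__main__":
    # A worker and beat started at 20:04 should first fire at about 21:01.
    print(next_hourly_run(datetime(2020, 4, 20, 20, 4)))  # 2020-04-20 21:01:00
```

If no report has arrived well after the time this predicts, the problem is more likely that beat is not running than that the schedule is wrong.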

Changes in 'docker-compose.yml':

  superset-worker:
    build: *superset-build
    command: ["celery", "worker", "--app=superset.tasks.celery_app:app", "-Ofair"]
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes

  superset-beat:
    build: *superset-build
    command: ["celery", "beat", "--app=superset.tasks.celery_app:app", "--pidfile=", "-f", "/app/celery_beat.log"]
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes
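
As for debugging, one option is to check whether the beat log exists and whether it records any dispatched tasks. Below is a rough sketch, assuming the log path from the compose file above ('/app/celery_beat.log') and assuming beat writes its usual INFO lines of the form "Scheduler: Sending due task <name> (<task>)". The exact wording may differ between Celery versions, and the function names due_tasks and check_beat_log are hypothetical.

```python
import re
from pathlib import Path
from typing import List

# Assumed wording of beat's INFO message; adjust if your Celery version differs.
DUE_TASK = re.compile(r"Scheduler: Sending due task (\S+)")


def due_tasks(log_text: str) -> List[str]:
    """Return the schedule entry names that beat reports as dispatched."""
    return DUE_TASK.findall(log_text)


def check_beat_log(path: str = "/app/celery_beat.log") -> str:
    """Summarise the state of the beat log file in one line."""
    log = Path(path)
    if not log.exists():
        return "no log file: beat probably never started"
    names = due_tasks(log.read_text(errors="replace"))
    if not names:
        return "beat started but has not dispatched any task yet"
    return "beat dispatched {} task(s); last: {}".format(len(names), names[-1])


if __name__ == "__main__":
    print(check_beat_log())
```

Run it inside the superset-beat container. Since the schedule fires once an hour, an empty result shortly after startup is expected.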
answered Oct 21 '22 at 11:10 by Snow