I am running Superset via Docker. I enabled the Email Reports feature and tried it out.
However, I only receive the test email report; no scheduled emails arrive afterwards.
This is my CeleryConfig in superset_config.py:
    from celery.schedules import crontab  # needed for CELERYBEAT_SCHEDULE below

    class CeleryConfig(object):
        BROKER_URL = 'sqla+postgresql://superset:superset@db:5432/superset'
        CELERY_IMPORTS = (
            'superset.sql_lab',
            'superset.tasks',
        )
        CELERY_RESULT_BACKEND = 'db+postgresql://superset:superset@db:5432/superset'
        CELERYD_LOG_LEVEL = 'DEBUG'
        CELERYD_PREFETCH_MULTIPLIER = 10
        CELERY_ACKS_LATE = True
        CELERY_ANNOTATIONS = {
            'sql_lab.get_sql_results': {
                'rate_limit': '100/s',
            },
            'email_reports.send': {
                'rate_limit': '1/s',
                'time_limit': 120,
                'soft_time_limit': 150,
                'ignore_result': True,
            },
        }
        CELERYBEAT_SCHEDULE = {
            'email_reports.schedule_hourly': {
                'task': 'email_reports.schedule_hourly',
                'schedule': crontab(minute=1, hour='*'),
            },
        }
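For reference, crontab(minute=1, hour='*') fires once per hour, at minute 1, so nothing is sent immediately after the test email even when beat is running. The following is a minimal stdlib sketch of that timing, not Celery's own implementation; next_hourly_run is a hypothetical helper introduced only for illustration, and it ignores timezone handling.

```python
from datetime import datetime, timedelta


def next_hourly_run(now: datetime, minute: int = 1) -> datetime:
    """Return the next time strictly after `now` whose minute field equals `minute`."""
    # Try the matching minute within the current hour first.
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    # If that moment has already passed (or is exactly now), move to the next hour.
    if candidate <= now:
        candidate += timedelta(hours=1)
    return candidate


if __name__ == "__main__":
    # Hypothetical "current time" of 09:30; the next run falls in the following hour.
    print(next_hourly_run(datetime(2020, 5, 1, 9, 30)))
```

In other words, after starting beat it can take up to an hour before the first scheduled run is triggered.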
The documentation says I need to run the Celery worker and beat:
    celery worker --app=superset.tasks.celery_app:app --pool=prefork -O fair -c 4
    celery beat --app=superset.tasks.celery_app:app
I added them to 'docker-compose.yml':
  superset-worker:
    build: *superset-build
    command: >
      sh -c "celery worker --app=superset.tasks.celery_app:app -Ofair -f /app/celery_worker.log &&
      celery beat --app=superset.tasks.celery_app:app -f /app/celery_beat.log"
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes
The Celery worker does work: it sends the first (test) email, and its log file is created. Celery beat, however, does not seem to be running, and no 'celery_beat.log' is ever created.
If you'd like a deeper insight, here's the commit with the full implementation of the functionality.
How do I correctly configure celery beat? How can I debug this?
Async Queries via Celery: a Celery broker (message queue), for which we recommend using Redis or RabbitMQ, and a results backend that defines where the worker will persist the query results.
If you are running superset behind a load balancer or reverse proxy (e.g. NGINX or ELB on AWS), you may need to utilize a healthcheck endpoint so that your load balancer knows if your superset instance is running. This is provided at /health which will return a 200 response containing “OK” if the webserver is running.
Superset looks in the path for a file called superset_config.py. You can also point directly to the file, even if it is not in the path, by setting the environment variable SUPERSET_CONFIG_PATH=/your/path/to/superset_config.py. In the sources there is a file called config.py that contains all settings.
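I managed to solve it by altering the CeleryConfig implementation and adding a separate beat service to 'docker-compose.yml'. In the original command the worker and beat were chained with &&, so beat would only start once the worker exited, which a long-running worker never does.
New CeleryConfig class in 'superset_config.py':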
    from celery.schedules import crontab  # needed for CELERYBEAT_SCHEDULE below

    # get_env_variable is assumed to be defined earlier in superset_config.py
    REDIS_HOST = get_env_variable("REDIS_HOST")
    REDIS_PORT = get_env_variable("REDIS_PORT")

    class CeleryConfig(object):
        BROKER_URL = "redis://%s:%s/0" % (REDIS_HOST, REDIS_PORT)
        CELERY_IMPORTS = (
            'superset.sql_lab',
            'superset.tasks',
        )
        CELERY_RESULT_BACKEND = "redis://%s:%s/1" % (REDIS_HOST, REDIS_PORT)
        CELERY_ANNOTATIONS = {
            'sql_lab.get_sql_results': {
                'rate_limit': '100/s',
            },
            'email_reports.send': {
                'rate_limit': '1/s',
                # Note: a soft limit above the hard time_limit cannot fire before the task is killed
                'time_limit': 120,
                'soft_time_limit': 150,
                'ignore_result': True,
            },
        }
        CELERY_TASK_PROTOCOL = 1
        CELERYBEAT_SCHEDULE = {
            'email_reports.schedule_hourly': {
                'task': 'email_reports.schedule_hourly',
                'schedule': crontab(minute='1', hour='*'),
            },
        }
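The config above relies on a get_env_variable helper that is not shown here. A minimal sketch of what such a helper might look like, and of how the two Redis URLs are assembled from it, follows. Both the helper body and the redis_url function are assumptions made for illustration, not the code from the actual config file.

```python
import os


def get_env_variable(var_name, default=None):
    """Return an environment variable, or `default`; fail loudly if neither exists."""
    try:
        return os.environ[var_name]
    except KeyError:
        if default is not None:
            return default
        raise EnvironmentError(
            "The environment variable {} was not set.".format(var_name)
        )


def redis_url(db_index):
    """Build a Redis URL for the given database index (hypothetical helper)."""
    host = get_env_variable("REDIS_HOST")
    port = get_env_variable("REDIS_PORT")
    return "redis://%s:%s/%d" % (host, port, db_index)


# Usage, assuming REDIS_HOST and REDIS_PORT are set in docker/.env:
#   redis_url(0)  -> broker URL (database 0)
#   redis_url(1)  -> result backend URL (database 1)
```

Failing on a missing variable at import time makes a misconfigured docker/.env show up as soon as the container starts, rather than as a silent connection problem later.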
Changes in 'docker-compose.yml':
  superset-worker:
    build: *superset-build
    command: ["celery", "worker", "--app=superset.tasks.celery_app:app", "-Ofair"]
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes
  superset-beat:
    build: *superset-build
    command: ["celery", "beat", "--app=superset.tasks.celery_app:app", "--pidfile=", "-f", "/app/celery_beat.log"]
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes
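If scheduled emails still do not arrive, one thing worth ruling out is whether the worker and beat containers can reach the broker at all. The sketch below is a plain TCP reachability check using only the standard library; can_connect is a hypothetical helper, and the host and port you pass in are whatever REDIS_HOST and REDIS_PORT are set to in your environment.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the hostname and opens the socket in one step.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Usage from inside the beat or worker container, with a hypothetical hostname:
#   can_connect("redis", 6379)
```

This only shows that the port accepts connections; it does not verify that Celery can actually publish or consume tasks.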