Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Running `airflow scheduler` launches 33 scheduler processes

Tags:

python

airflow

When using LocalExecutor with a MySQL backend, running airflow scheduler on my Centos 6 box creates 33 scheduler processes, e.g. deploy 55362 13.5 1.8 574224 73272 ? Sl 18:59 7:42 /usr/local/bin/python2.7 /usr/local/bin/airflow scheduler deploy 55372 0.0 1.5 567928 60552 ? Sl 18:59 0:00 /usr/local/bin/python2.7 /usr/local/bin/airflow scheduler deploy 55373 0.0 1.5 567928 60540 ? Sl 18:59 0:00 /usr/local/bin/python2.7 /usr/local/bin/airflow scheduler ... These are distinct from Executor processes and gunicorn master and worker processes. Running it with the SequentialExecutor (sqlite backend) just kicks off one scheduler process.
Airflow still works (DAGs are getting run), but the sheer number of these processes makes me think something is wrong.
When I run select * from job where state = 'running'; in the database, only 5 SchedulerJob rows get returned. Is this normal?

like image 518
Collin Meyers Avatar asked Mar 10 '17 23:03

Collin Meyers


1 Answers

Yes this is normal. These are scheduler processes. You can control this using below parameter in airflow.cfg

# The amount of parallelism as a setting to the executor. This defines
# the max number of task instances that should run simultaneously
# on this airflow installation
parallelism = 32

These are spawned from scheduler whose pid can be found in airflow-scheduler.pid file

so 32+1=33 processes that you are seeing.

Hope this clears out your doubt.

Cheers!

like image 79
Priyank Mehta Avatar answered Sep 20 '22 12:09

Priyank Mehta