We are using Airflow v1.9.0. We have 100+ DAGs, the instance is really slow, and the scheduler only launches some of the tasks. To reduce CPU usage, we want to tweak two configuration parameters: min_file_process_interval and dag_dir_list_interval. The documentation is not really clear about the difference between the two.
When creating a new DAG, you probably want to set a global start_date for your tasks. You can do this by declaring start_date directly in the DAG() object. The first DagRun will be based on the min(start_date) across all your tasks.
Note that Airflow starts running tasks for a given interval only at the end of that interval. So with a start_date of 2022-01-01 and a daily schedule, the first run will not start until after 11:59 pm on 2022-01-01, i.e. at midnight on the following day (2022-01-02).
According to the official Airflow docs, the task instances directly upstream of a task need to be in a success state before it can run. Additionally, if you have set depends_on_past=True, the previous task instance must also have succeeded (except on the first run for that task).
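The interval-end semantics above can be checked with a small stdlib sketch (the start date and daily schedule below are illustrative values, not taken from any real DAG):

```python
from datetime import datetime, timedelta

# Airflow triggers a run for an interval only once that interval has ended:
# actual run time = execution_date (interval start) + schedule_interval.
start_date = datetime(2022, 1, 1)       # illustrative start_date
schedule_interval = timedelta(days=1)   # equivalent to @daily

first_execution_date = start_date  # covers the interval 2022-01-01 .. 2022-01-02
first_actual_run = first_execution_date + schedule_interval

print(first_execution_date)  # 2022-01-01 00:00:00
print(first_actual_run)      # 2022-01-02 00:00:00
```

This is why a DAG with start_date=2022-01-01 appears to do nothing on its start date: its first run is scheduled for the moment the first interval closes.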
min_file_process_interval:
When there are only a small number of DAG definition files, the scheduler loop could process each file many times per minute. To limit the rate of DAG file processing, min_file_process_interval can be set to a higher value. This parameter ensures that a DAG definition file is not processed more often than once every min_file_process_interval seconds.
dag_dir_list_interval:
Since the scheduler can run indefinitely, it needs to periodically refresh the list of files in the DAG definition directory. That refresh interval is controlled by the dag_dir_list_interval configuration parameter.
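Putting the two together: both parameters live in the [scheduler] section of airflow.cfg. A cautious first step on a busy instance might look like the fragment below (the values are illustrative starting points, not recommendations from the docs):

```ini
[scheduler]
# Re-parse each DAG definition file at most once every 120 seconds,
# reducing CPU spent repeatedly parsing 100+ DAG files.
min_file_process_interval = 120

# Re-scan the dags folder for new or removed files every 5 minutes;
# new DAG files may take up to this long to be picked up.
dag_dir_list_interval = 300
```

The trade-off is latency: raising min_file_process_interval delays how quickly edits to an existing DAG file take effect, while raising dag_dir_list_interval delays how quickly brand-new DAG files are discovered.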
Source: a Google search on both terms leads to this as the first result: https://cwiki.apache.org/confluence/display/AIRFLOW/Scheduler+Basics