
Airflow latency between tasks

As you can see in the image (DAG latency between tasks), Airflow is taking too much time between task executions; it accounts for almost 30% of the DAG execution time. I've changed the airflow.cfg file to:

job_heartbeat_sec = 1 
scheduler_heartbeat_sec = 1

but I still have the same latency rate.

Why does it behave this way?

I.Chorfi asked Apr 18 '18 14:04


2 Answers

Thirty seconds is fairly high for inter-task latency. In well-tuned environments I've seen, ~4-6 seconds between a task and a dependent task has been a fairly reasonable lower bound, even for environments with many thousands of DAGs.
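To put a number on the latency being discussed, the gaps between consecutive tasks can be computed from their start and end timestamps. The following is only a minimal sketch: the function names and the sample timestamps are hypothetical, and it assumes a simple linear chain of tasks whose times you have already collected (for example from the UI or the logs).

```python
from datetime import datetime


def inter_task_gaps(runs):
    """Return (previous_task, next_task, gap_seconds) for consecutive tasks.

    `runs` is a list of (task_id, start, end) tuples in execution order.
    """
    gaps = []
    for (prev_id, _, prev_end), (next_id, next_start, _) in zip(runs, runs[1:]):
        gaps.append((prev_id, next_id, (next_start - prev_end).total_seconds()))
    return gaps


def latency_share(runs):
    """Fraction of the total run time spent waiting between tasks."""
    total = (runs[-1][2] - runs[0][1]).total_seconds()
    idle = sum(gap for _, _, gap in inter_task_gaps(runs))
    return idle / total if total else 0.0


# Hypothetical sample: three 10-second tasks separated by 5-second gaps.
EXAMPLE_RUNS = [
    ("a", datetime(2018, 4, 18, 12, 0, 0), datetime(2018, 4, 18, 12, 0, 10)),
    ("b", datetime(2018, 4, 18, 12, 0, 15), datetime(2018, 4, 18, 12, 0, 25)),
    ("c", datetime(2018, 4, 18, 12, 0, 30), datetime(2018, 4, 18, 12, 0, 40)),
]

if __name__ == "__main__":
    for prev_id, next_id, gap in inter_task_gaps(EXAMPLE_RUNS):
        print(f"{prev_id} -> {next_id}: {gap:.0f}s")
    print(f"share of run spent waiting: {latency_share(EXAMPLE_RUNS):.0%}")
```

Measuring the gaps per pair of tasks, rather than only the overall share, makes it easier to see whether the delay is uniform (which points at the scheduler) or concentrated on a few tasks (which points at something blocking them).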

As you've already stated, lowering the scheduler heartbeat interval (scheduler_heartbeat_sec) and increasing the number of threads the scheduler has (scheduler.max_threads) are the best ways to decrease scheduling delays. If your tasks are blocked on other conditions (which you can check in the logs; set core.logging_level = DEBUG for even more information), then you should resolve those first.
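Before tuning further, it can help to confirm which of these values are actually present in the configuration file. The sketch below uses Python's standard configparser; the helper name, the SETTINGS list, and the sample text are hypothetical, and the section and option names are simply the ones mentioned in this thread, so they may differ in other Airflow versions.

```python
import configparser

# (section, option) pairs mentioned in this thread (assumed names).
SETTINGS = [
    ("scheduler", "job_heartbeat_sec"),
    ("scheduler", "scheduler_heartbeat_sec"),
    ("scheduler", "max_threads"),
    ("core", "logging_level"),
]


def read_settings(cfg_text):
    """Return {"section.option": value or None} for the settings above."""
    # Interpolation is disabled so that '%' characters in other options
    # (for example log formats) are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return {
        f"{section}.{option}": parser.get(section, option, fallback=None)
        for section, option in SETTINGS
    }


# Hypothetical excerpt of an airflow.cfg file.
SAMPLE = """
[core]
logging_level = DEBUG

[scheduler]
job_heartbeat_sec = 1
scheduler_heartbeat_sec = 1
"""

if __name__ == "__main__":
    for name, value in read_settings(SAMPLE).items():
        print(f"{name} = {value if value is not None else '(not set in file)'}")
```

A value reported as not set only means the file does not define it; the scheduler would then use its default or an environment override.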

If you've adjusted both the scheduler heartbeat and the number of scheduler threads and you still see high scheduling delays, then you may need to consider using a more powerful machine.

hexacyanide answered Nov 10 '22 16:11


It is by design. For instance, I use Airflow to run large workflows in which some tasks can take a really long time. Airflow is not meant for tasks that take only seconds to execute; it can be used for that, of course, but it might not be the most suitable tool.

With that said, there is not much that you can do, since you have already found the key settings to configure.

Additionally, you might want to try increasing the number of scheduler threads:

   [scheduler]
   max_threads = 4

This can alternatively be done by setting the environment variable:

AIRFLOW__SCHEDULER__MAX_THREADS=4

However, do not count on the latency decreasing that much.
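The environment variable shown in this answer follows Airflow's AIRFLOW__{SECTION}__{KEY} naming pattern, so the same approach applies to the other settings discussed in this thread. A minimal sketch of that mapping, with hypothetical helper names; it writes into a plain dict here, because the variable has to be present in the environment of the scheduler process itself before it starts.

```python
import os


def airflow_env_name(section, key):
    """Build the environment variable name for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def export_setting(section, key, value, env=None):
    """Store the override in `env` (defaults to os.environ) and return it."""
    env = os.environ if env is None else env
    env[airflow_env_name(section, key)] = str(value)
    return env


if __name__ == "__main__":
    overrides = {}
    export_setting("scheduler", "max_threads", 4, env=overrides)
    export_setting("scheduler", "scheduler_heartbeat_sec", 1, env=overrides)
    for name, value in overrides.items():
        # Print shell-style export lines for the scheduler's environment.
        print(f"export {name}={value}")
```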

Hito answered Nov 10 '22 16:11