
Airflow 1.9.0 - Long Delay Between Task Execution

I recently upgraded from v1.7.1.2 to v1.9.0 and after the upgrade I noticed that the CPU usage increased significantly. After doing some digging, I tracked it down to these two scheduler config options: min_file_process_interval (defaults to 0) and max_threads (defaults to 2).

As expected, increasing min_file_process_interval avoids the tight loop and drops CPU usage when the scheduler goes idle. What I don't understand is why min_file_process_interval affects task execution.

If I set min_file_process_interval to 60s, the scheduler now waits at least 60s before executing each task in my DAG, so a DAG with 4 sequential tasks takes 4 minutes longer to run. For example:

start -> [task1] -> [task2] -> [task3] -> [task4]
        ^          ^          ^          ^
        60s        60s        60s        60s
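To put numbers on the diagram, here is a toy model of the delay in Python. It is only a sketch of the behaviour I am observing, not Airflow's scheduler code: it assumes a task can only start on the next file-processing pass after it becomes ready, and that passes happen every `interval` seconds. The function name and parameters are made up for illustration.

```python
import math


def simulate_sequential_dag(task_durations, interval):
    """Toy model of a sequential DAG under a fixed file-processing interval.

    Assumption (not taken from Airflow's source): a ready task is only
    picked up on the next processing pass strictly after it becomes ready,
    and passes happen at multiples of `interval` seconds.

    Returns (start_times, finish_time) in seconds from the DAG run's start.
    """
    now = 0.0
    start_times = []
    for duration in task_durations:
        if interval > 0:
            # Round up to the next pass strictly after the current time.
            start = (math.floor(now / interval) + 1) * interval
        else:
            # No interval: the task starts as soon as it is ready.
            start = now
        start_times.append(start)
        now = start + duration
    return start_times, now


if __name__ == "__main__":
    starts, finish = simulate_sequential_dag([0, 0, 0, 0], interval=60)
    print(starts, finish)
```

With four tasks of negligible duration and a 60s interval, the model gives start times of 60, 120, 180 and 240 seconds, which matches the 4 extra minutes described above.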

I have Airflow set up in my test env and prod env. This is less of an issue in my prod env (although still concerning), but a big issue for my test env. After the upgrade the CPU usage is significantly higher, so either I accept the higher CPU usage or I try to decrease it with a higher config value. However, the higher value adds significant time to my test DAGs' execution time.

Why does min_file_process_interval affect time between tasks after the DAG has been scheduled? Are there other config options that could solve my issue?

flutikoff asked Apr 25 '18 01:04



2 Answers

The most likely cause is that there are too many Python files in the DAGs folder, so the Airflow scheduler spends too much time scanning them and instantiating the DAGs.

First, reduce the number of DAG files that the scheduler and workers have to load. Then set the SCHEDULER_HEARTBEAT_SEC and MAX_THREADS values as large as is practical.
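To check how many files are involved, a quick sketch like the following can count the Python files under a DAGs folder. The function name is made up, and the path you pass in is whatever your own dags_folder setting points to. It counts every .py file, so treat the result as a rough upper bound on what the scheduler may have to look at.

```python
from pathlib import Path


def count_python_files(dags_folder):
    """Count all .py files under dags_folder, including subdirectories.

    This is a plain file count, not a reimplementation of how Airflow
    decides which files contain DAGs.
    """
    root = Path(dags_folder)
    return sum(1 for path in root.rglob("*.py") if path.is_file())


if __name__ == "__main__":
    import sys

    # Usage: python count_dags.py /path/to/your/dags
    print(count_python_files(sys.argv[1]))
```

If the number is much larger than the number of DAGs you actually run, moving helper modules and unused DAG files out of the folder is probably the first thing to try.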

user7016813 answered Oct 21 '22 05:10



Another option you might want to look into is

SCHEDULER_HEARTBEAT_SEC

This setting is usually also set to a very tight interval but could be loosened up a bit. This setting in combination with

MAX_THREADS

did the trick for us. The dev machines are still fast enough for re-deployment, but no longer run with a hot, glowing CPU, which is good.
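For reference, here is a small Python sketch of how these settings could be passed as environment variables, using Airflow's AIRFLOW__{SECTION}__{KEY} naming convention. The helper names are made up, and the values are placeholders rather than recommendations: neither answer states concrete numbers, so tune them for your own machines.

```python
def airflow_env_var(section, key):
    """Build the environment variable name for an Airflow config option."""
    return "AIRFLOW__{}__{}".format(section.upper(), key.upper())


def build_overrides(values, section="scheduler"):
    """Map option names to environment variable names with string values."""
    return {airflow_env_var(section, key): str(value)
            for key, value in values.items()}


# Placeholder values for illustration only (assumptions, not tested advice).
scheduler_overrides = {
    "scheduler_heartbeat_sec": 30,
    "max_threads": 4,
    "min_file_process_interval": 10,
}

if __name__ == "__main__":
    # Print shell export lines that could be sourced before starting the scheduler.
    for name, value in sorted(build_overrides(scheduler_overrides).items()):
        print("export {}={}".format(name, value))
```

The same three options can also be set directly in the [scheduler] section of airflow.cfg; environment variables are just convenient when the test and prod environments need different values.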

tobi6 answered Oct 21 '22 07:10
