Has anyone reported how much they've been able to get Airflow to scale at their company? I'm looking at implementing Airflow to execute 5,000+ tasks that will each run hourly, and someday scale that up to 20,000+ tasks. Looking at the scheduler, it seems like it might be a bottleneck since only one instance of it can run, and I'm concerned that with that many tasks the scheduler will struggle to keep up. Should I be?
Overview. One of Apache Airflow's biggest strengths is its ability to scale with good supporting infrastructure. To make the most of Airflow, there are a few key settings that you should consider modifying as you scale up your data pipelines.
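The settings that matter most are the global and per-DAG concurrency limits. As a rough illustration (the key names below are the Airflow 2.x ones; older releases use, for example, core.dag_concurrency instead of core.max_active_tasks_per_dag), you can print the values your deployment is currently running with:

```python
# Print the concurrency settings that most affect how far a single Airflow
# deployment can scale. Key names are the Airflow 2.x ones; older releases
# use e.g. core.dag_concurrency instead of core.max_active_tasks_per_dag.
from airflow.configuration import conf

SETTINGS = [
    ("core", "parallelism"),               # max task instances running concurrently
    ("core", "max_active_tasks_per_dag"),  # concurrent tasks allowed within one DAG
    ("core", "max_active_runs_per_dag"),   # concurrent runs allowed per DAG
    ("scheduler", "parsing_processes"),    # processes used to parse DAG files
    ("celery", "worker_concurrency"),      # task slots per Celery worker
]

for section, key in SETTINGS:
    print(f"{section}.{key} = {conf.get(section, key, fallback='<not set>')}")
```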
Apache Airflow's ability to run tasks in parallel, provided by executors such as the CeleryExecutor or the KubernetesExecutor, is what saves you time at this scale: with enough workers it can execute even 1,000 parallel tasks in a matter of minutes.
Airflow ships with several executors, but the most widely used for scaling out is the CeleryExecutor, which distributes the workload across multiple Celery workers that can run on different machines. The scheduler hands tasks to those workers through a message broker (typically Redis or RabbitMQ).
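One common way to spread load across worker pools with the CeleryExecutor is to route tasks to named queues and start dedicated workers for each queue. The DAG below is a minimal sketch (Airflow 2.x imports; the task ids, queue names, and commands are made up), not a drop-in recommendation:

```python
# Minimal sketch (Airflow 2.x imports; task ids, queue names, and commands are
# made up): route tasks to named Celery queues so separate worker pools can
# pick them up, e.g. `airflow celery worker -q heavy` on the big machines.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="queue_routing_example",
    schedule="@hourly",          # Airflow 2.4+; older versions use schedule_interval
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    light_etl = BashOperator(
        task_id="light_etl",
        bash_command="echo 'quick job'",
        queue="default",         # served by workers started with -q default
    )
    heavy_job = BashOperator(
        task_id="heavy_job",
        bash_command="echo 'long-running job'",
        queue="heavy",           # served by workers started with -q heavy
    )

    light_etl >> heavy_job
```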
Airflow is a platform that lets you build and run workflows. A workflow is represented as a DAG (a Directed Acyclic Graph), and contains individual pieces of work called Tasks, arranged with dependencies and data flows taken into account.
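As a minimal sketch of what that looks like in Python (Airflow 2.x TaskFlow API; the DAG and task names are illustrative), the dependencies and data flow come directly from how the tasks are wired together:

```python
# Minimal sketch of a DAG using the Airflow 2.x TaskFlow API (names are
# illustrative). Calling load(extract()) wires both the dependency and the
# data flow between the two tasks.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@hourly", start_date=datetime(2024, 1, 1), catchup=False)
def example_workflow():
    @task
    def extract():
        return [1, 2, 3]

    @task
    def load(rows):
        print(f"loaded {len(rows)} rows")

    load(extract())


example_workflow()
```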
We run thousands of tasks a day at my company and have been using Airflow for the better part of two years. These DAGs run every 15 minutes and are generated from config files that can change at any time (fed in from a UI).
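To give a rough idea of that pattern (the directory, config shape, and operators below are made up for illustration, not our actual setup), a config-driven DAG generator looks roughly like this:

```python
# Rough illustration of config-driven DAG generation (the directory, config
# shape, and operators here are made up, not our actual setup). Airflow picks
# up any DAG object it finds in a module's global namespace.
import json
from datetime import datetime
from pathlib import Path

from airflow import DAG
from airflow.operators.bash import BashOperator

CONFIG_DIR = Path("/opt/airflow/dag_configs")  # hypothetical location

for config_file in CONFIG_DIR.glob("*.json"):
    cfg = json.loads(config_file.read_text())

    dag = DAG(
        dag_id=cfg["dag_id"],
        schedule="*/15 * * * *",   # every 15 minutes (Airflow 2.4+ parameter name)
        start_date=datetime(2024, 1, 1),
        catchup=False,
    )

    previous = None
    for step in cfg["steps"]:      # e.g. [{"name": "extract", "command": "..."}]
        op = BashOperator(task_id=step["name"], bash_command=step["command"], dag=dag)
        if previous is not None:
            previous >> op
        previous = op

    # Expose the DAG under a unique global name so the DagBag discovers it.
    globals()[cfg["dag_id"]] = dag
```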
The short answer - yes, it can definitely scale to that, depending on your infrastructure. Some of the new 1.10 features should make this easier than it is on the 1.8 version we run for all of those tasks. We ran this on a large Mesos/DC/OS cluster, and it took a good deal of fine-tuning to get to a stable point.
The long answer - although it can scale to that, we've found that a better solution is multiple Airflow instances with different configurations (scheduler settings, number of workers, etc.) optimized for the types of DAGs they are running. A set of DAGs that run long-running machine learning jobs should be hosted on a different Airflow instance from the ones running five-minute ETL jobs. This also makes it easier for different teams to maintain the jobs they are responsible for, and to iterate on any fine-tuning that's needed.