In Airflow, I have been using "airflow run" and "airflow test" but don't understand fully how they are different. What are their differences?

Reading through the docs myself, I see how it can be confusing. Airflow Run will run a task instance as if you had triggered it directly through the UI. Perhaps most importantly the state will be recorded in the database and that state will be reflected in the UI as if the task had run under automatic circumstances Airflow Test will skip any dependency (task, concurrency, pool etc) checks that may otherwise occur through an automatic run and run the task without updating the database. This means that you can "test" a task multiple times and it will execute, but the state in the database will not reflect runs triggered through the test command.

Before you run <code>airflow</code> command, source the virtual env and set AIRFLOW_HOME. <code>airflow run</code> is equivalent of running the task from UI. This means task run is recorded by the instance. The status is reflected in UI. It writes the log to the Log folder per the airflow configuration. Leaves an audit trail in DB. <code>airflow test</code> lets you execute the task without any traces in the metadata DB. It does not record the status of this task instance in DB, so it will not reflect the task's status in UI. Mostly this method is used if you want to test a task multiple times and don't want to keep an audit trail in DB.

Difference between "airflow run" and "airflow test" in Airflow

2 Answers

Reading through the docs myself, I see how it can be confusing.

Airflow Run will run a task instance as if you had triggered it directly through the UI. Perhaps most importantly the state will be recorded in the database and that state will be reflected in the UI as if the task had run under automatic circumstances

Airflow Test will skip any dependency (task, concurrency, pool etc) checks that may otherwise occur through an automatic run and run the task without updating the database. This means that you can "test" a task multiple times and it will execute, but the state in the database will not reflect runs triggered through the test command.

114

answered Oct 31 '22 16:10

andscoop

Before you run airflow command, source the virtual env and set AIRFLOW_HOME. airflow run is equivalent of running the task from UI. This means task run is recorded by the instance. The status is reflected in UI. It writes the log to the Log folder per the airflow configuration. Leaves an audit trail in DB.

airflow test lets you execute the task without any traces in the metadata DB. It does not record the status of this task instance in DB, so it will not reflect the task's status in UI. Mostly this method is used if you want to test a task multiple times and don't want to keep an audit trail in DB.

answered Oct 31 '22 15:10

Ramachandran Govindan

Related questions
                            
                                Implementing Postgres Sql in Apache Airflow
                            
                                What does the colour of the 'schedule' column in the airflow UI mean?
                            
                                Apache airflow: setting catchup to False is not working
                            
                                Run parallel tasks in Apache Airflow
                            
                                How to manage Python dependencies in Airflow?
                            
                                Minimum hardware requirements for Apache Airflow cluster
                            
                                airflow TriggerDagRunOperator how to change the execution date
                            
                                How to set up multiple Dag directories in airflow
                            
                                How to trigger daily DAG run at midnight local time instead of midnight UTC time
                            
                                Airflow: Why is there a start_date for operators?
                            
                                Broken DAG: (...) No module named docker
                            
                                Pip error even Microsoft Visual C++ 14.0 is installed
                            
                                Unable to execute spark job using SparkSubmitOperator
                            
                                Apache Airflow: Delay a task for some period of time
                            
                                How to skip task in Airflow operator?
                            
                                BashOperator doen't run bash file apache airflow
                            
                                Airflow admin UI shows example dags
                            
                                Apache Airflow : airflow initdb results in "ImportError: No module named json"
                            
                                apache airflow: initdb vs resetdb
                            
                                In airflow, is there a good way to call another dag's task?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Difference between "airflow run" and "airflow test" in Airflow

Tags:

airflow

kee

People also ask

2 Answers

andscoop

Ramachandran Govindan

Recent Activity

Donate For Us