Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Difference between "airflow run" and "airflow test" in Airflow

Tags:

airflow

In Airflow, I have been using "airflow run" and "airflow test" but don't understand fully how they are different. What are their differences?

like image 995
kee Avatar asked Aug 03 '18 23:08

kee


People also ask

What is an Airflow test?

Airflow testing is the measurement of the movement of air through, into, or out of a component or product, or the measurement of the performance characteristics of an air moving device, such as a fan or blower.

What is Max active runs in Airflow?

There are three primary DAG-level Airflow settings that users can define in code: max_active_runs : This is the maximum number of active DAG runs allowed for the DAG in question. Once this limit is hit, the Scheduler will not create new active DAG runs.

What is DAG and task in Airflow?

In Airflow, a DAG – or a Directed Acyclic Graph – is a collection of all the tasks you want to run, organized in a way that reflects their relationships and dependencies. A DAG is defined in a Python script, which represents the DAGs structure (tasks and their dependencies) as code.


2 Answers

Reading through the docs myself, I see how it can be confusing.

Airflow Run will run a task instance as if you had triggered it directly through the UI. Perhaps most importantly the state will be recorded in the database and that state will be reflected in the UI as if the task had run under automatic circumstances

Airflow Test will skip any dependency (task, concurrency, pool etc) checks that may otherwise occur through an automatic run and run the task without updating the database. This means that you can "test" a task multiple times and it will execute, but the state in the database will not reflect runs triggered through the test command.

like image 114
andscoop Avatar answered Oct 31 '22 16:10

andscoop


Before you run airflow command, source the virtual env and set AIRFLOW_HOME. airflow run is equivalent of running the task from UI. This means task run is recorded by the instance. The status is reflected in UI. It writes the log to the Log folder per the airflow configuration. Leaves an audit trail in DB.

airflow test lets you execute the task without any traces in the metadata DB. It does not record the status of this task instance in DB, so it will not reflect the task's status in UI. Mostly this method is used if you want to test a task multiple times and don't want to keep an audit trail in DB.

like image 20
Ramachandran Govindan Avatar answered Oct 31 '22 15:10

Ramachandran Govindan