In Airflow, I have been using "airflow run" and "airflow test" but don't understand fully how they are different. What are their differences?
There are three primary DAG-level Airflow settings that users can define in code: max_active_runs : This is the maximum number of active DAG runs allowed for the DAG in question. Once this limit is hit, the Scheduler will not create new active DAG runs.
In Airflow, a DAG – or a Directed Acyclic Graph – is a collection of all the tasks you want to run, organized in a way that reflects their relationships and dependencies. A DAG is defined in a Python script, which represents the DAG's structure (tasks and their dependencies) as code.
Reading through the docs myself, I see how it can be confusing.
"airflow run" will run a task instance as if you had triggered it directly through the UI. Most importantly, the state will be recorded in the database and reflected in the UI as if the task had run under automatic circumstances.
"airflow test" will skip any dependency checks (task, concurrency, pool, etc.) that would otherwise occur during an automatic run, and it runs the task without updating the database. This means that you can "test" a task multiple times and it will execute each time, but the state in the database will not reflect runs triggered through the test command.
Before you run any airflow command, activate the virtual env and set AIRFLOW_HOME.
"airflow run" is equivalent to running the task from the UI. This means the task run is recorded and its status is reflected in the UI. It writes the log to the log folder set in the Airflow configuration and leaves an audit trail in the DB.
"airflow test" lets you execute the task without leaving any trace in the metadata DB. It does not record the status of the task instance in the DB, so the task's status will not show up in the UI. This is mostly used when you want to test a task multiple times and don't want to keep an audit trail in the DB.
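To make the two invocations concrete, here is a minimal sketch in Python that only builds the command lines and the environment; it does not run anything. It assumes the Airflow 1.x CLI layout, where both commands take dag_id, task_id and execution_date as positional arguments (in Airflow 2.x the equivalents are "airflow tasks run" and "airflow tasks test"). The DAG id, task id, date and AIRFLOW_HOME path are hypothetical placeholders, and the helper names are made up for this example.

```python
import os
import shlex


def build_task_command(subcommand, dag_id, task_id, execution_date):
    """Return the argv list for `airflow run` or `airflow test` (Airflow 1.x layout)."""
    if subcommand not in ("run", "test"):
        raise ValueError("subcommand must be 'run' or 'test'")
    return ["airflow", subcommand, dag_id, task_id, execution_date]


def build_env(airflow_home):
    """Copy the current environment and point AIRFLOW_HOME at the given directory."""
    env = dict(os.environ)
    env["AIRFLOW_HOME"] = airflow_home
    return env


# Hypothetical identifiers, used only for illustration.
DAG_ID = "example_dag"
TASK_ID = "example_task"
EXECUTION_DATE = "2020-01-01"

# Same arguments for both; only the subcommand differs.
run_cmd = build_task_command("run", DAG_ID, TASK_ID, EXECUTION_DATE)    # state recorded in the DB
test_cmd = build_task_command("test", DAG_ID, TASK_ID, EXECUTION_DATE)  # no state recorded
env = build_env("/path/to/airflow_home")  # placeholder path

print(" ".join(shlex.quote(arg) for arg in run_cmd))
print(" ".join(shlex.quote(arg) for arg in test_cmd))

# To actually execute one of them (requires Airflow installed and the
# virtual env active), something like this could be used:
# import subprocess
# subprocess.run(test_cmd, env=env, check=True)
```

Since the arguments are identical, switching between a recorded run and a throwaway test is just a matter of changing the subcommand.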