Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How >> operator defines task dependencies in Airflow?

I was going through Apache Airflow tutorial https://github.com/hgrif/airflow-tutorial and encountered this section for defining task dependencies.

with DAG('airflow_tutorial_v01',      default_args=default_args,      schedule_interval='0 * * * *',      ) as dag:  print_hello = BashOperator(task_id='print_hello',                            bash_command='echo "hello"') sleep = BashOperator(task_id='sleep',                      bash_command='sleep 5') print_world = PythonOperator(task_id='print_world',                              python_callable=print_world)   print_hello >> sleep >> print_world 

The line that confuses me is

print_hello >> sleep >> print_world 

What does >> mean in Python? I know bitwise operator, but can't relate to the code here.

like image 412
idazuwaika Avatar asked Sep 18 '18 14:09

idazuwaika


People also ask

What does >> mean in Airflow?

The ">>" is Airflow syntax for setting a task downstream of another. Diving into the incubator-airflow project repo, models.py in the airflow directory defines the behavior of much of the high level abstractions of Airflow.

What is depends on past in Airflow?

According to the official Airflow docs, The task instances directly upstream from the task need to be in a success state. Also, if you have set depends_on_past=True, the previous task instance needs to have succeeded (except if it is the first run for that task).

What are the operators used in Airflow?

Some common operators available in Airflow are: BashOperator – used to execute bash commands on the machine it runs on. PythonOperator – takes any python function as an input and calls the same (this means the function should have a specific signature as well) EmailOperator – sends emails using SMTP server configured.


1 Answers

Airflow represents workflows as directed acyclic graphs. A workflow is any number of tasks that have to be executed, either in parallel or sequentially. The ">>" is Airflow syntax for setting a task downstream of another.

Diving into the incubator-airflow project repo, models.py in the airflow directory defines the behavior of much of the high level abstractions of Airflow. You can dig into the other classes if you'd like there, but the one that answers your question is the BaseOperator class. All operators in Airflow inherit from the BaseOperator. The __rshift__ method of the BaseOperator class implements the Python right shift logical operator in the context of setting a task or a DAG downstream of another.

See implementation here.

like image 89
rob Avatar answered Sep 16 '22 19:09

rob