 

Airflow and data transfer between operators

Tags:

airflow

I'm new to Airflow and have a question about Airflow and its processors. When a processor produces an output, how is that output passed as input to the next processor? There is software called NiFi, which stores intermediate outputs in FlowFiles; AFAIK there is nothing like this in Airflow. So how does this happen?

Thanks in advance.

ozw1z5rd asked Mar 13 '17 11:03



2 Answers

Airflow uses XComs (short for "cross-communication") to pass data between operators.

If the flow is operator A -> operator B, then operator A must "push" a value to XCom, and operator B must "pull" that value from A if it wants to read it.

Any operator downstream of A has access to any value A pushed to XCom via:

value = context['task_instance'].xcom_pull(task_ids='operator_a', key='key_name') 

And operator A would push this value like this:

context['task_instance'].xcom_push(key='key_name', value=value)
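To make the push/pull pairing concrete, here is a minimal sketch of the two callables. It runs without Airflow installed: FakeTaskInstance is a hypothetical stand-in written only for this example (real XComs are persisted by Airflow itself), and the function names push_value and pull_value, as well as the pushed dictionary, are made up for illustration. Only the xcom_push and xcom_pull calls mirror the snippets above.

```python
# Hypothetical stand-in for Airflow's TaskInstance so the sketch runs on its own.
# It only mimics the two calls used above; it is not Airflow's implementation.
class FakeTaskInstance:
    _store = {}  # shared by all instances: (task_id, key) -> value

    def __init__(self, task_id):
        self.task_id = task_id

    def xcom_push(self, key, value):
        FakeTaskInstance._store[(self.task_id, key)] = value

    def xcom_pull(self, task_ids, key):
        return FakeTaskInstance._store.get((task_ids, key))


def push_value(**context):
    """Callable for operator A: publish a value under a named key."""
    context['task_instance'].xcom_push(key='key_name', value={'rows': 42})


def pull_value(**context):
    """Callable for operator B: read the value that operator_a pushed."""
    return context['task_instance'].xcom_pull(task_ids='operator_a', key='key_name')


# Simulate running A first, then B, each with its own task instance.
push_value(task_instance=FakeTaskInstance('operator_a'))
result = pull_value(task_instance=FakeTaskInstance('operator_b'))
print(result)  # {'rows': 42}
```

In a real DAG these functions would be passed as python_callable to two PythonOperator tasks with task ids operator_a and operator_b, and Airflow would supply the context (in Airflow 1.x this requires provide_context=True).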
jhnclvr answered Sep 23 '22 23:09



Maybe you are referring to the GenericTransfer operator, which helps move data between data sources?

https://github.com/apache/incubator-airflow/blob/master/airflow/operators/generic_transfer.py

Breathe answered Sep 24 '22 23:09
