I'm new to airflow and have a question about Airflow and its processors. When a processor produces an output, how this output is moved in input to the next processor ? There is a software called nifi which stores intermediate outputs into flowfiles, afaik there is nothing like this in airflow. So how does this happen?
Thanks in advance.
Airflow uses Xcoms to pass data between operators.
If the flow is operator A -> operator B, then operator A must "push" a value to xcom, and operator B must "pull" this value from A if it wants to read it.
Any operators downstream from A have access to any values A pushed to Xcom via:
value = context['task_instance'].xcom_pull(task_ids='operator_a', key='key_name')
And operator A would push this value like this:
context['task_instance'].xcom_push(key_name,value,context['execution_date'])
Maybe you are referring to the GenericTransfer operator, that helps moving data between data soruces?
https://github.com/apache/incubator-airflow/blob/master/airflow/operators/generic_transfer.py
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With