Running R scripts in Airflow?

Tags: r, airflow

Is it possible to run an R script as an Airflow DAG? I have looked online for documentation on this but have not been able to find any. Thanks.

alexsmith2 asked Aug 22 '17 19:08




2 Answers

There doesn't seem to be an R operator right now.

You could either write your own and contribute it to the community, or simply run your task with a BashOperator that calls Rscript.
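
A minimal sketch of the BashOperator approach, assuming Airflow 2.x import paths and that R (with `Rscript` on the `PATH`) is installed on the worker. The DAG name, task name, and script path are hypothetical placeholders, and `build_rscript_command` is a helper introduced here for illustration, not part of Airflow.

```python
import shlex
from datetime import datetime


def build_rscript_command(script_path, args=()):
    """Return a shell-quoted `Rscript` command line for a script and its arguments."""
    parts = ["Rscript", script_path, *[str(a) for a in args]]
    return " ".join(shlex.quote(p) for p in parts)


def create_dag():
    """Build a one-task DAG that shells out to Rscript on the worker."""
    # Imported lazily so the helper above is usable without Airflow installed.
    # Assumption: Airflow 2.x module layout.
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="run_r_script",            # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval=None,           # no schedule; trigger manually
        catchup=False,
    ) as dag:
        BashOperator(
            task_id="run_analysis",       # hypothetical task name
            # Hypothetical script path; {{ ds }} is Airflow's templated run date.
            bash_command=build_rscript_command(
                "/opt/scripts/analysis.R", ["--date", "{{ ds }}"]
            ),
        )
    return dag
```

In an actual DAG file you would likely end with a module-level `dag = create_dag()` so the scheduler can discover it. The task fails if `Rscript` exits with a non-zero status, so calling `quit(status = 1)` in the R script on error is one way to surface failures to Airflow.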

Pierre Sutter answered Oct 13 '22 05:10


Another option is to containerize your R script and run it using the DockerOperator, which is included in the standard distribution. This removes the need to have your worker nodes configured with the correct version of R and any needed R libraries.
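
A rough sketch of the DockerOperator approach, assuming Airflow 2.x with the `apache-airflow-providers-docker` package installed and a Docker daemon reachable from the worker. The image name and script path are hypothetical: the image is assumed to already contain R, the required packages, and the script at `/app/analysis.R`. `rscript_container_command` is an illustrative helper, not an Airflow function.

```python
from datetime import datetime


def rscript_container_command(script_path, args=()):
    """Return the argument list the container should execute."""
    return ["Rscript", script_path, *[str(a) for a in args]]


def create_dag():
    """Build a one-task DAG that runs an R script inside a container."""
    # Imported lazily so the helper above is usable without Airflow installed.
    # Assumption: Airflow 2.x with the Docker provider package.
    from airflow import DAG
    from airflow.providers.docker.operators.docker import DockerOperator

    with DAG(
        dag_id="run_r_script_in_docker",      # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval=None,               # no schedule; trigger manually
        catchup=False,
    ) as dag:
        DockerOperator(
            task_id="run_analysis",           # hypothetical task name
            image="my-registry/r-analysis:1.0",  # hypothetical image with R and the script baked in
            command=rscript_container_command(
                "/app/analysis.R", ["--date", "{{ ds }}"]
            ),
        )
    return dag
```

As with any DAG file, a module-level `dag = create_dag()` would make the DAG discoverable. Pinning the image tag keeps the R version and library versions fixed across runs, which is the main benefit described above.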

gcbenison answered Oct 13 '22 04:10