Is it possible to run an R script as an Airflow DAG? I have looked online for documentation on this but haven't been able to find any. Thanks
As noted in the answer below, there are different ways to run R scripts in Airflow; perhaps the best is to containerize your R script and run it with the DockerOperator.
What is Airflow used for? Apache Airflow is used for the scheduling and orchestration of data pipelines or workflows. Orchestration of data pipelines refers to the sequencing, coordination, scheduling, and management of complex data pipelines from diverse sources.
Airflow is not a data streaming solution. Tasks do not move data from one to another (though they can exchange small amounts of metadata). Airflow is not in the Spark Streaming or Storm space; it is more comparable to Oozie or Azkaban. Workflows are expected to be mostly static or slowly changing.
Airflow is free and open source, licensed under Apache License 2.0.
There doesn't seem to be an R operator right now.
You could either write your own and contribute it to the community, or simply run your task as a BashOperator that calls Rscript.
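For example, a minimal sketch of the BashOperator approach, assuming Airflow 2.4 or newer, Rscript available on the worker's PATH, and a hypothetical script path /opt/scripts/my_analysis.R:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="run_r_script",
    start_date=datetime(2024, 1, 1),
    schedule=None,   # trigger manually; set a cron expression to schedule it
    catchup=False,
) as dag:
    run_r = BashOperator(
        task_id="run_r_script",
        # Rscript executes the file non-interactively; any arguments after
        # the file name are passed through to the R script itself.
        bash_command="Rscript /opt/scripts/my_analysis.R",
    )
```

Note that with this approach every worker that might pick up the task needs R and the script's libraries installed locally.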
Another option is to containerize your R script and run it using the DockerOperator (shipped with older Airflow releases; in Airflow 2.x it comes from the apache-airflow-providers-docker package). This removes the need to have your worker nodes configured with the correct version of R and any needed R libraries.
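A minimal sketch of that approach, assuming the apache-airflow-providers-docker package is installed and a hypothetical image my-registry/r-analysis:latest that already bundles R, the required packages, and the script at /app/my_analysis.R:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.docker.operators.docker import DockerOperator

with DAG(
    dag_id="run_r_script_in_docker",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    run_r_container = DockerOperator(
        task_id="run_r_in_container",
        # Hypothetical image: R, packages, and the script are baked in,
        # so workers only need Docker, not an R installation.
        image="my-registry/r-analysis:latest",
        command="Rscript /app/my_analysis.R",
    )
```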