
How to integrate Airflow with GitHub for running scripts

If we maintain our code/scripts in a GitHub repository, is there any way to copy these scripts from the repository and execute them on another cluster (which could be Hadoop or Spark)?

Does Airflow provide any operator to connect to GitHub and fetch such files?

Maintaining scripts in GitHub would give us more flexibility, since every change to the code would be picked up and used directly from there.

Any ideas on this scenario would really help.

AshishPatil asked Nov 21 '18 06:11


1 Answer

You can use GitPython inside a PythonOperator task to run the pull on a specified schedule.

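import git

# git_dir: path to an existing local clone of the repository on the Airflow worker
g = git.cmd.Git(git_dir)
g.pull()  # runs `git pull` inside git_dir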

Make sure you have added the relevant keys (for example, an SSH deploy key) so that the Airflow workers have permission to pull from the repository.
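The pull step can also be wrapped in a function that a PythonOperator calls. The following is only a rough sketch under a few assumptions: it shells out to the `git` command line through `subprocess` instead of using GitPython, so it needs `git` installed on the worker, and the names `git_sync_command` and `sync_repo`, along with the repository URL and local path you pass in, are hypothetical. It clones the repository on the first run and does a fast-forward pull afterwards.

```python
import os
import subprocess


def git_sync_command(repo_url, local_dir):
    """Return the git command that brings local_dir up to date with repo_url."""
    if os.path.isdir(os.path.join(local_dir, ".git")):
        # A clone already exists: fast-forward only, so a diverged
        # checkout fails loudly instead of creating a merge commit.
        return ["git", "-C", local_dir, "pull", "--ff-only"]
    # No clone yet: create one.
    return ["git", "clone", repo_url, local_dir]


def sync_repo(repo_url, local_dir):
    """Hypothetical callable meant to be passed as python_callable."""
    # check=True raises CalledProcessError on a non-zero exit code,
    # which makes the Airflow task fail if the clone or pull fails.
    subprocess.run(git_sync_command(repo_url, local_dir), check=True)
```

In a DAG this would be wired up roughly as `PythonOperator(task_id="sync_scripts", python_callable=sync_repo, op_kwargs={"repo_url": ..., "local_dir": ...})`, with downstream tasks running the scripts from `local_dir`. If the deployment has several workers, each worker that runs those downstream tasks needs its own up-to-date copy.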

Meghdeep Ray answered Oct 20 '22 13:10