
Schedule tasks on another computer with Airflow

Tags: airflow

I successfully set up Airflow with a Postgres database on an Ubuntu remote server, and it seems great.

I was able to connect to my data warehouse (a separate server) and easily issue queries as tasks. This was simple because the server with Airflow installed was actually issuing the query.

Since I am just testing Airflow for now, it is installed on a fairly small and low-powered server. Is there a way for me to schedule tasks to run on my beefy Windows desktop? Or what is the best approach to utilize my local machines to download data/process files, and still have Airflow know that the task was completed successfully?

trench asked Mar 07 '17 12:03



1 Answer

Airflow is designed to distribute workload. If you run Airflow workers on your Windows machine, they do the bulk of the data processing there and you get to use that machine's compute power. The Airflow scheduler and webserver can stay on your smaller machine, since they only trigger new tasks, check heartbeats, and update task status. For this setup to work, you will have to use CeleryExecutor. I found this blog useful when I did my first setup.
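To make the CeleryExecutor setup a bit more concrete: every machine (scheduler, webserver, and each worker) has to point at the same Celery broker and result backend. Below is a minimal sketch of one way to express that through Airflow's `AIRFLOW__SECTION__KEY` environment-variable convention. The host name, credentials, and the choice of Redis as broker are hypothetical placeholders, and the option names are the ones I know from recent Airflow releases; older versions named some of them differently, so check the configuration reference for your version. The metadata database connection also has to be shared, but it is left out here because its config section changed between releases.

```python
# Sketch only: build the environment variables that tell each machine
# to use CeleryExecutor with one shared broker and result backend.
# Host names and credentials are hypothetical placeholders.

def airflow_env(section_options):
    """Flatten {section: {key: value}} into AIRFLOW__SECTION__KEY variables."""
    env = {}
    for section, options in section_options.items():
        for key, value in options.items():
            name = "AIRFLOW__{}__{}".format(section.upper(), key.upper())
            env[name] = value
    return env


shared_settings = {
    "core": {"executor": "CeleryExecutor"},
    "celery": {
        # Assumed: a Redis broker reachable from both machines.
        "broker_url": "redis://small-server.example:6379/0",
        # Assumed: task results stored in the existing Postgres instance.
        "result_backend": "db+postgresql://airflow:secret@small-server.example/airflow",
    },
}

env = airflow_env(shared_settings)

# Print in KEY=value form so the lines can be pasted into a shell profile
# or an env file on each machine.
for name in sorted(env):
    print("{}={}".format(name, env[name]))
```

With the same values exported on both machines, the small server keeps running the scheduler and webserver, and the desktop only starts a worker process (`airflow worker` in 1.x, `airflow celery worker` in 2.x).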

nehiljain answered Oct 17 '22 12:10
