 

AWS Batch executor with Airflow

Tags:

airflow

I'm currently running Airflow on Amazon Web Services using EC2 instances. The big issue is that the average utilization of the instances is only about 2%...

I'd like to use a scalable architecture, creating instances only for the duration of a job and terminating them afterwards. I saw on the roadmap that AWS Batch was supposed to become an executor in 2017, but there has been no news about that since.

Do you know if it is possible to use AWS Batch as an executor for all Airflow jobs?

Regards, Romain.

asked Jan 29 '18 by romain-nio

2 Answers

There is no executor, but an operator is available from version 1.10. After you create a Compute Environment, a Job Queue and a Job Definition in AWS Batch, you can use the AWSBatchOperator to trigger jobs.

Here is the source code.
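For illustration, here is a minimal DAG sketch using the contrib operator as it looks in Airflow 1.10; the job name, job definition, job queue and region below are placeholders, and parameter names may differ in later releases:

```python
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.awsbatch_operator import AWSBatchOperator

default_args = {
    "owner": "airflow",
    "start_date": datetime(2018, 1, 1),
}

with DAG(
    dag_id="aws_batch_example",
    default_args=default_args,
    schedule_interval="@daily",
    catchup=False,
) as dag:

    submit_job = AWSBatchOperator(
        task_id="submit_batch_job",
        job_name="my-batch-job",             # placeholder job name
        job_definition="my-job-definition",  # must already exist in AWS Batch
        job_queue="my-job-queue",            # must already exist in AWS Batch
        overrides={},                        # containerOverrides passed to the Batch submit_job call
        region_name="eu-west-1",             # placeholder region
    )
```

The operator submits the job to AWS Batch and polls until it finishes, so the Airflow worker only waits on the job rather than running the workload itself.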

answered Oct 19 '22 by Rafael Barbosa


Currently there is a SequentialExecutor, a LocalExecutor, a DaskExecutor, a CeleryExecutor and a MesosExecutor. I heard they're working on AIRFLOW-1899, targeted for 2.0, to introduce a KubernetesExecutor. Looking at Dask and Celery, neither seems to support a mode where workers are created per task. Mesos might, and Kubernetes should, but then you'd have to scale the worker cluster accordingly and shut down nodes when they are not needed.

We did a little work to get a CloudFormation setup where Celery workers scale out and in based on CloudWatch metrics for the average CPU load across the tagged workers; a simplified sketch of the idea follows below.
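As a rough illustration of that approach (not the original CloudFormation template), the snippet below attaches a target-tracking scaling policy to an Auto Scaling group of Celery workers using boto3; the group name, policy name, region and target value are all placeholders:

```python
import boto3

# Placeholder region; use whatever region the worker Auto Scaling group lives in.
autoscaling = boto3.client("autoscaling", region_name="eu-west-1")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="airflow-celery-workers",  # placeholder ASG name
    PolicyName="scale-on-average-cpu",              # placeholder policy name
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            # Average CPU across all instances in the group, as reported by CloudWatch.
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        # Scale out/in so the group's average CPU hovers around 50%.
        "TargetValue": 50.0,
    },
)
```

The same policy can be expressed declaratively in a CloudFormation template; the point is simply that CloudWatch's average-CPU metric drives the number of Celery workers up and down with the actual load.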

answered Oct 19 '22 by dlamblin