I'm currently using airflow on Amazon Web services using EC2 instances. The big issue is that the average usage of the instances are about 2%...
I'd like to use a scalable architecture and creating instances only for the duration of the job and kill it. I saw on the roadmap that AWS BATCH was suppose to be an executor in 2017 but no new about that.
Do you know if it possible to use AWS BATCH as an executor for all airflow jobs ?
Regards, Romain.
There is no executor, but an operator is available from version 1.10. After you create an Execution Environment, Job Queue and Job Definition on AWS Batch, you can use the AWSBatchOperator
to trigger Jobs.
Here is the source code.
Currently there is a SequentialExecutor, a LocalExecutor, a DaskExecutor, a CeleryExecutor and a MesosExecutor. I heard they're working on AIRFLOW-1899 targeted for 2.0 to introduce a KubernetesExecutor. So, looking at Dask and Celery it doesn't seem they support a mode where their workers are created per task. Mesos might, Kubernetes should, but then you'd have to scale the clusters for the workers accordingly to account for turning off the nodes when un-needed.
We did a little work to get a cloud formation setup where celery workers scale out and in based on metrics from cloud-watch of the average cpu load across the tagged workers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With