Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Customize task resources on Airflow using MesosExecutor

Tags:

airflow

mesos

Is it possible to specify resources (CPU, memory, GPU, disk space) for each operator of a DAG when using MesosExecutor?

I know you can specify global values for resources of a task.

For instance, I have several operators that are CPU expensive and others that not. I would like to execute one at a time of the first, but many in parallel of the non CPU expensive ones.

like image 877
Gonzalo Matheu Avatar asked May 10 '18 00:05

Gonzalo Matheu


1 Answers

From the code (mesos_executor.py line 67), it seems that is not possible since cpu and memory values are passed to the Scheduler during initialization:

    def __init__(self,
             task_queue,
             result_queue,
             task_cpu=1,
             task_mem=256):
    self.task_queue = task_queue
    self.result_queue = result_queue
    self.task_cpu = task_cpu
    self.task_mem = task_mem

and those values are used without modification:

cpus = task.resources.add()
            cpus.name = "cpus"
            cpus.type = mesos_pb2.Value.SCALAR
            cpus.scalar.value = self.task_cpu

            mem = task.resources.add()
            mem.name = "mem"
            mem.type = mesos_pb2.Value.SCALAR
            mem.scalar.value = self.task_mem

It requires a custom Executor implementation to achieve that

like image 159
Gonzalo Matheu Avatar answered Nov 02 '22 18:11

Gonzalo Matheu