
Is ray `num_cpus` used to actually allocate CPUs?

Tags: python, ray

When using the Ray framework, you can specify the number of CPUs a task requires, as explained here.

Ex:

import ray

@ray.remote(num_cpus=4)
def f():
    return 1

However, it is unclear whether this leads to an actual CPU allocation:

  1. The function is literally allocated 4 CPUs (for example through CPU affinity, as with the Linux taskset command or the cpuset Docker argument).
  2. Or the scheduler uses num_cpus only internally, as scheduling metadata, for example to decide whether it can start a new task requiring 16 CPUs when only 10 are left. The task still has access to all the CPUs and can use more CPU time than it requested in num_cpus.
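
To make option 2 concrete, here is a minimal sketch of what "scheduling metadata only" could look like. It is a toy model and not Ray's scheduler: the class LogicalCpuScheduler and its methods are hypothetical names, and the point is only that the CPU count is bookkeeping, with no pinning or affinity involved.

```python
class LogicalCpuScheduler:
    """Toy model (hypothetical, not Ray code): CPUs are just counters."""

    def __init__(self, total_cpus):
        self.total_cpus = total_cpus
        self.available = total_cpus

    def try_start(self, num_cpus):
        """Reserve num_cpus logical slots; return True if the task may start."""
        if num_cpus > self.available:
            return False
        self.available -= num_cpus
        return True

    def finish(self, num_cpus):
        """Release the logical slots held by a finished task."""
        self.available = min(self.total_cpus, self.available + num_cpus)


scheduler = LogicalCpuScheduler(total_cpus=16)
started_small = scheduler.try_start(6)   # accepted, 10 logical CPUs remain
started_large = scheduler.try_start(16)  # refused, only 10 are left
print(started_small, started_large, scheduler.available)
```

Nothing in this model stops an accepted task from using more cores than it reserved, which is exactly the behaviour described in option 2.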

Option 2 seems more likely, but the documentation does not say so. In addition, GPUs appear to get something like option 1, which makes the scheduler's intent unclear:

Ray will automatically set the environment variable CUDA_VISIBLE_DEVICES for that process.

The process is thus configured to use a specific GPU (although it can bypass this by resetting CUDA_VISIBLE_DEVICES).
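
As a small illustration of why this is a soft restriction, the sketch below models CUDA_VISIBLE_DEVICES as a plain environment mapping. It does not touch any GPU or call Ray; visible_gpu_ids is a hypothetical helper, and the values "1" and "0,1,2,3" are made up for the example.

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # variable not set: no restriction is expressed
    return [part.strip() for part in raw.split(",") if part.strip()]


# Assumed starting state: the scheduler exposed a single GPU to the task.
task_env = {"CUDA_VISIBLE_DEVICES": "1"}
before = visible_gpu_ids(task_env)

# The task itself can overwrite the variable, since it is only an env var.
task_env["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"
after = visible_gpu_ids(task_env)

print(before, after)
```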

So, how is num_cpus used in Ray?

Phylliade asked Jul 10 '19 12:07


1 Answer

Good question. For CPUs, the allocation is used only as scheduling metadata (option 2). For GPUs, the allocation is used as metadata and also provides isolation. The docs will be updated very soon (and I will update this answer afterwards).
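
One way to check this yourself is to look at how many CPUs the process is allowed to run on. The following is a minimal sketch using only the standard library; cpus_visible_to_process is a hypothetical helper name, and os.sched_getaffinity is only available on some platforms (such as Linux), so the code falls back to os.cpu_count elsewhere.

```python
import os


def cpus_visible_to_process():
    """Return how many CPUs the current process may be scheduled on."""
    if hasattr(os, "sched_getaffinity"):
        # Size of the affinity mask of the calling process (pid 0 = self).
        return len(os.sched_getaffinity(0))
    # Fallback: total CPU count; cpu_count() may return None.
    return os.cpu_count() or 1


visible = cpus_visible_to_process()
print(visible)
```

If the answer above holds, calling this helper from inside a function decorated with @ray.remote(num_cpus=4) should report every CPU available to the worker process rather than 4, assuming nothing else (such as a container cpuset) restricts it.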

richliaw answered Nov 14 '22 21:11