 

The role of "yarn.scheduler.minimum-allocation-vcores" and "yarn.scheduler.maximum-allocation-vcores" in deciding the number of containers per node?

I am actually trying to figure out how many containers there are in a single NodeManager. What factors does this depend on? And what is the role of "yarn.scheduler.minimum-allocation-vcores" and "yarn.scheduler.maximum-allocation-vcores" in deciding the number of containers per node?

Yasho Sagar asked Oct 20 '22


1 Answer

The default resource scheduler in YARN is the Capacity Scheduler.

The Capacity Scheduler has two resource calculators:

  1. DefaultResourceCalculator (default)

  2. DominantResourceCalculator

DefaultResourceCalculator uses only memory to compute the number of available containers:

  public int computeAvailableContainers(Resource available, Resource required) {
    // Only consider memory
    return available.getMemory() / required.getMemory();
  }

DominantResourceCalculator uses both memory and cores:

  public int computeAvailableContainers(Resource available, Resource required) {
    return Math.min(
        available.getMemory() / required.getMemory(), 
        available.getVirtualCores() / required.getVirtualCores());
  }
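
For example (the numbers here are made up), consider a node advertising 8192 MB and 8 vcores, where every container asks for 1024 MB and 4 vcores. DefaultResourceCalculator would fit 8 containers because it looks only at memory, while DominantResourceCalculator would fit only 2 because the cores run out first. A minimal sketch of that arithmetic:

  public class CalculatorComparison {
    public static void main(String[] args) {
      // Hypothetical numbers: the node offers 8192 MB and 8 vcores,
      // and each container asks for 1024 MB and 4 vcores.
      int availableMemory = 8192, availableVcores = 8;
      int requiredMemory = 1024, requiredVcores = 4;

      // DefaultResourceCalculator: memory only -> 8 containers
      int byMemoryOnly = availableMemory / requiredMemory;

      // DominantResourceCalculator: constrained by both -> min(8, 2) = 2 containers
      int byMemoryAndCores = Math.min(
          availableMemory / requiredMemory,
          availableVcores / requiredVcores);

      System.out.println("DefaultResourceCalculator:  " + byMemoryOnly + " containers");
      System.out.println("DominantResourceCalculator: " + byMemoryAndCores + " containers");
    }
  }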

yarn.scheduler.minimum-allocation-vcores and yarn.scheduler.maximum-allocation-vcores don't play any direct role in deciding the number of containers per node.

While requesting resources, an application tells YARN how much memory and how many cores it requires per container.
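
As a rough sketch of what such a request looks like (the capability numbers are arbitrary, and the registration and allocation loop a real application master needs are omitted), the application builds a Resource describing the per-container memory and vcores and hands it to the ResourceManager through AMRMClient:

  import org.apache.hadoop.yarn.api.records.Priority;
  import org.apache.hadoop.yarn.api.records.Resource;
  import org.apache.hadoop.yarn.client.api.AMRMClient;
  import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;

  public class RequestSketch {
    public static void main(String[] args) {
      // The per-container "ask": 2048 MB of memory and 2 vcores (example values).
      Resource capability = Resource.newInstance(2048, 2);

      // No locality constraints in this sketch (nodes and racks are null).
      ContainerRequest ask =
          new ContainerRequest(capability, null, null, Priority.newInstance(1));

      AMRMClient<ContainerRequest> amClient = AMRMClient.createAMRMClient();
      amClient.init(new YarnConfiguration());
      amClient.start();

      // Queue the request; a real AM would then call allocate() in a heartbeat loop.
      amClient.addContainerRequest(ask);
    }
  }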

In MapReduce we specify the vcores needed via mapreduce.map.cpu.vcores and mapreduce.reduce.cpu.vcores.

In Spark we specify the vcores needed via spark.executor.cores.
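
Purely for illustration (the values are examples, and this assumes both the Hadoop and Spark client libraries are on the classpath), those per-container vcore settings can also be applied programmatically:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.spark.SparkConf;

  public class VcoreSettings {
    public static void main(String[] args) {
      // MapReduce: vcores requested for each map and reduce container.
      Configuration mrConf = new Configuration();
      mrConf.setInt("mapreduce.map.cpu.vcores", 2);
      mrConf.setInt("mapreduce.reduce.cpu.vcores", 4);

      // Spark: vcores requested for each executor container.
      SparkConf sparkConf = new SparkConf().set("spark.executor.cores", "4");
    }
  }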

yarn.scheduler.minimum-allocation-vcores and yarn.scheduler.maximum-allocation-vcores define the minimum and maximum number of vcores that can be allocated per container. A request above the maximum is rejected by the ResourceManager, and a request below the minimum is normalized up to the minimum (for vcores this normalization only takes effect when DominantResourceCalculator is in use).
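
These limits are set in yarn-site.xml on the ResourceManager; a typical entry looks like the following (the values shown are just the stock defaults, not a recommendation):

  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>4</value>
  </property>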

banjara answered Nov 04 '22