I am using Hadoop 2.4.0, and my system has 24 cores and 96 GB RAM. I am using the following configs:
mapreduce.map.cpu.vcores=1
yarn.nodemanager.resource.cpu-vcores=10
yarn.scheduler.minimum-allocation-vcores=1
yarn.scheduler.maximum-allocation-vcores=4
yarn.app.mapreduce.am.resource.cpu-vcores=1
yarn.nodemanager.resource.memory-mb=88064
mapreduce.map.memory.mb=3072
mapreduce.map.java.opts=-Xmx2048m
Capacity Scheduler configs:
queue.default.capacity=50
queue.default.maximum_capacity=100
yarn.scheduler.capacity.root.default.user-limit-factor=2
With the above configs, I expected YARN would launch no more than 10 mappers per node, but it is launching 28 mappers per node. Am I doing something wrong?
yarn.nodemanager.resource.cpu-vcores controls how many vcores can be scheduled on a particular NodeManager instance. A vcore is a share of host CPU that the YARN NodeManager allocates to containers. yarn.scheduler.maximum-allocation-vcores is the maximum number of virtual CPU cores the ResourceManager will allocate for a single container request.
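(With yarn.nodemanager.resource.cpu-vcores=10 and mapreduce.map.cpu.vcores=1, a vcore-based calculation would indeed cap each node at 10 / 1 = 10 map containers, which is where your expectation of 10 mappers comes from.)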
YARN is running more containers than allocated vcores because, by default, the Capacity Scheduler uses DefaultResourceCalculator, which considers only memory when computing how many containers fit on a node:
// From DefaultResourceCalculator in Hadoop 2.4.0: container count is derived from memory alone
public int computeAvailableContainers(Resource available, Resource required) {
    // Only consider memory
    return available.getMemory() / required.getMemory();
}
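Plugging in the numbers from the question: the NodeManager advertises 88064 MB, and each map container requests 3072 MB, so 88064 / 3072 = 28 (integer division). That is exactly the 28 mappers per node you are seeing; the vcore settings never enter the calculation.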
Use DominantResourceCalculator instead; it takes both CPU and memory into account. Set the config below in capacity-scheduler.xml:
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
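In XML form, the entry in capacity-scheduler.xml would look like this:

<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>

After changing it, restart the ResourceManager (or refresh the scheduler configuration, e.g. with yarn rmadmin -refreshQueues) so the new calculator takes effect.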
More about DominantResourceCalculator