
YARN - How does yarn.scheduler.capacity.root.<queue-name>.maximum-capacity work?

I have 4 queues under the root queue with the following configuration.

|-------------|-----------------|---------------------|-------------------|
| Queue Name  | Capacity (in %) | Max Capacity (in %) | User Limit Factor |
|-------------|-----------------|---------------------|-------------------|
| default     | 10              | 30                  | 10                |
|-------------|-----------------|---------------------|-------------------|
| thriftsvr   | 5               | 30                  | 10                |
|-------------|-----------------|---------------------|-------------------|
| stream      | 70              | 70                  | 10                |
|-------------|-----------------|---------------------|-------------------|
| batch       | 15              | 30                  | 10                |
|-------------|-----------------|---------------------|-------------------|

I set the capacity with the yarn.scheduler.capacity.root.<queue-name>.capacity property and the maximum capacity with the yarn.scheduler.capacity.root.<queue-name>.maximum-capacity property.

My understanding is that these two properties set the ABSOLUTE capacity and the ABSOLUTE maximum capacity, respectively. That means 100% of queue 'stream' equals 70% of the cluster's total capacity, and the queue can fill up to 100% of its own capacity, which is also 70% of the cluster's total capacity.

The problem is that once queue 'stream' is 66.4% full (i.e. Used Capacity: 66.4% and Absolute Used Capacity: 46.5%), new jobs submitted to queue 'stream' go into the pending state with the message "waiting for AM container to be allocated, launched and register with RM".
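As a sanity check on the two figures above, here is a minimal sketch of how they relate. It assumes that the UI's Used Capacity is simply the Absolute Used Capacity divided by the queue's absolute capacity; the function name is hypothetical and is not part of any YARN API.

```python
def used_capacity_pct(absolute_used_pct: float, absolute_capacity_pct: float) -> float:
    """Queue-relative usage, assuming used = absolute used / absolute capacity."""
    return absolute_used_pct / absolute_capacity_pct * 100.0


# Figures from the question: 'stream' has 70% of the cluster and uses 46.5% of it.
stream_used = used_capacity_pct(46.5, 70.0)

# Share of the cluster that 'stream' could still claim before reaching 70%.
headroom = 70.0 - 46.5

print(f"Used Capacity: {stream_used:.1f}%")
print(f"Remaining absolute headroom: {headroom:.1f}%")
```

Under that assumption the two reported numbers are consistent with each other, so the queue still has room below its configured maximum at the point where jobs start pending.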

When I checked the queue configuration in the YARN UI, it showed Configured Max Capacity: 70.0% and Absolute Configured Max Capacity: 70.0%. According to this configuration, queue 'stream' should be able to fill up to Used Capacity: 100% and Absolute Used Capacity: 70%.

Any idea why new jobs cannot use queue 'stream' up to 100% of its capacity?

Vikash Pareek asked Jun 07 '19 09:06


1 Answer

I suspect the confusion here is that the .capacity and .maximum-capacity properties can each be defined in one of two ways:

  • as a percentage relative to the parent queue (here, root), given as a float, e.g. 12.5
  • as an absolute resource value, using the resource value syntax, e.g. [memory=204800,vcores=122]
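To make the two forms concrete, here is a small illustrative sketch that tells them apart. The helper parse_capacity_value is hypothetical and is not YARN's own parser; it only handles the plain integer amounts shown in the example above.

```python
from typing import Dict, Union


def parse_capacity_value(raw: str) -> Union[float, Dict[str, int]]:
    """Return a float for a percentage, or a dict for a bracketed resource value."""
    text = raw.strip()
    if text.startswith("[") and text.endswith("]"):
        resources: Dict[str, int] = {}
        # Split "memory=204800,vcores=122" into name/amount pairs.
        for pair in text[1:-1].split(","):
            name, _, amount = pair.partition("=")
            resources[name.strip()] = int(amount)
        return resources
    # Anything else is treated as a percentage of the parent queue.
    return float(text)


print(parse_capacity_value("12.5"))
print(parse_capacity_value("[memory=204800,vcores=122]"))
```

A value like 70 in the question's table falls into the first form, so it is read as a percentage of the parent queue rather than as a fixed amount of memory or vcores.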

If you have any further questions, please do ask.

For full reference, just read the doc: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Queue_Properties

mvk_il answered Oct 04 '22 07:10