 

Aggregate Resource Allocation for a job in YARN

I am new to Hadoop. When I run a job, I see the aggregate resource allocation for that job reported as 251248654 MB-seconds, 24462 vcore-seconds. However, the cluster details show 888 VCores-Total and 15.90 TB Memory-Total. Can anyone tell me how these are related? What do MB-seconds and vcore-seconds refer to for the job?

Is there any material online that explains these? I tried searching but didn't find a proper answer.

blackfury asked Nov 23 '15 08:11



1 Answer

VCores-Total: Indicates the total number of VCores available in the cluster
Memory-Total: Indicates the total memory available in the cluster.

For example, I have a single-node cluster where I have set the memory per container to 1228 MB (determined by the config yarn.scheduler.minimum-allocation-mb) and the vCores per container to 1 (determined by the config yarn.scheduler.minimum-allocation-vcores).

I have set yarn.nodemanager.resource.memory-mb to 9830 MB, so each node can hold at most 8 containers (9830 / 1228 ≈ 8.005, rounded down to 8).

So, for my cluster:

VCores-Total = 1 (node) * 8 (containers) * 1 (vCore per container) = 8 
Memory-Total = 1 (node) * 8 (containers) * 1228 MB (memory per container) = 9824 MB = 9.59375 GB ≈ 9.6 GB
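The arithmetic above can be reproduced with a short Python sketch. The function name `cluster_totals` and its parameters are made up for illustration; it simply mirrors the reasoning in this answer (containers per node from the memory settings, then totals from the container count) and is not how YARN itself computes its metrics.

```python
def cluster_totals(nodes, node_memory_mb, container_memory_mb, container_vcores):
    """Mirror the answer's arithmetic (hypothetical helper, not a YARN API)."""
    # Whole containers that fit into one node's memory
    containers_per_node = node_memory_mb // container_memory_mb
    vcores_total = nodes * containers_per_node * container_vcores
    memory_total_mb = nodes * containers_per_node * container_memory_mb
    return containers_per_node, vcores_total, memory_total_mb


# Values from the answer: 1 node, 9830 MB per node, 1228 MB and 1 vCore per container
containers, vcores_total, memory_total_mb = cluster_totals(1, 9830, 1228, 1)
print(containers, vcores_total, memory_total_mb, memory_total_mb / 1024)
# prints: 8 8 9824 9.59375
```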

The following figure shows my cluster metrics: [image not available]

Now let's see "MB-seconds" and "vcore-seconds". As per the description in the code (ApplicationResourceUsageReport.java):

MB-seconds: The aggregated amount of memory (in megabytes) the application has allocated times the number of seconds the application has been running.

vcore-seconds: The aggregated number of vcores that the application has allocated times the number of seconds the application has been running.

The description is self-explanatory. The key word is "aggregated": the values are summed over all of the application's containers.
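As a rough illustration of what "aggregated" means here, the sketch below sums memory × time and vCores × time over a list of containers. The helper `aggregate_usage` and the three sample containers are hypothetical; they are not taken from the Hadoop source or from a real job.

```python
def aggregate_usage(containers):
    """Sum resource x time over containers.

    containers: iterable of (memory_mb, vcores, seconds_held) tuples.
    Hypothetical helper for illustration only.
    """
    mb_seconds = sum(mem * secs for mem, _, secs in containers)
    vcore_seconds = sum(vc * secs for _, vc, secs in containers)
    return mb_seconds, vcore_seconds


# Made-up application: three containers of 1228 MB / 1 vCore,
# held for 100, 200 and 300 seconds respectively
example = [(1228, 1, 100), (1228, 1, 200), (1228, 1, 300)]
print(aggregate_usage(example))  # prints: (736800, 600)
```

A container that holds more memory, or holds it for longer, contributes proportionally more to the total, which is why the figure can be far larger than the cluster's Memory-Total.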

Let me explain this with an example. I ran a DistCp job (which spawned 25 containers), for which I got the following:

Aggregate Resource Allocation: 10361661 MB-seconds, 8424 vcore-seconds

Now let's roughly calculate how long each container ran:

For memory:
10361661 MB-seconds / 25 (containers) / 1228 MB (memory per container) = 337.51 seconds ≈ 5.63 minutes

For CPU:
8424 vcore-seconds / 25 (containers) / 1 (vCore per container) = 336.96 seconds ≈ 5.62 minutes

This indicates that, on average, each container ran for about 5.6 minutes.
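The same back-of-the-envelope division can be checked in Python. The function `average_container_seconds` is a made-up name for this sketch, and it assumes, as the calculation above does, that all 25 containers had the same size.

```python
def average_container_seconds(aggregate, containers, per_container):
    """Average run time per container, assuming equally sized containers."""
    return aggregate / containers / per_container


# Figures reported for the DistCp job in this answer
mem_secs = average_container_seconds(10361661, 25, 1228)  # from MB-seconds
cpu_secs = average_container_seconds(8424, 25, 1)         # from vcore-seconds
print(round(mem_secs, 2), round(cpu_secs, 2))  # prints: 337.51 336.96
```

The two estimates agree closely because every container in this setup held 1228 MB and 1 vCore for the same duration.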

I hope this makes it clear. You can execute a job and confirm it yourself.

Manjunath Ballur answered Sep 22 '22 18:09