 

Difference between 3 memory parameters in Hadoop 2?

I'm using Hadoop 2.0.5 (Alpha) to run relatively big jobs, and I've run into these errors:

Container [pid=15023,containerID=container_1378641992707_0002_01_000029] is running beyond virtual memory limits. Current usage: 492.4 MB of 1 GB physical memory used; 3.3 GB of 2.1 GB virtual memory used. Killing container.

I then learned about these two parameters:

yarn.nodemanager.vmem-pmem-ratio, which is set to 2.1 by default.

yarn.app.mapreduce.am.command-opts, which is set to -Xmx1024m (= 1 GB) by default.

That explained the limits marked above.
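As a rough check of that arithmetic, here is a minimal Python sketch. It assumes the virtual memory cap is simply the physical limit multiplied by yarn.nodemanager.vmem-pmem-ratio; the numbers are taken from the error message above, and the function names are only illustrative, not part of any Hadoop API.

```python
# Values read off the error message and the default quoted above.
physical_limit_mb = 1024   # "1 GB physical memory" in the log line
vmem_pmem_ratio = 2.1      # default of yarn.nodemanager.vmem-pmem-ratio
virtual_used_mb = 3.3 * 1024  # "3.3 GB ... virtual memory used"


def virtual_limit_mb(physical_mb, ratio):
    """Assumed rule: virtual memory cap = physical limit * ratio."""
    return physical_mb * ratio


def exceeds_limit(used_mb, limit_mb):
    """True when usage is above the cap, which is when the container is killed."""
    return used_mb > limit_mb


limit_mb = virtual_limit_mb(physical_limit_mb, vmem_pmem_ratio)
print(f"virtual limit: {limit_mb / 1024:.1f} GB")
print("over the limit:", exceeds_limit(virtual_used_mb, limit_mb))
```

Under this assumption, 1 GB of physical memory gives the 2.1 GB virtual limit in the log, and 3.3 GB of virtual usage is over it.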

Setting these parameters to a higher value did help, but then I found this parameter: yarn.app.mapreduce.am.resource.mb which is set to 1536 by default.

I can't quite tell the difference between the three from the descriptions given in Hadoop's default XML files, nor how I should set them properly for optimization.

An explanation or a good reference would be much appreciated.

asked Sep 09 '13 06:09 by itzhaki




1 Answer

The answer by @twid is ambiguous. According to the official documentation:

yarn.app.mapreduce.am.resource.mb specifies

"The amount of memory the MR AppMaster needs."

In other words, it specifies how much memory the container that runs the application master needs. It is not related to the containers that run mappers and reducers.
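To make the relationship with yarn.app.mapreduce.am.command-opts concrete, here is a small Python sketch. It assumes the defaults quoted in the question (1536 MB for the AM container, -Xmx1024m for the AM JVM heap) and only illustrates that the heap has to fit inside the container with some room left for non-heap memory. The helper parse_xmx_mb is hypothetical, not a Hadoop function.

```python
import re


def parse_xmx_mb(command_opts):
    """Return the -Xmx heap size in MB, or None if no -Xmx flag is present."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", command_opts)
    if match is None:
        return None
    value = int(match.group(1))
    unit = match.group(2).lower()
    # No suffix means bytes; convert every unit to MB.
    factor = {"": 1 / (1024 * 1024), "k": 1 / 1024, "m": 1, "g": 1024}[unit]
    return value * factor


# Defaults quoted in the question (assumed here, check your own *-site.xml).
am_container_mb = 1536           # yarn.app.mapreduce.am.resource.mb
am_command_opts = "-Xmx1024m"    # yarn.app.mapreduce.am.command-opts

heap_mb = parse_xmx_mb(am_command_opts)
headroom_mb = am_container_mb - heap_mb

print("AM heap (MB):", heap_mb)
print("headroom left in the AM container (MB):", headroom_mb)
```

If you raise the heap in command-opts, the container size in resource.mb should be raised with it so the headroom stays positive.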

answered Oct 19 '22 05:10 by Shuang Wu