 

AM Container is running beyond virtual memory limits

I was playing with the distributed shell application (hadoop-2.0.0-cdh4.1.2). This is the error I'm receiving at the moment.

13/01/01 17:09:09 INFO distributedshell.Client: Got application report from ASM for, appId=5, clientToken=null, appDiagnostics=Application application_1357039792045_0005 failed 1 times due to AM Container for appattempt_1357039792045_0005_000001 exited with  exitCode: 143 due to: Container [pid=24845,containerID=container_1357039792045_0005_01_000001] is running beyond virtual memory limits. Current usage: 77.8mb of 512.0mb physical memory used; 1.1gb of 1.0gb virtual memory used. Killing container.
Dump of the process-tree for container_1357039792045_0005_01_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 24849 24845 24845 24845 (java) 165 12 1048494080 19590 /usr/java/bin/java -Xmx512m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 128 --num_containers 1 --priority 0 --shell_command ping --shell_args localhost --debug
|- 24845 23394 24845 24845 (bash) 0 0 108654592 315 /bin/bash -c /usr/java/bin/java -Xmx512m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 128 --num_containers 1 --priority 0 --shell_command ping --shell_args localhost --debug 1>/tmp/logs/application_1357039792045_0005/container_1357039792045_0005_01_000001/AppMaster.stdout 2>/tmp/logs/application_1357039792045_0005/container_1357039792045_0005_01_000001/AppMaster.stderr 

The interesting part is that there seems to be no problem with the setup itself, since a simple ls or uname command completed successfully and its output was available in container 2's stdout.

Regarding the setup: yarn.nodemanager.vmem-pmem-ratio is 3 and the total physical memory available is 2GB, which I think is more than enough for the example to run.

For the command in question, "ping localhost" generated two replies, as can be seen in containerlogs/container_1357039792045_0005_01_000002/721917/stdout/?start=-4096.

So, what could be the problem?

asked Jan 01 '13 by Jimson James

3 Answers

No need to change the cluster configuration. I found that just providing the extra parameter

-Dmapreduce.map.memory.mb=4096

to distcp fixed it for me.
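For example, the -D flag goes before the tool's positional arguments (the source and destination paths below are placeholders for your own):

```shell
# Raise the per-map memory request for this one job only;
# no change to yarn-site.xml or mapred-site.xml is needed.
hadoop distcp -Dmapreduce.map.memory.mb=4096 \
  hdfs://source-namenode/path/to/data \
  hdfs://dest-namenode/path/to/data
```

This works because -D properties set on the command line override the cluster defaults for that single job.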

answered Nov 19 '22 by David Ongaro


If you are running the Tez framework, you must set the parameters below in tez-site.xml:

tez.am.resource.memory.mb
tez.task.resource.memory.mb
tez.am.java.opts

And in yarn-site.xml:

yarn.nodemanager.resource.memory-mb
yarn.scheduler.minimum-allocation-mb
yarn.scheduler.maximum-allocation-mb
yarn.nodemanager.vmem-check-enabled
yarn.nodemanager.vmem-pmem-ratio

All of these parameters must be set.
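As a sketch only (the property names are the Tez ones listed above; the values are placeholders that must be tuned to your cluster), the tez-site.xml entries would look like:

```xml
<!-- tez-site.xml: illustrative values, tune for your cluster -->
<property>
  <name>tez.am.resource.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>tez.task.resource.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>tez.am.java.opts</name>
  <value>-Xmx1640m</value>
</property>
```

The yarn-site.xml entries follow the same property/name/value pattern. A common rule of thumb is to keep the Java heap (-Xmx in tez.am.java.opts) at roughly 80% of the corresponding container memory request, leaving headroom for off-heap usage.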

answered Nov 19 '22 by Avinash


From the error message, you can see that you're using more virtual memory than your current limit of 1.0gb. This can be resolved in two ways:

Disable Virtual Memory Limit Checking

YARN will simply ignore the limit. To do this, add the following to your yarn-site.xml:

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
  <description>Whether virtual memory limits will be enforced for containers.</description>
</property>

The default for this setting is true.

Increase Virtual Memory to Physical Memory Ratio

In your yarn-site.xml, change this to a higher value than is currently set:

<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>5</value>
  <description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.</description>
</property>

The default is 2.1.
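The arithmetic behind the limit is simple (assuming, as in the question's log, a container with a 512MB physical allocation):

```python
def vmem_limit_mb(pmem_mb: float, ratio: float) -> float:
    """Virtual memory ceiling YARN enforces for a container:
    the physical allocation multiplied by the vmem-pmem ratio."""
    return pmem_mb * ratio

# With the default ratio of 2.1, a 512MB container may use about
# 1075MB of virtual memory -- which matches the ~1.0gb limit in
# the error message above.
print(vmem_limit_mb(512, 2.1))  # ~1075.2

# Raising the ratio to 5 lifts the ceiling to 2560MB, comfortably
# above the 1.1gb the container was actually using.
print(vmem_limit_mb(512, 5))    # 2560.0
```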

You could also increase the amount of physical memory you allocate to a container.

Make sure you don't forget to restart YARN after you change the config.
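How you restart depends on your distribution; on a plain Apache Hadoop 2.x install the sbin scripts below are typical (treat this as a sketch, since managed distributions like CDH use their own service tooling):

```shell
# Restart all YARN daemons so the new yarn-site.xml takes effect.
stop-yarn.sh
start-yarn.sh

# Or restart only the NodeManager on a single node:
yarn-daemon.sh stop nodemanager
yarn-daemon.sh start nodemanager
```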

answered Nov 19 '22 by seg