
Hadoop: Running beyond virtual memory limits, showing huge numbers

I am running a MapReduce Pipes program, and I have set the memory limits as follows:

In yarn-site.xml:

<property>
            <name>yarn.nodemanager.resource.memory-mb</name>
            <value>3072</value>
</property>
<property>
            <name>yarn.scheduler.minimum-allocation-mb</name>
            <value>256</value>
</property>

In mapred-site.xml:

<property>
            <name>mapreduce.map.memory.mb</name>
            <value>512</value>
</property>
<property>
            <name>mapreduce.reduce.memory.mb</name>
            <value>512</value>
</property>
<property>
            <name>mapreduce.map.java.opts</name>
            <value>-Xmx384m</value>
</property>
<property>
            <name>mapreduce.reduce.java.opts</name>
            <value>-Xmx384m</value>
</property>

I am running currently on a single node in pseudo-distributed mode. I am getting the following error before having the container killed:

2015-04-11 12:47:49,594 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1428741438743_0001_m_000000_0: Container [pid=8140,containerID=container_1428741438743_0001_01_000002] is running beyond virtual memory limits. Current usage: 304.1 MB of 1 GB physical memory used; 1.0 TB of 2.1 GB virtual memory used. Killing container.

The main thing that concerns me is the 1.0 TB of virtual memory used. The application I am running is nowhere near consuming that amount of memory; it is not even close to consuming 1 GB.

Does that mean that there is a memory leak in my code, or could my memory configurations just be wrong?

Thank you.

Regards,

1 Answer

I found what the problem was: in part of my code, each of the mappers had to access a local lmdb database. When an lmdb database is opened, it reserves 1 TB of virtual memory, which caused Hadoop to think I was using that much memory when in fact I wasn't.

I solved the issue by setting yarn.nodemanager.vmem-check-enabled to false in yarn-site.xml, which prevents Hadoop from enforcing the virtual memory limit. Note that you shouldn't disable this check unless you are sure your application isn't actually leaking memory, because it is how Hadoop protects you from memory leaks and similar problems. I only disabled it because I was sure the usage was not a leak.
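For reference, this is roughly what the corresponding entry in yarn-site.xml looks like (the property defaults to true, so it has to be added explicitly):

<property>
            <name>yarn.nodemanager.vmem-check-enabled</name>
            <value>false</value>
</property>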
