I tried to run simple word count as MapReduce job. Everything works fine when run locally (all work done on Name Node). But, when I try to run it on a cluster using YARN (adding mapreduce.framework.name
=yarn
to mapred-site.conf) job hangs.
I came across a similar problem here: MapReduce jobs get stuck in Accepted state
Output from job:
*** START ***
15/12/25 17:52:50 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/12/25 17:52:51 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/12/25 17:52:51 INFO input.FileInputFormat: Total input paths to process : 5
15/12/25 17:52:52 INFO mapreduce.JobSubmitter: number of splits:5
15/12/25 17:52:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1451083949804_0001
15/12/25 17:52:53 INFO impl.YarnClientImpl: Submitted application application_1451083949804_0001
15/12/25 17:52:53 INFO mapreduce.Job: The url to track the job: http://hadoop-droplet:8088/proxy/application_1451083949804_0001/
15/12/25 17:52:53 INFO mapreduce.Job: Running job: job_1451083949804_0001
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.job.tracker</name>
<value>localhost:54311</value>
</property>
<!--
<property>
<name>mapreduce.job.tracker.reserved.physicalmemory.mb</name>
<value></value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1024</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>3000</value>
<source>mapred-site.xml</source>
</property> -->
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<!--
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>3000</value>
<source>yarn-site.xml</source>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>500</value>
</property>
<property>
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>3000</value>
</property>
-->
</configuration>
//I the left commented options - they were not solving the problem
YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register with RM.
What can be the problem?
EDIT:
I tried this configuration (commented) on machines: NameNode(8GB RAM) + 2x DataNode (4GB RAM). I get the same effect: Job hangs on ACCEPTED state.
EDIT2: changed configuration (thanks @Manjunath Ballur) to:
yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-droplet</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop-droplet:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop-droplet:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop-droplet:8030</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop-droplet:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop-droplet:8088</value>
</property>
<property>
<description>Classpath for typical applications.</description>
<name>yarn.application.classpath</name>
<value>
$HADOOP_CONF_DIR,
$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
$YARN_HOME/*,$YARN_HOME/lib/*
</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/data/1/yarn/local,/data/2/yarn/local,/data/3/yarn/local</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/data/1/yarn/logs,/data/2/yarn/logs,/data/3/yarn/logs</value>
</property>
<property>
<description>Where to aggregate logs</description>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/var/log/hadoop-yarn/apps</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>50</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>390</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>390</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>50</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx40m</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>50</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>50</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx40m</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx40m</value>
</property>
</configuration>
Still not working. Additional info: I can see no nodes on cluster preview (similar problem here: Slave nodes not in Yarn ResourceManager )
You should check the status of Node managers in your cluster. If the NM nodes are short on disk space then RM will mark them "unhealthy" and those NMs can't allocate new containers.
1) Check the Unhealthy nodes: http://<active_RM>:8088/cluster/nodes/unhealthy
If the "health report" tab says "local-dirs are bad" then it means you need to cleanup some disk space from these nodes.
2) Check the DFS dfs.data.dir
property in hdfs-site.xml
. It points the location on local file system where hdfs data is stored.
3) Login to those machines and use df -h
& hadoop fs - du -h
commands to measure the space occupied.
4) Verify hadoop trash and delete it if it's blocking you.
hadoop fs -du -h /user/user_name/.Trash
and hadoop fs -rm -r /user/user_name/.Trash/*
I feel, you are getting your memory settings wrong.
To understand the tuning of YARN configuration, I found this to be a very good source: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_ig_yarn_tuning.html
I followed the instructions given in this blog and was able to get my jobs running. You should alter your settings proportional to the physical memory you have on your nodes.
Key things to remember is:
mapreduce.map.memory.mb
and mapreduce.reduce.memory.mb
should be at least yarn.scheduler.minimum-allocation-mb
mapreduce.map.java.opts
and mapreduce.reduce.java.opts
should be around "0.8 times the value of" corresponding mapreduce.map.memory.mb
and mapreduce.reduce.memory.mb
configurations. (In my case it is 983 MB ~ (0.8 * 1228 MB))yarn.app.mapreduce.am.command-opts
should be "0.8 times the value of" yarn.app.mapreduce.am.resource.mb
Following are the settings I use and they work perfectly for me:
yarn-site.xml:
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1228</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>9830</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>9830</value>
</property>
mapred-site.xml
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>1228</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx983m</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1228</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>1228</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx983m</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx983m</value>
</property>
You can also refer to the answer here: Yarn container understanding and tuning
You can add vCore settings, if you want your container allocation to take into account CPU also. But, for this to work, you need to use CapacityScheduler
with DominantResourceCalculator
. See the discussion about this here: How are containers created based on vcores and memory in MapReduce2?
This has solved my case for this error:
<property>
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>100</value>
</property>
Check your hosts file on master and slave nodes. I had exactly this problem. My hosts file looked like this on master node for example
127.0.0.0 localhost
127.0.1.1 master-virtualbox
192.168.15.101 master
I changed it like below
192.168.15.101 master master-virtualbox localhost
So it worked.
These lines
<property>
<name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
<value>100</value>
</property>
in the yarn-site.xml
solved my problem since the node will be marked as unhealthy when disk usage is >=95%. Solution mainly suitable for pseudodistributed mode.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With