Wordcount program is stuck in hadoop-2.3.0

Tags:

hadoop

mapreduce

I installed hadoop-2.3.0 and tried to run the wordcount example, but the job starts and then sits idle:

hadoop@ubuntu:~$ $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar    wordcount /myprg outputfile1
14/04/30 13:20:40 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/04/30 13:20:51 INFO input.FileInputFormat: Total input paths to process : 1
14/04/30 13:20:53 INFO mapreduce.JobSubmitter: number of splits:1
14/04/30 13:21:02 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1398885280814_0004
14/04/30 13:21:07 INFO impl.YarnClientImpl: Submitted application application_1398885280814_0004
14/04/30 13:21:09 INFO mapreduce.Job: The url to track the job: http://ubuntu:8088/proxy/application_1398885280814_0004/
14/04/30 13:21:09 INFO mapreduce.Job: Running job: job_1398885280814_0004

The URL to track the job: application_1398885280814_0004/ [image: screenshot of the job tracking page]

I didn't have this issue with previous versions, where I was able to run the Hadoop wordcount example. I followed these steps for installing hadoop-2.3.0.

Please suggest.

asked Apr 30 '14 20:04

2 Answers

I had the exact same situation a while back when switching to YARN. MRv1 had the concept of task slots, whereas MRv2 uses containers, and the two differ considerably in how tasks are scheduled and run on the nodes.

Your job is stuck because it is unable to find or start a container. If you look through the full logs of the ResourceManager, the ApplicationMaster, and the other daemons, you may find that nothing happens after it starts to allocate a new container.

To solve the problem, you have to tweak the memory settings in yarn-site.xml and mapred-site.xml. While doing the same myself, I found this and this tutorial especially helpful. I would suggest starting with very basic memory settings and optimizing them later. First check with the word count example, then move on to more complex jobs.
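As a rough sketch of what "very basic memory settings" could look like, the fragment below lists the standard YARN and MapReduce memory properties. The property names are the real Hadoop 2.x ones, but every value is an assumption chosen for a small single-node machine with about 2 GB set aside for YARN. They are not taken from the tutorials mentioned above and should be adjusted to the RAM actually available.

```xml
<!-- yarn-site.xml: properties go inside <configuration>. Values are illustrative assumptions. -->
<property>
    <!-- Total memory the NodeManager may hand out to containers on this node -->
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
</property>
<property>
    <!-- Smallest container the scheduler will allocate -->
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
</property>
<property>
    <!-- Largest container the scheduler will allocate -->
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
</property>

<!-- mapred-site.xml: properties go inside <configuration>. Values are illustrative assumptions. -->
<property>
    <!-- Container size requested for the MapReduce ApplicationMaster -->
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>512</value>
</property>
<property>
    <!-- Container size requested for each map task -->
    <name>mapreduce.map.memory.mb</name>
    <value>512</value>
</property>
<property>
    <!-- JVM heap for map tasks, kept below the container size -->
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx384m</value>
</property>
<property>
    <!-- Container size requested for each reduce task -->
    <name>mapreduce.reduce.memory.mb</name>
    <value>512</value>
</property>
<property>
    <!-- JVM heap for reduce tasks, kept below the container size -->
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx384m</value>
</property>
```

With these numbers, the ApplicationMaster container plus one map and one reduce container (3 x 512 MB) fit within the 2048 MB the NodeManager offers. If the requested container sizes exceed what any NodeManager can provide, the job can wait indefinitely for an allocation. The daemons need a restart after the files are changed.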

answered Oct 07 '22 01:10

Gaurav Kumar

I was facing the same issue. I added the following property to my yarn-site.xml and it solved the issue.

    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>Hostname-of-your-RM</value>
        <description>The hostname of the RM.</description>
    </property>

Without the resource manager hostname, things go awry in a multi-node setup: each node defaults to looking for a local resource manager and never announces its resources to the master node. So your MapReduce job probably found nowhere to run its mappers, because the request was sent to the master and the master didn't know about the resources on the slave nodes.

Reference : http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide/

answered Oct 07 '22 02:10

Shash

Unmesha Sreeveni U.B