I am new to the world of MapReduce. I ran a job and it seems to be taking forever to complete, given that it is a relatively small task, so I am guessing something has not gone according to plan. I am using Hadoop version 2.6; here is some info I gathered that I thought could help. The MapReduce programs themselves are straightforward, so I won't add them here unless someone really wants more insight - the Python code for the mapper and reducer is identical to the one here: http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/. If someone can give a clue as to what has gone wrong, or why, that would be great. Thanks in advance.
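For context, the mapper and reducer from that tutorial boil down to the following word-count logic. This is a minimal sketch (the actual scripts read `sys.stdin` line by line and print tab-separated `word\t1` pairs; the `mapper`/`reducer` function names here are just for illustration):

```python
def mapper(lines):
    """Emit a (word, 1) pair for every word in the input,
    mirroring what the tutorial's mapper.py prints to stdout."""
    for line in lines:
        for word in line.strip().split():
            yield (word, 1)

def reducer(pairs):
    """Sum the counts per word. The real streaming reducer relies on
    Hadoop sorting its input by key; here we simply aggregate in a dict."""
    counts = {}
    for word, count in pairs:
        counts[word] = counts.get(word, 0) + count
    return counts

if __name__ == "__main__":
    sample = ["foo foo quux", "labs foo bar quux"]
    print(reducer(mapper(sample)))
```

Hadoop streaming just pipes file splits through these two scripts, with a sort in between, so a job this small should finish in seconds once it actually runs.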
Name: streamjob1669011192523346656.jar
Application Type: MAPREDUCE
Application Tags:
State: ACCEPTED
FinalStatus: UNDEFINED
Started: 3-Jul-2015 00:17:10
Elapsed: 20mins, 57sec
Tracking URL: UNASSIGNED
Diagnostics:
This is what I get when running the program:
bin/hadoop jar share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar -file python-files/mapper.py -mapper python-files/mapper.py -file python-files/reducer.py -reducer python-files/reducer.py -input /user/input/* -output /user/output
15/07/03 00:16:41 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
2015-07-03 00:16:43.510 java[3708:1b03] Unable to load realm info from SCDynamicStore
15/07/03 00:16:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
packageJobJar: [python-files/mapper.py, python-files/reducer.py, /var/folders/4x/v16lrvy91ld4t8rqvnzbr83m0000gn/T/hadoop-unjar8212926403009053963/] [] /var/folders/4x/v16lrvy91ld4t8rqvnzbr83m0000gn/T/streamjob1669011192523346656.jar tmpDir=null
15/07/03 00:16:53 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/07/03 00:16:55 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/07/03 00:17:05 INFO mapred.FileInputFormat: Total input paths to process : 1
15/07/03 00:17:06 INFO mapreduce.JobSubmitter: number of splits:2
15/07/03 00:17:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1435852353333_0003
15/07/03 00:17:11 INFO impl.YarnClientImpl: Submitted application application_1435852353333_0003
15/07/03 00:17:11 INFO mapreduce.Job: The url to track the job: http://mymacbook.home:8088/proxy/application_1435852353333_0003/
15/07/03 00:17:11 INFO mapreduce.Job: Running job: job_1435852353333_0003
One general note on speed: in Hadoop, MapReduce reads and writes data to and from disk at every stage of processing. These disk seeks take time, which makes the whole pipeline relatively slow even for small inputs - but that explains seconds of overhead, not a job stuck for 20 minutes.
MapReduce processes data by mapping it into key/value pairs, shuffling and sorting those pairs, and reducing them until the expected output is produced. Input and output are shared across the nodes through the Hadoop Distributed File System (HDFS).
If a job stays in the ACCEPTED state for a long time and never changes to RUNNING, it could be due to the following reasons.

The NodeManager (slave service) is either dead or unable to communicate with the ResourceManager. If "Active Nodes" on the main page of the YARN ResourceManager web UI is zero, you can confirm that no NodeManagers are connected to the ResourceManager. If so, you need to start the NodeManager (in Hadoop 2.6, sbin/yarn-daemon.sh start nodemanager).

Another reason is that other running jobs might occupy all the available slots, leaving no room for new jobs. Check the values of "Memory Total", "Memory Used", "VCores Total" and "VCores Used" on the ResourceManager web UI main page.
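You can also read these values without the web UI, via the ResourceManager's REST API (in Hadoop 2.x it serves JSON at /ws/v1/cluster/metrics on port 8088). A minimal sketch in Python 3, assuming the ResourceManager runs on localhost:8088; the diagnose() helper is my own, not part of Hadoop:

```python
import json
from urllib.request import urlopen

def diagnose(metrics):
    """Return likely reasons a job sits in ACCEPTED, given the
    'clusterMetrics' dict from the RM's /ws/v1/cluster/metrics endpoint."""
    problems = []
    if metrics.get("activeNodes", 0) == 0:
        problems.append("No active NodeManagers connected - start the NodeManager.")
    if metrics.get("availableMB", 0) == 0 or metrics.get("availableVirtualCores", 0) == 0:
        problems.append("No free memory/vcores - running jobs may occupy all slots.")
    return problems

if __name__ == "__main__":
    try:
        # 8088 is the default ResourceManager web/REST port.
        metrics = json.load(urlopen("http://localhost:8088/ws/v1/cluster/metrics"))["clusterMetrics"]
        print(diagnose(metrics) or ["Cluster metrics look healthy; check the scheduler queues instead."])
    except IOError:
        print("Could not reach the ResourceManager on localhost:8088")
```

In a single-node setup like the one in the logs above (ResourceManager at 0.0.0.0:8032 on a MacBook), a stuck-in-ACCEPTED job with "Tracking URL: UNASSIGNED" most often means the first case: no NodeManager registered.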