Hadoop speculative task execution

Tags: hadoop, mapreduce

Google's MapReduce paper describes backup tasks, which I think are the same thing as speculative tasks in Hadoop. How is a speculative task implemented? When a speculative task is started, does it start from the very beginning, as the older, slower task did, or does it pick up from where the older task has reached? If the latter, does it have to copy all the intermediate state and data?

721

asked Mar 01 '13 18:03

lil

1 Answer

One problem with the Hadoop system is that, because it divides a job's tasks across many nodes, a few slow nodes can rate-limit the rest of the program.

Tasks may be slow for various reasons, including hardware degradation or software misconfiguration, but the causes may be hard to detect since the tasks still complete successfully, albeit after a longer time than expected. Hadoop doesn’t try to diagnose and fix slow-running tasks; instead, it tries to detect when a task is running slower than expected and launches another, equivalent task as a backup. This is termed speculative execution of tasks.

For example, if one node has a slow disk controller, it may be reading its input at only 10% of the speed of the other nodes. So when 99 map tasks are already complete, the system is still waiting for the final map task to check in, and that task takes much longer than all the others.

Because tasks are forced to run in isolation from one another, individual tasks do not know where their inputs come from. Tasks trust the Hadoop platform to deliver the appropriate input. Therefore, the same input can be processed multiple times in parallel to exploit differences in machine capabilities. As most of the tasks in a job are coming to a close, the Hadoop platform schedules redundant copies of the remaining tasks across several nodes that have no other work to perform. This process is known as speculative execution. When tasks complete, they announce this fact to the JobTracker. Whichever copy of a task finishes first becomes the definitive copy. If other copies were executing speculatively, Hadoop tells the TaskTrackers to abandon those tasks and discard their outputs. The Reducers then receive their inputs from whichever Mapper completed successfully first.
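The "first copy to finish wins" rule can be illustrated with a small, self-contained simulation. This is only a sketch and not Hadoop's scheduler: the class `SpeculativeDemo`, the methods `attempt` and `runSpeculatively`, and the timings are all hypothetical, and both attempts are started at the same time, whereas Hadoop launches the backup only after it notices a slow task. The point it shows is that each attempt reads its whole input from the first record and that the losing attempt is simply cancelled.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SpeculativeDemo {

    // Hypothetical stand-in for one attempt of a map task. Every attempt
    // processes the whole input split starting at record 0; no state is
    // copied over from any other attempt.
    static Callable<String> attempt(String name, int records, long millisPerRecord) {
        return () -> {
            for (int i = 0; i < records; i++) {
                Thread.sleep(millisPerRecord); // simulated per-record work
            }
            return name;
        };
    }

    // Runs an original attempt and a backup attempt over the same input and
    // returns the name of whichever completes first. invokeAny cancels the
    // attempt that has not finished by the time it returns.
    static String runSpeculatively(long originalMillisPerRecord, long backupMillisPerRecord)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            return pool.invokeAny(Arrays.asList(
                    attempt("original", 10, originalMillisPerRecord),
                    attempt("backup", 10, backupMillisPerRecord)));
        } finally {
            pool.shutdownNow(); // make sure no attempt keeps running
        }
    }

    public static void main(String[] args) throws Exception {
        // The original attempt is ten times slower per record than the backup.
        System.out.println("winner: " + runSpeculatively(100, 10));
    }
}
```

With these timings the backup attempt is expected to finish well before the original, so its result is the one that is kept.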

Speculative execution is enabled by default. With the old API, you can disable it for mappers and reducers by setting the JobConf options mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution to false, respectively. With the newer API, the corresponding properties are mapreduce.map.speculative and mapreduce.reduce.speculative.
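As a rough sketch of how the newer property names might be set from driver code, assuming the Hadoop client libraries are on the classpath (the class name `DisableSpeculation` and the job name are made up for illustration):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class DisableSpeculation {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Newer-API property names quoted in the answer above.
        conf.setBoolean("mapreduce.map.speculative", false);
        conf.setBoolean("mapreduce.reduce.speculative", false);

        // Hypothetical job name; mapper, reducer and paths are omitted here.
        Job job = Job.getInstance(conf, "no-speculation-example");

        // Read the values back from the job's own configuration.
        System.out.println("map speculative: "
                + job.getConfiguration().getBoolean("mapreduce.map.speculative", true));
        System.out.println("reduce speculative: "
                + job.getConfiguration().getBoolean("mapreduce.reduce.speculative", true));
    }
}
```

The same two properties can also be set cluster-wide in the site configuration; setting them per job, as here, leaves other jobs unaffected.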

So, to answer your question: the speculative task starts afresh from the beginning of its input, regardless of how much work the original task has already completed.

Reference: http://developer.yahoo.com/hadoop/tutorial/module4.html

170

answered Nov 03 '22 08:11

Amar
