How to fix "Task attempt_201104251139_0295_r_000006_0 failed to report status for 600 seconds."


I wrote a MapReduce job to extract some information from a dataset of users' movie ratings. There are about 250K users and about 300K movies. The map output is <user, <movie, rating>*> and <movie, <user, rating>*>. The reducer then processes these pairs.

When I run the job, the mappers complete as expected, but the reducers always fail with:

Task attempt_* failed to report status for 600 seconds. 

I know this happens because the task fails to update its status, so I added a call to context.progress() in my code like this:

    int count = 0;
    while (values.hasNext()) {
        if (count++ % 100 == 0) {
            context.progress();
        }
        /* other code here */
    }

Unfortunately, this does not help. Many reduce tasks still fail.

Here is the log:

    Task attempt_201104251139_0295_r_000014_1 failed to report status for 600 seconds. Killing!
    11/05/03 10:09:09 INFO mapred.JobClient: Task Id : attempt_201104251139_0295_r_000012_1, Status : FAILED
    Task attempt_201104251139_0295_r_000012_1 failed to report status for 600 seconds. Killing!
    11/05/03 10:09:09 INFO mapred.JobClient: Task Id : attempt_201104251139_0295_r_000006_1, Status : FAILED
    Task attempt_201104251139_0295_r_000006_1 failed to report status for 600 seconds. Killing!

BTW, the error happens during the copy phase of the reduce. The log says:

reduce > copy (28 of 31 at 26.69 MB/s) > :Lost task tracker: tracker_hadoop-56:localhost/ 

Thanks for the help.

2 Answers

The easiest way is to set this configuration parameter:

    <property>
      <name>mapred.task.timeout</name>
      <value>1800000</value> <!-- 30 minutes -->
    </property>

in mapred-site.xml

Another easy way is to set it in your job configuration inside the program:

    import org.apache.hadoop.conf.Configuration;

    Configuration conf = new Configuration();
    // 1 hour; the default is 600000 (10 minutes), and any other value can be given
    long milliSeconds = 1000 * 60 * 60;
    conf.setLong("mapred.task.timeout", milliSeconds);
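Since the property value is in milliseconds, it is easy to slip by a factor of 60 or 1000. A minimal sketch, using only the JDK, of deriving the value from minutes (the class and method names here are hypothetical and not part of Hadoop):

```java
import java.util.concurrent.TimeUnit;

public class TaskTimeoutMillis {
    // Convert a timeout in minutes to the millisecond string the property expects.
    static String timeoutValue(long minutes) {
        return Long.toString(TimeUnit.MINUTES.toMillis(minutes));
    }

    public static void main(String[] args) {
        System.out.println(timeoutValue(10)); // 600000, the default mentioned above
        System.out.println(timeoutValue(30)); // 1800000, the 30-minute value
        System.out.println(timeoutValue(60)); // 3600000, same as 1000*60*60
    }
}
```

The resulting number can be passed to conf.setLong or pasted into the <value> element of the XML property.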

Note: before setting it, check the job file (job.xml) in the JobTracker GUI to see which property name is correct for your version, mapred.task.timeout or mapreduce.task.timeout. While the job is running, check the job file again to confirm that the property has actually changed to the value you set.
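If you save a copy of job.xml, the check can also be scripted. The following is only a sketch, assuming the file uses the usual <configuration>/<property>/<name>/<value> layout; the class and method names are hypothetical, and the sample XML in main is made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class JobXmlTimeoutCheck {
    // Return whichever of the two timeout property names appear, with their values.
    static Map<String, String> findTimeouts(String jobXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(jobXml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> found = new LinkedHashMap<>();
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = childText(p, "name");
            if ("mapred.task.timeout".equals(name)
                    || "mapreduce.task.timeout".equals(name)) {
                found.put(name, childText(p, "value"));
            }
        }
        return found;
    }

    // Trimmed text of the first child element with the given tag, or null if absent.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample standing in for the contents of a saved job.xml.
        String sample = "<configuration>"
                + "<property><name>mapred.reduce.tasks</name><value>31</value></property>"
                + "<property><name>mapred.task.timeout</name><value>1800000</value></property>"
                + "</configuration>";
        System.out.println(findTimeouts(sample)); // {mapred.task.timeout=1800000}
    }
}
```

An empty result would suggest the job is not picking up either name, and a value of 600000 would suggest the override did not take effect.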

