
Difference between failed tasks and killed tasks

In the JobTracker web UI, I see a column called "Failed/Killed Task Attempts".

I would like to know the distinction between the two. My guess is that "failed" tasks are those that ultimately failed after some retries (so no recovery happened at all?), while "killed" tasks are those that were killed (due to a timeout and so on) but might still be retried. Is that right?

kee asked Jun 22 '12 23:06



1 Answer

Hadoop can kill tasks on its own initiative for a few reasons:
a) The task does not report progress within the timeout (the default is 10 minutes).
b) The FairScheduler or CapacityScheduler needs the slot for another pool (FairScheduler) or queue (CapacityScheduler).
c) Speculative execution makes the task's result unnecessary because another attempt of the same task has already completed elsewhere.
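
To make the three cases concrete, here is a minimal sketch in Python. It is not Hadoop code: the function `kill_reason` and all of its parameters are hypothetical names made up for illustration, and it only mirrors the checks listed above. In Hadoop 1.x the timeout itself is configured through the `mapred.task.timeout` property (in milliseconds); the constant below simply encodes the 10-minute default mentioned in (a).

```python
from typing import Optional

# 10 minutes in milliseconds, matching the default timeout described in (a).
DEFAULT_TASK_TIMEOUT_MS = 10 * 60 * 1000


def kill_reason(
    now_ms: int,
    last_progress_ms: int,
    slot_preempted: bool = False,
    duplicate_attempt_finished: bool = False,
    timeout_ms: int = DEFAULT_TASK_TIMEOUT_MS,
) -> Optional[str]:
    """Hypothetical helper: say why a running attempt would be killed.

    Returns None when none of the three conditions applies.
    """
    # (a) No progress report within the timeout window.
    if now_ms - last_progress_ms > timeout_ms:
        return "no progress reported within the timeout"
    # (b) The scheduler wants the slot for another pool or queue.
    if slot_preempted:
        return "slot reclaimed by the scheduler"
    # (c) Another attempt of the same task already completed.
    if duplicate_attempt_finished:
        return "speculative duplicate finished first"
    return None
```

For example, `kill_reason(700_000, 0)` reports the timeout case, while an attempt that reported progress one second ago and is not preempted or duplicated returns `None`.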

David Gruzman answered Oct 08 '22 05:10
