
Can I force my reducers (copy phase) to start only when all mappers are completed?

I have a Hadoop job with a fairly long map phase, and I want other, shorter jobs to run with higher priority. To do this, I set the priority of my long job with hadoop job -set-priority job_id LOW.

The problem is that, for my long job, the copy phase of the reducers starts even if only 30% of my map tasks are completed.

My grid is then more or less blocked, because all the reduce slots are taken by the LOW priority job. The other small jobs can run their map phases, but they never get a reducer until my long job is finished.

Any idea? Thanks. J.

asked Jan 16 '12 08:01 by user1151446



1 Answer

I found the answer to my own question: there is a job configuration parameter that does exactly this:

mapred.reduce.slowstart.completed.maps=0.90

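With this setting, the reduce tasks start only once 90% of the map tasks are completed. The default value is 0.05, which is why the reducers were being scheduled so early. To wait for all mappers, as asked in the question, set the value to 1.0.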
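
As a rough sketch of where this setting can go, the property below uses the standard Hadoop configuration XML format and could be placed in the job's configuration file, or in mapred-site.xml to change the cluster-wide default. The value 1.0 is my choice for the "wait for all mappers" case, not something from the original answer. Note also that newer Hadoop releases (2.x and later) renamed the property to mapreduce.job.reduce.slowstart.completedmaps, so check which name your version expects.

```xml
<!-- Delay reducer scheduling until the given fraction of map tasks is done. -->
<!-- 1.0 = wait for every map task; 0.90 = the value used in the answer.     -->
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>1.0</value>
</property>
```

If the job's driver parses generic options (for example through ToolRunner), the same property can usually be passed per job on the command line with -D mapred.reduce.slowstart.completed.maps=1.0 instead of editing a file.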

answered Sep 22 '22 10:09 by user1151446
