I have a Hadoop job with a fairly long map phase, and I want other, shorter jobs to run with higher priority. To do this, I set the priority of my long job with hadoop job -set-priority job_id LOW.
The problem is that, for my long job, the copy phase of the reducers starts when only 30% of my map tasks are completed.
My grid is then more or less blocked, since all the reduce slots are taken by the LOW priority job. The other small jobs can run their map phases, but they will never get a reducer until my long job has finished.
Any idea? Thanks. J.
I found the answer to my question myself: there is a job configuration parameter that does exactly this:
mapred.reduce.slowstart.completed.maps=0.90
With this setting, the reduce tasks only start once 90% of the maps are completed. The default value is 0.05.
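For anyone wondering how to apply it, here is a minimal sketch of passing the parameter on the command line for a single job. The jar name, driver class, and input/output paths are hypothetical placeholders, and this assumes the job's driver parses generic options (for example by running through ToolRunner); otherwise the -D flag is ignored and the value has to be set in the job configuration instead.

```shell
# Hypothetical jar, class, and path names; substitute your own.
# The -D option is only picked up if the driver parses generic options
# (e.g. it is run through ToolRunner), and it must come before the
# job's own arguments.
hadoop jar my-long-job.jar com.example.LongJob \
  -D mapred.reduce.slowstart.completed.maps=0.90 \
  input_dir output_dir
```

Setting it per job like this leaves the default for the short jobs untouched, so only the long job holds back its reducers. Note that newer Hadoop releases use the name mapreduce.job.reduce.slowstart.completedmaps for the same setting, so check which one your version expects.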