how to restrict the concurrent running map tasks?

Question

My hadoop version is 1.0.2. Now I want at most 10 map tasks running at the same time. I have found 2 variable related to this question.

a) mapred.job.map.capacity

but in my hadoop version, this parameter seems abandoned.

b) mapred.jobtracker.taskScheduler.maxRunningTasksPerJob (http://grepcode.com/file/repo1.maven.org/maven2/com.ning/metrics.collector/1.0.2/mapred-default.xml)

I set this variable like below:

Configuration conf = new Configuration();
conf.set("date", date);
conf.set("mapred.job.queue.name", "hadoop");
conf.set("mapred.jobtracker.taskScheduler.maxRunningTasksPerJob", "10");

DistributedCache.createSymlink(conf);
Job job = new Job(conf, "ConstructApkDownload_" + date);
...

The problem is that it doesn't work. There is still more than 50 maps running as the job starts.

After looking through the hadoop document, I can't find another to limit the concurrent running map tasks. Hope someone can help me ,Thanks.

=====================

I hava found the answer about this question, here share to others who may be interested.

Using the fair scheduler, with configuration parameter maxMaps to set the a pool's maximum concurrent task slots, in the Allocation File (fair-scheduler.xml). Then when you submit jobs, just set the job's queue to the according pool.

Dave · Accepted Answer

You can set the value of mapred.jobtracker.maxtasks.per.job to something other than -1 (the default). This limits the number of simultaneous map or reduce tasks a job can employ.

This variable is described as:

The maximum number of tasks for a single job. A value of -1 indicates that there is no maximum.

I think there were plans to add mapred.max.maps.per.node and mapred.max.reduces.per.node to job configs, but they never made it to release.

Joel · Answer

If you are using Hadoop 2.7 or newer, you can use mapreduce.job.running.map.limit and mapreduce.job.running.reduce.limit to restrict map and reduce tasks at each job level.

Fix JIRA ticket.

how to restrict the concurrent running map tasks?

Tags:

map

task

jobs

hadoop

mapreduce

HaiWang

2 Answers

Dave

Joel

Recent Activity

Donate For Us

how to restrict the concurrent running map tasks?

Tags:

map

task

jobs

hadoop

mapreduce

HaiWang

2 Answers

Dave

Joel

Related questions

Recent Activity

Donate For Us