I have a "map only" (no reduce phase) program. The input file is large enough to create 7 map tasks, and I have verified that by looking at the output produced (part-00000 to part-00006). Now, my cluster has 8 nodes, each with 8 cores and 8 GB of memory, and a shared filesystem hosted on the head node.
My question is: can I choose between running all 7 map tasks on 1 node only, or running the 7 map tasks on 7 different slave nodes (1 task per node)? If I can do so, what change in my code and configuration file is needed?
I tried setting the parameter "mapred.tasktracker.map.tasks.maximum" to 1 and to 7 in my code only, but I did not find any appreciable time difference. In my configuration file it is set to 1.
"mapred.tasktracker.map.tasks.maximum"
deals with the number of map tasks that should be launched on each node, not the number of nodes to be used for each map task. In the Hadoop architecture, there is 1 tasktracker for each node (slaves) and 1 job tracker on a master node (master). So if you set the property mapred.tasktracker.map.tasks.maximum
, it will only change the number of map tasks to be executed per node.
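As a minimal sketch, the setting looks like this in mapred-site.xml (the value 7 here is illustrative, chosen to match the 7 tasks in the question):

<!-- mapred-site.xml on each slave node; restart the TaskTracker afterwards -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>7</value> <!-- max concurrent map tasks on this node -->
</property>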
The recommended range for "mapred.tasktracker.map.tasks.maximum" is from 1/2 * (cores per node) to 2 * (cores per node); for the 8-core nodes in your cluster, that is 4 to 16.
The total number of map tasks that you want overall can be suggested with JobConf.setNumMapTasks(int). Note that this is only a hint to the framework: the actual number of map tasks is determined by the number of input splits, as in the sketch below.
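Here is a minimal map-only driver using the classic mapred API; the class name and command-line paths are placeholders for illustration, and no Mapper class is set, so the framework falls back to the identity mapper:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class MapOnlyJob {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MapOnlyJob.class);
        conf.setJobName("map-only-example");

        // Map-only: zero reducers means map output is written
        // directly to the part-0000N output files.
        conf.setNumReduceTasks(0);

        // A hint only; the real count is driven by the number of input splits.
        conf.setNumMapTasks(7);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

The two calls that matter for your question are setNumReduceTasks(0), which makes the job map-only, and setNumMapTasks(7), which requests (but does not guarantee) 7 map tasks.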