Hadoop and MapReduce on multicore machines

I have read a lot about Hadoop and MapReduce running on clusters of machines. Does someone know whether the Apache distribution can be run on an SMP machine with several cores? In particular, can multiple MapReduce processes be run on the same machine, with the scheduler taking care of spreading them across the cores? Thanks. - KG

asked Sep 29 '12 by K Gupta


People also ask

How do Hadoop and MapReduce work together?

MapReduce assigns fragments of data across the nodes in a Hadoop cluster. The goal is to split a dataset into chunks and use an algorithm to process those chunks at the same time. The parallel processing on multiple machines greatly increases the speed of handling even petabytes of data.

Is Hadoop the same as MapReduce?

MapReduce is a Hadoop framework for writing applications that process vast amounts of data on large clusters. It can also be described as a programming model for processing large datasets across clusters of computers; the data itself is stored in distributed form (in HDFS).

What is Hadoop ecosystem MapReduce?

MapReduce is a programming paradigm that enables massive scalability across hundreds or thousands of servers in a Hadoop cluster. As the processing component, MapReduce is the heart of Apache Hadoop. The term "MapReduce" refers to the two separate and distinct tasks that Hadoop programs perform: map and reduce (see the sketch after these questions).

Which of these is a substitute for MapReduce in Hadoop?

The best-known alternative to MapReduce is Apache Spark, which is typically 10 to 100 times faster. It is also easier to maintain and achieves high performance with less code.
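
For concreteness regarding the two tasks mentioned above, here is the canonical WordCount example (essentially the version from the Hadoop MapReduce tutorial): the mapper emits a (word, 1) pair for every word it sees, and the reducer sums the counts per word.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Map task: emit (word, 1) for every word in this task's input split.
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reduce task: sum the counts emitted for each distinct word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

The framework runs many instances of the mapper (one per input split) and reducer (one per partition) in parallel, which is what lets the same job scale from one multicore machine to a large cluster.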


1 Answer

Yes. Each machine has multiple map and reduce slots, and their number is determined by the RAM and CPU (each task JVM needs 1GB by default, so an 8GB machine with 16 cores would still only have about 7 task slots).

From the Hadoop wiki:

Use the configuration knobs mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum to control the number of maps/reduces spawned simultaneously on a TaskTracker. By default, each is set to 2, hence one sees a maximum of 2 maps and 2 reduces at a given instance on a TaskTracker.

You can set those on a per-TaskTracker basis to accurately reflect your hardware (i.e., set them to higher numbers on a beefier TaskTracker, etc.).
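
For example, a minimal sketch for a 16-core node (classic MRv1 assumed; the slot counts below are illustrative, not a recommendation): edit mapred-site.xml on that node and restart the TaskTracker for the change to take effect.

    <!-- mapred-site.xml on the TaskTracker node (values are illustrative) -->
    <configuration>
      <property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <value>8</value>   <!-- up to 8 concurrent map tasks on this node -->
      </property>
      <property>
        <name>mapred.tasktracker.reduce.tasks.maximum</name>
        <value>4</value>   <!-- up to 4 concurrent reduce tasks on this node -->
      </property>
    </configuration>

Keep (map slots + reduce slots) x per-task JVM heap within the node's RAM, as described above, or the tasks will start swapping.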

answered Nov 15 '22 by Arnon Rotem-Gal-Oz