Can reducers and mappers be on the same data node?

Question

I have started reading about Big Data and Hadoop, so this question may sound very stupid to you.

This is what I know.

Each mapper processes a small amount of data and produces an intermediate output. After this, we have the step of shuffle and sort.

Now, Shuffle = moving the intermediate output over to the respective Reducers, each of which handles a particular key or set of keys.
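To make the shuffle step concrete, here is a minimal sketch in plain Python (not Hadoop code). The function names `partition` and `shuffle` are made up for illustration, and `zlib.crc32` is an assumed stand-in for the hash; Hadoop's default HashPartitioner uses the key's `hashCode()` modulo the number of reduce tasks instead. The point is only that every occurrence of a key ends up at the same reducer, whichever mapper emitted it.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    """Pick the reducer index for a key (crc32 is an illustrative stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, num_reducers):
    """Group (key, value) pairs from all mappers by reducer, then sort by key."""
    reducers = [defaultdict(list) for _ in range(num_reducers)]
    for output in map_outputs:          # one list of pairs per mapper
        for key, value in output:
            reducers[partition(key, num_reducers)][key].append(value)
    # Each reducer sees its keys in sorted order, with all values for a key together.
    return [dict(sorted(r.items())) for r in reducers]


if __name__ == "__main__":
    # Two hypothetical mappers emitting word counts.
    mapper_outputs = [
        [("apple", 1), ("banana", 1)],
        [("apple", 1), ("cherry", 1)],
    ]
    for index, reducer_input in enumerate(shuffle(mapper_outputs, num_reducers=2)):
        print(index, reducer_input)
```

Nothing in this step says where the mappers and reducers physically run; the partitioning only decides which reducer gets which keys.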

So, can one Data Node have both Mapper and Reducer code running on it, or do we have different DNs for each?

Simplefish · Accepted Answer

Terminology: DataNodes are for HDFS (storage). Mappers and Reducers (compute) run on nodes that have the TaskTracker daemon on them. The two daemons are typically installed on the same machines, which is why the question comes up.
The maximum number of map and reduce tasks that one TaskTracker runs at the same time is controlled by the configs mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum.

Subject to limits in other configs, as long as a TaskTracker has not reached its maximum number of map or reduce tasks, the JobTracker may assign it more tasks of that type. The JobTracker typically tries to assign tasks in a way that reduces the amount of data movement.
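As a rough illustration of the slot limits described above, here is a toy model in plain Python. It is not the JobTracker's actual scheduling code: the `TaskTracker` class, the `try_assign` method, and the slot counts are all hypothetical, and the two limits merely stand in for mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum.

```python
from dataclasses import dataclass, field


@dataclass
class TaskTracker:
    """Toy model of one compute node with separate map and reduce slots."""
    name: str
    max_map_slots: int = 2      # stands in for mapred.tasktracker.map.tasks.maximum
    max_reduce_slots: int = 2   # stands in for mapred.tasktracker.reduce.tasks.maximum
    running_maps: list = field(default_factory=list)
    running_reduces: list = field(default_factory=list)

    def try_assign(self, kind: str, task_id: str) -> bool:
        """Accept the task only if a slot of the matching kind is still free."""
        if kind == "map":
            running, limit = self.running_maps, self.max_map_slots
        elif kind == "reduce":
            running, limit = self.running_reduces, self.max_reduce_slots
        else:
            raise ValueError(f"unknown task kind: {kind}")
        if len(running) >= limit:
            return False
        running.append(task_id)
        return True


if __name__ == "__main__":
    node = TaskTracker("node1", max_map_slots=2, max_reduce_slots=1)
    for kind, task_id in [("map", "m1"), ("map", "m2"), ("map", "m3"),
                          ("reduce", "r1"), ("reduce", "r2")]:
        print(kind, task_id, "accepted" if node.try_assign(kind, task_id) else "rejected")
    print(node.running_maps, node.running_reduces)
```

In this model the map and reduce slots are counted independently, so the same node holds m1, m2 and r1 at once, while m3 and r2 are turned away because their slot type is full.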

So, yes, you can have mappers and reducers running on the same node at the same time.


Tags: hadoop, reducers, mapper

user2441151
