Map Reduce Keep input ordering

Question

I tried to implement an application using Hadoop that processes text files. The problem is that I cannot keep the ordering of the input text. Is there any way to choose the hash function? This problem could easily be solved by assigning a partition of the input to each mapper and then sending that partition to the reducers. Is this possible with Hadoop?

Niels Basjes · Accepted Answer

The basic idea of MapReduce is that the order in which things are done is irrelevant. So you cannot (and do not need to) control the order in which:

the input records go through the mappers.
the keys and their related values go through the reducers.
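
To see why the input order gets lost, here is a small Python sketch of hash partitioning. It is a simulation, not Hadoop code: `partition` and `shuffle` are hypothetical helpers, and `zlib.crc32` stands in for the key's hash function. Hadoop's default HashPartitioner does something similar, taking the key's hash code modulo the number of reducers.

```python
import zlib


def partition(key: str, num_reducers: int) -> int:
    """Pick a reducer for a key (hypothetical stand-in for hash partitioning)."""
    # crc32 is used only because it is deterministic across runs;
    # it is NOT the hash function Hadoop uses.
    return (zlib.crc32(key.encode("utf-8")) & 0x7FFFFFFF) % num_reducers


def shuffle(lines, num_reducers):
    """Distribute lines over reducers by hashing each line as its own key."""
    buckets = [[] for _ in range(num_reducers)]
    for line in lines:
        buckets[partition(line, num_reducers)].append(line)
    return buckets


lines = ["first line", "second line", "third line", "fourth line"]
buckets = shuffle(lines, num_reducers=3)
for reducer_id, bucket in enumerate(buckets):
    print(reducer_id, bucket)
```

Neighbouring input lines can end up on different reducers, and nothing in the bucket assignment remembers where a line came from. Choosing a different hash function only changes which bucket a line lands in; it does not bring the original order back.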

The only thing you can control is the order in which the values for a given key are placed in the iterator that is made available in the reducer. This is done using a construct called "secondary sort".
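
The idea behind secondary sort can be sketched in plain Python. This is a simulation under stated assumptions, not the Hadoop API: `map_phase` and `shuffle_with_secondary_sort` are hypothetical names, the natural key is assumed to be the first word of each line, and the line's position in the input is assumed to be available to the mapper (with Hadoop's TextInputFormat the mapper's input key is the byte offset of the line, which could play this role within one file). The mapper emits a composite key (natural key, position); records are sorted by the whole composite key but grouped by the natural key only.

```python
from itertools import groupby


def map_phase(lines):
    """Emit ((natural_key, position), line) for every input line."""
    for position, line in enumerate(lines):
        natural_key = line.split()[0]  # assumption: first word is the key
        yield (natural_key, position), line


def shuffle_with_secondary_sort(pairs):
    """Sort by the full composite key, then group by the natural key only."""
    ordered = sorted(pairs, key=lambda kv: kv[0])
    for natural_key, group in groupby(ordered, key=lambda kv: kv[0][0]):
        # The values arrive in position order because position is part
        # of the sort key but not part of the grouping key.
        yield natural_key, [value for _, value in group]


lines = [
    "user1 login",
    "user2 login",
    "user1 click",
    "user2 logout",
    "user1 logout",
]

# Reverse the mapper output to mimic records arriving in arbitrary order.
pairs = list(map_phase(lines))
pairs.reverse()

result = dict(shuffle_with_secondary_sort(pairs))
for key, values in result.items():
    print(key, values)
```

In a real Hadoop job the same three roles are usually filled by a custom partitioner (partition on the natural key), a sort comparator (compare the full composite key), and a grouping comparator (compare the natural key only), set on the Job with setPartitionerClass, setSortComparatorClass and setGroupingComparatorClass.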

A quick Google search for this term turns up several good starting points. I like this blog post: link


Tags:

hadoop

mapreduce

nikosdi

1 Answers

Niels Basjes
