Why do map and reduce run at the same time?

Question

I am a newbie to Hadoop. I remember learning somewhere that in Hadoop, all map functions have to complete before any reduce function can start.

But when I run a MapReduce program, I get a printout like this:

map(15%), reduce(5%)
map(20%), reduce(7%)
map(30%), reduce(10%)
map(38%), reduce(17%)
map(40%), reduce(25%)

Why do they run in parallel?

Tariq · Accepted Answer

The reduce percentage in that printout does not mean your reduce function is already running. Before the actual reduce phase starts, the shuffle, sort, and merge steps take place as mappers finish, and the reported reduce progress includes that work. The framework does this in parallel with the map phase to avoid the overhead it would otherwise incur by waiting for every mapper to complete and only then starting to shuffle, sort, and merge.
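To make the numbers concrete, here is a minimal sketch of how a reduce percentage can climb while maps are still running. It is a simplified model, not Hadoop code: it assumes reduce progress is split into three equal parts (copy/shuffle, sort, reduce) and that reducers have already fetched the output of every finished map. The class and method names (`ReduceProgressSketch`, `reduceProgress`) are made up for illustration.

```java
public class ReduceProgressSketch {

    // Keep a fraction inside the range [0, 1].
    public static double clamp(double x) {
        return Math.max(0.0, Math.min(1.0, x));
    }

    // Assumed model: copy, sort, and reduce each contribute one third
    // of the reported reduce progress.
    public static double reduceProgress(double copied, double sorted, double reduced) {
        return (clamp(copied) + clamp(sorted) + clamp(reduced)) / 3.0;
    }

    public static void main(String[] args) {
        // Fractions of map tasks that have completed so far.
        double[] mapsDone = {0.15, 0.20, 0.30, 0.40, 1.00};
        for (double m : mapsDone) {
            // Assumption: everything the finished maps produced has been
            // copied, and no sorting or reduce() call has happened yet.
            double r = reduceProgress(m, 0.0, 0.0);
            System.out.printf("map(%.0f%%), reduce(%.0f%%)%n", m * 100, r * 100);
        }
    }
}
```

Under these assumptions the reduce figure can never exceed about 33% until the sort and reduce steps begin, so a nonzero reduce percentage next to an unfinished map percentage only shows that copying is under way. On a real cluster, the point at which reducers are launched and begin fetching is governed by the `mapreduce.job.reduce.slowstart.completedmaps` property (the Hadoop 2+ name), so the actual numbers will differ from this model.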

Tags:

hadoop

mapreduce

gywlily

1 Answers

Tariq
