Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What are some scenarios for which MPI is a better fit than MapReduce?

As far as I understand, MPI gives me much more control over how exactly different nodes in the cluster will communicate.

In MapReduce/Hadoop, each node does some computation, exchanges data with other nodes, and then collates its partition of results. Seems simple, but since you can iterate the process, even algorithms like K-means or PageRank fit the model quite well. On a distributed file system with locality of scheduling, the performance is apparently good. In comparison, MPI gives me explicit control over how nodes send messages to each other.

Can anyone describe a cluster programming scenario where the more general MPI model is an obvious advantage over the simpler MapReduce model?

like image 633
Igor ostrovsky Avatar asked Oct 07 '09 09:10

Igor ostrovsky


1 Answers

Almost any scientific code -- finite differences, finite elements, etc. Which kind of leads to the circular answer, that any distributed program which doesn't easily map to MapReduce would be better implemented with a more general MPI model. Not sure that's much help to you, I'll downvote this answer right after I post it.

like image 167
High Performance Mark Avatar answered Oct 04 '22 20:10

High Performance Mark