 

MPI vs GPU vs Hadoop, what are the major difference between these three parallelism?


I know that some machine learning algorithms, such as random forest, are by nature well suited to parallel implementation. While doing a homework assignment I found these three parallel programming frameworks, so I am interested in knowing what the major differences between these three types of parallelism are.

In particular, if someone can point me to a study comparing the differences between them, that would be perfect!

Please list the pros and cons of each, thanks.

asked Apr 19 '12 by user974270

People also ask

How is parallelism achieved in GPU?

Data parallelism means that each GPU uses the same model to train on a different data subset. In data-parallel training there is no synchronization between GPUs during the forward computation, because each GPU has a full copy of the model, including the network structure and parameters.

Does Hadoop use MPI?

Perhaps Hadoop does not use MPI because MPI usually requires coding in C or Fortran and has a more scientific/academic developer culture, while Hadoop seems to be driven more by IT professionals with a strong Java bias. MPI is very low-level and error-prone, but it allows very efficient use of hardware, RAM, and network.

Is MapReduce parallel processing?

MapReduce is an attractive model for parallel data processing in high-performance cluster computing environments. The scalability of MapReduce is proven to be high, because a job in the MapReduce model is partitioned into numerous small tasks running on multiple machines in a large-scale cluster.


1 Answer

  1. MPI is a message-passing paradigm of parallelism. Here, a root process spawns programs on all the machines in its MPI_COMM_WORLD. All the processes in the system are independent, so the only way they can communicate is through messages over the network. Network bandwidth and throughput are therefore among the most crucial factors in an MPI implementation's performance. Idea: if there is just one process per machine and that machine has many cores, you can use the OpenMP shared-memory paradigm to solve subsets of your problem on a single machine.
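To make the message-passing idea concrete, here is a minimal sketch using the mpi4py bindings (an assumption; the answer does not name a library). Each rank sums its own slice of the data, and a single reduce message combines the partial results at the root:

```python
# Sketch of the MPI pattern: independent processes, communication only via messages.
# The chunk() helper is ordinary Python; the MPI part assumes mpi4py is installed
# and that main() is run under `mpiexec -n 4 python script.py`.

def chunk(data, nprocs, rank):
    """Return the slice of `data` that process `rank` out of `nprocs` handles."""
    n = len(data)
    start = rank * n // nprocs
    stop = (rank + 1) * n // nprocs
    return data[start:stop]

def main():
    from mpi4py import MPI  # imported here so chunk() is usable without MPI
    comm = MPI.COMM_WORLD
    rank, nprocs = comm.Get_rank(), comm.Get_size()

    data = list(range(1000))                 # every rank builds the same input
    local = sum(chunk(data, nprocs, rank))   # ...but sums only its own slice
    total = comm.reduce(local, op=MPI.SUM, root=0)  # the message-passing step
    if rank == 0:
        print("total =", total)

# call main() when launched under mpiexec
```

Note that every rank runs the same program; only `rank` differs, which is how the work is divided without any shared memory.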

  2. CUDA is a SIMT (single instruction, multiple threads) paradigm of parallelism. It uses modern GPU architecture to provide parallelism. A GPU contains blocks of sets of cores working on the same instruction in lock-step fashion (similar to the SIMD model). Hence, if all the threads in your system do a lot of the same work, you can use CUDA. However, the amount of shared memory and global memory on a GPU is limited, so you should not use just one GPU for solving a huge problem.
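To illustrate the lock-step model, here is a plain-Python simulation (not real CUDA) of what a vector-add kernel does: every "thread" executes the same instructions and differs only in its index, which in real CUDA would come from `blockIdx` and `threadIdx`:

```python
# Plain-Python sketch of the SIMT idea behind a CUDA vector-add kernel.
# This is a simulation only; real CUDA kernels are written in CUDA C/C++.

def vector_add_kernel(a, b, c, i):
    """Kernel body: same code for every thread, parameterized only by index i."""
    if i < len(c):          # bounds check, as in a real kernel
        c[i] = a[i] + b[i]

def launch(kernel, n_threads, *args):
    """Simulate launching n_threads copies of the same kernel."""
    for i in range(n_threads):  # on a GPU these run in parallel, in lock-step warps
        kernel(*args, i)

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
c = [0] * 4
launch(vector_add_kernel, 4, a, b, c)   # c becomes [11, 22, 33, 44]
```

If the threads took different branches instead of doing the same work, a real GPU would serialize the divergent paths, which is why CUDA pays off mainly for uniform workloads.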

  3. Hadoop is used for solving large problems on commodity hardware using the MapReduce paradigm. Hence, you do not have to worry about distributing data or managing corner cases. Hadoop also provides a distributed file system, HDFS, for storing data on the compute nodes.
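The MapReduce structure Hadoop executes can be sketched in a few lines. This is the classic word-count example written as plain Python functions (a sketch only: with Hadoop, the map and reduce steps would be separate tasks on different machines, and the framework would perform the shuffle and handle the data distribution):

```python
# Word count in the MapReduce style: map emits (key, value) pairs,
# a shuffle groups them by key, and reduce aggregates each group.
from collections import defaultdict

def map_phase(lines):
    """Map: emit (word, 1) for every word, as a Hadoop mapper would."""
    for line in lines:
        for word in line.split():
            yield word, 1

def shuffle(pairs):
    """Group values by key; Hadoop does this between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

counts = reduce_phase(shuffle(map_phase(["a b a", "b c"])))
# counts == {"a": 2, "b": 2, "c": 1}
```

Because each phase works on independent keys, Hadoop can run thousands of such tasks across a cluster without the programmer writing any communication code.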




Hadoop, MPI and CUDA are largely orthogonal to each other, so it may not be fair to compare them directly.

That said, you can always use (CUDA + MPI) to solve a problem on a cluster of GPUs. You still need a CPU core on each node to perform the communication part of the problem.
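The hybrid pattern usually maps one MPI rank to one GPU. The sketch below simulates that structure in plain Python (the GPU call is a placeholder; real code would select a device per rank and launch a kernel, with MPI_Reduce combining the partial results):

```python
# Hedged sketch of the CUDA + MPI structure: one rank per GPU.
# gpu_compute stands in for a CUDA kernel; the loop stands in for
# the ranks that would actually run concurrently under MPI.

def gpu_compute(chunk):
    """Placeholder for work that would run as a CUDA kernel on this rank's GPU."""
    return sum(x * x for x in chunk)

def hybrid_job(data, nprocs):
    """Split data across ranks, compute each slice 'on a GPU',
    then combine the partials (an MPI_Reduce in a real program)."""
    partials = []
    for rank in range(nprocs):                       # concurrent in real MPI
        start = rank * len(data) // nprocs
        stop = (rank + 1) * len(data) // nprocs
        partials.append(gpu_compute(data[start:stop]))  # GPU does the math
    return sum(partials)                             # CPU/MPI does the communication
```

The division of labor matches the answer's point: the GPUs do the uniform number-crunching, while a CPU core on each node handles the message passing.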

answered Dec 17 '22 by prathmesh.kallurkar