I have to implement an MPI system on a cluster. If anyone here has experience with MPI (MPICH/OpenMPI), I'd like to know which is better and how performance can be boosted on a cluster of x86_64 boxes.
OpenMPI comes out of the box on MacBooks, and MPICH tends to be more Linux/Valgrind friendly, so the choice really comes down to you and your toolchain. If it is a production cluster, you need to do more extensive benchmarking to make sure it is optimized for your network topology.
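If you want a quick sanity check of the interconnect before any serious benchmarking, a simple ping-pong between two ranks already tells you a lot about point-to-point latency. Below is a minimal sketch; the 1 KiB message size and iteration count are arbitrary choices, and the same source compiles with mpicc against either MPICH or OpenMPI and runs with mpiexec -n 2.

    /* Minimal MPI ping-pong sketch for a rough latency check. */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        char buf[1024];            /* arbitrary 1 KiB message */
        const int iters = 10000;   /* arbitrary iteration count */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size < 2) {
            if (rank == 0) fprintf(stderr, "run with at least 2 ranks\n");
            MPI_Finalize();
            return 1;
        }

        memset(buf, 0, sizeof(buf));
        MPI_Barrier(MPI_COMM_WORLD);   /* line the ranks up before timing */

        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("average round trip: %g us\n", (t1 - t0) / iters * 1e6);

        MPI_Finalize();
        return 0;
    }

Running it with the two ranks pinned to different nodes gives you a rough feel for what your network adds on top of shared-memory latency.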
MPICH, formerly known as MPICH2, is a freely available, portable implementation of MPI, a standard for message-passing for distributed-memory applications used in parallel computing.
It is used by many TOP500 supercomputers, including Roadrunner, which was the world's fastest supercomputer from June 2008 to November 2009, and the K computer, the fastest from June 2011 to June 2012.
On distributed-memory parallel systems, like Linux clusters, the Message Passing Interface (MPI) is widely used. MPI is not a programming language but a standardized library interface for sending messages between processes; MPICH and OpenMPI are two implementations of that standard.
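To make that concrete, here is about the smallest useful MPI program: every process runs the same executable, each gets a rank, and rank 0 sends a single integer to rank 1 (the value 42 is just an arbitrary example). It compiles with mpicc and runs with mpiexec -n 2 under either implementation.

    /* Minimal illustration of the MPI model: same program on every
       process, explicit messages between ranks. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

        if (rank == 0 && size > 1) {
            int value = 42;   /* arbitrary payload */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            printf("rank 0 of %d sent %d\n", size, value);
        } else if (rank == 1) {
            int value;
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }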
MPICH has been around a lot longer. It's extremely portable and you'll find years' worth of tips and tricks online. It's a safe bet and it's probably compatible with more MPI programs out there.
OpenMPI is newer. While it's not quite as portable, it supports the most common platforms really well. Most people seem to think it's a lot better in several regards, especially for fault-tolerance - but to take advantage of this you may have to use some of its special features that aren't part of the MPI standard.
As for performance, it depends a lot on the application; it's hard to give general advice. You should post a specific question about the type of calculation you want to run, the number of nodes, and the type of hardware - including what type of network hardware you're using.
I've written quite a few parallel applications for both Windows and Linux clusters, and I can tell you that right now MPICH2 is probably the safer choice. It is, as the other answer mentions, a very mature library. Broadcast support (via MPI_Bcast) is solid, and MPICH2 has quite a few other nice features, such as scatter and gather (MPI_Scatter/MPI_Gather).
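To illustrate those collectives, here is a rough sketch (array sizes and values are arbitrary) that broadcasts a parameter from rank 0, scatters equal-sized chunks of an array owned by rank 0, does some local work, and gathers the results back. These calls are part of the MPI standard, so the same code builds against MPICH2 or OpenMPI.

    /* Sketch of broadcast + scatter/gather over an array owned by rank 0. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    enum { CHUNK = 4 };   /* arbitrary number of elements per rank */

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Broadcast a shared parameter from rank 0 to everyone. */
        int param = (rank == 0) ? 7 : 0;   /* arbitrary value */
        MPI_Bcast(&param, 1, MPI_INT, 0, MPI_COMM_WORLD);

        /* Rank 0 owns the full array; every rank receives one chunk. */
        int *all = NULL;
        if (rank == 0) {
            all = malloc((size_t)size * CHUNK * sizeof(int));
            for (int i = 0; i < size * CHUNK; i++)
                all[i] = i;
        }

        int local[CHUNK];
        MPI_Scatter(all, CHUNK, MPI_INT, local, CHUNK, MPI_INT, 0, MPI_COMM_WORLD);

        /* Some local work on the chunk. */
        for (int i = 0; i < CHUNK; i++)
            local[i] += param;

        /* Collect the processed chunks back on rank 0. */
        MPI_Gather(local, CHUNK, MPI_INT, all, CHUNK, MPI_INT, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            printf("first gathered element: %d\n", all[0]);
            free(all);
        }

        MPI_Finalize();
        return 0;
    }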
OpenMPI is gaining some ground though. Penguin Computing (they're a big cluster vendor, and they like Linux) actually has some really strong benchmarks where OpenMPI beats MPICH2 hands down in certain circumstances.
Regarding your comment about "boosting performance", the best piece of advice I can give is to never send more data than absolutely necessary if you're I/O bound, and never do more work than necessary if you're CPU bound. I've fallen into the trap of optimizing the wrong piece of code more than once :) Hopefully you won't follow in my footsteps!
Check out the MPI forums - they have a lot of good info about MPI routines - and the Beowulf site has answers to a lot of interesting questions.