
What is the best MPI implementation [closed]

I have to set up an MPI system on a cluster. If anyone here has experience with MPI (MPICH/OpenMPI), I'd like to know which implementation is better and how performance can be boosted on a cluster of x86_64 boxes.

prasanna asked Sep 27 '08 19:09




2 Answers

MPICH has been around a lot longer. It's extremely portable and you'll find years' worth of tips and tricks online. It's a safe bet and it's probably compatible with more MPI programs out there.

OpenMPI is newer. While it's not quite as portable, it supports the most common platforms really well. Most people seem to think it's a lot better in several regards, especially fault tolerance - but to take advantage of this you may have to use some of its special features that aren't part of the MPI standard.

As for performance, it depends a lot on the application; it's hard to give general advice. You should post a specific question about the type of calculation you want to run, the number of nodes, and the type of hardware - including what type of network hardware you're using.

dmazzoni answered Sep 17 '22 16:09


I've written quite a few parallel applications for both Windows and Linux clusters, and my advice is that right now MPICH2 is probably the safer choice. It is, as the other responder mentions, a very mature library. It also has ample support for collective operations now: broadcasting (via MPI_Bcast) as well as scatter and gather (MPI_Scatter, MPI_Gather).
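For readers who haven't met these collectives, here is a rough plain-Python sketch of the data movement behind broadcast, scatter, and gather. It calls no MPI library at all: a list index stands in for a rank, and the function names `bcast`, `scatter`, and `gather` are made up for illustration (the real routines are MPI_Bcast, MPI_Scatter, and MPI_Gather, with different signatures).

```python
# Plain-Python model of three MPI collectives; list index = rank.
# Nothing here calls an MPI library; all names are hypothetical.

def bcast(value, n_ranks):
    """Every rank ends up with the root's value (the effect of MPI_Bcast)."""
    return [value for _ in range(n_ranks)]

def scatter(data, n_ranks):
    """Root splits data into equal chunks, one per rank (the effect of MPI_Scatter)."""
    if len(data) % n_ranks != 0:
        raise ValueError("data length must divide evenly among ranks")
    chunk = len(data) // n_ranks
    return [data[r * chunk:(r + 1) * chunk] for r in range(n_ranks)]

def gather(per_rank_values):
    """Root collects one value per rank, ordered by rank (the effect of MPI_Gather)."""
    return list(per_rank_values)

n_ranks = 4
scale = bcast(10, n_ranks)                 # same parameter on every rank
chunks = scatter(list(range(8)), n_ranks)  # two items per rank
partial = [scale[r] * sum(chunks[r]) for r in range(n_ranks)]  # local work
print(gather(partial))                     # prints [10, 50, 90, 130]
```

In a real MPI program each rank is a separate process and the loop over `r` disappears: every process runs the same code and the library moves the data.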

OpenMPI is gaining some ground though. Penguin Computing (they're a big cluster vendor, and they like Linux) actually has some really strong benchmarks where OpenMPI beats MPICH2 hands down in certain circumstances.

Regarding your comment about "boosting performance", the best piece of advice I can give is to never send more data than absolutely necessary if you're I/O bound, and never do more work than necessary if you're CPU bound. I've fallen into the trap of optimizing the wrong piece of code more than once :) Hopefully you won't follow in my footsteps!
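To make the "find out which one you're bound by first" point concrete, here is a back-of-the-envelope sketch using a simple latency-plus-bandwidth estimate for message cost. Every number in it (50 microsecond latency, roughly 1 Gbit/s bandwidth, 10 ms of compute per step) is an assumption picked for illustration, not a measurement, and the helper names are hypothetical. Measure your own network and code before trusting any of it.

```python
def transfer_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time for one message: fixed latency plus size over bandwidth."""
    return latency_s + n_bytes / bandwidth_bytes_per_s

def bottleneck(compute_s, comm_s):
    """Name whichever phase dominates one step of the computation."""
    return "communication" if comm_s > compute_s else "compute"

# Assumed figures, for illustration only.
LATENCY_S = 50e-6    # per-message latency
BANDWIDTH = 125e6    # bytes per second, roughly 1 Gbit/s
COMPUTE_S = 0.01     # local work per step

# Sending a whole array of 1,000,000 doubles every step...
full = transfer_time(8 * 1_000_000, LATENCY_S, BANDWIDTH)
# ...versus sending only the 1,000 values the other node actually needs.
needed = transfer_time(8 * 1_000, LATENCY_S, BANDWIDTH)

print(bottleneck(COMPUTE_S, full))    # prints communication
print(bottleneck(COMPUTE_S, needed))  # prints compute
```

With these made-up numbers, trimming the message moves the bottleneck from the network to the CPU, which is the point at which tuning the computation starts to pay off.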

Check out the MPI forums - they have a lot of good info about MPI routines, and the Beowulf site has a lot of interesting questions answered.

Mike answered Sep 19 '22 16:09