
How do I debug an MPI program?

Tags: debugging, mpi

I have an MPI program which compiles and runs, but I would like to step through it to make sure nothing bizarre is happening. Ideally, I would like a simple way to attach GDB to any particular process, but I'm not really sure whether that's possible or how to do it. An alternative would be having each process write debug output to a separate log file, but this doesn't really give the same freedom as a debugger.

Are there better approaches? How do you debug MPI programs?

Jay Conrod asked Nov 30 '08 20:11




2 Answers

I have found gdb quite useful. I launch it like this:

mpirun -np <NP> xterm -e gdb ./program  

This launches one xterm window per process, each running gdb (so it needs a working X display). In each window I can then type

run <arg1> <arg2> ... <argN> 

This usually works fine.

You can also combine these steps into a single command:

mpirun -n <NP> xterm -hold -e gdb -ex run --args ./program [arg1] [arg2] [...] 
messenjah answered Sep 23 '22 20:09


As someone else said, TotalView is the standard for this. But it will cost you an arm and a leg.

The Open MPI site has a great FAQ on MPI debugging. Item #6 in the FAQ describes how to attach GDB to MPI processes. Read the whole thing; there are some great tips.
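A common way to make attaching practical is to have each process announce its PID and then spin until a debugger releases it. The following is only a minimal sketch of that idea, not a quote from the FAQ. The function name wait_for_debugger and the opt-in environment variable WAIT_FOR_GDB are made up for this example, and it uses only POSIX calls so it does not depend on any particular MPI implementation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: call it right after MPI_Init().
 * Returns 0 immediately unless WAIT_FOR_GDB (an assumed, made-up
 * variable name) is set; otherwise prints the PID and host, then
 * blocks until a debugger sets `attached` to a nonzero value.
 */
int wait_for_debugger(void)
{
    if (getenv("WAIT_FOR_GDB") == NULL)
        return 0;

    /* volatile so the loop re-reads the value after gdb changes it */
    volatile int attached = 0;
    char hostname[256];

    if (gethostname(hostname, sizeof hostname) != 0)
        snprintf(hostname, sizeof hostname, "unknown-host");
    hostname[sizeof hostname - 1] = '\0';

    printf("PID %d on %s ready for attach\n", (int)getpid(), hostname);
    fflush(stdout);

    while (!attached)
        sleep(5);

    return 1;
}
```

With the variable set, each process prints its PID and waits. You can then log in to the right node, run gdb -p <PID>, move up the stack until you are in wait_for_debugger, type set var attached = 1, and continue or step from there.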

If you find that you have far too many processes to keep track of, though, check out Stack Trace Analysis Tool (STAT). We use this at Livermore to collect stack traces from potentially hundreds of thousands of running processes and to represent them intelligently to users. It's not a full-featured debugger (a full-featured debugger would never scale to 208k cores), but it will tell you which groups of processes are doing the same thing. You can then step through a representative from each group in a standard debugger.
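To make the grouping idea concrete, here is a toy sketch in C. It is not STAT's implementation or API; it only illustrates putting ranks whose stack traces are identical into the same group. The function name group_by_trace and the representation of a trace as a single string are assumptions made for this example.

```c
#include <string.h>

/*
 * Toy illustration only, not how STAT works internally.
 * traces[r] is a (hypothetical) string form of rank r's stack trace.
 * On return, group_of[r] holds the lowest rank whose trace is identical
 * to rank r's, so ranks with equal group_of values are "doing the same
 * thing". Returns the number of distinct groups.
 * Quadratic in nranks; fine for a sketch, not for 208k cores.
 */
int group_by_trace(const char *const traces[], int nranks, int group_of[])
{
    int ngroups = 0;

    for (int r = 0; r < nranks; r++) {
        group_of[r] = r;                      /* assume a new group */
        for (int p = 0; p < r; p++) {
            if (strcmp(traces[p], traces[r]) == 0) {
                group_of[r] = group_of[p];    /* join the earlier group */
                break;
            }
        }
        if (group_of[r] == r)
            ngroups++;
    }
    return ngroups;
}
```

Each rank r with group_of[r] == r is a representative of its group, which is the process you would then open in a standard debugger.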

Todd Gamblin answered Sep 21 '22 20:09