How are MPI processes started?

Tags: mpi

When starting an MPI job with mpirun or mpiexec, I can understand how one might go about starting each individual process. However, without any compiler magic, how do these wrapper executables communicate the arrangement (MPI communicator) to the MPI processes?

I am interested in the details, or a pointer on where to look.

asked Jun 06 '12 10:06

Bojan B

1 Answer

Details on how individual processes establish the MPI universe are implementation specific. You should look into the source code of the specific library in order to understand how it works. There are two almost universal approaches though:

command line arguments: the MPI launcher can pass arguments to the spawned processes indicating how and where to connect in order to establish the universe. That is why MPI_Init() in C takes pointers to argc and argv: the library can inspect the command line and extract the arguments that are meant for it (since MPI-2 it is also legal to pass NULL for both);
environment variables: the MPI launcher can set specific environment variables whose content indicates where and how to connect.
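To make the two approaches concrete, here is a minimal toy launcher in Python. It is only a sketch of the general idea, not how any real MPI implementation does it: the --toy-rank argument and the TOY_WORLD_SIZE variable are names I made up for illustration. Each child learns its rank from its command line and the total number of processes from its environment.

```python
import os
import subprocess
import sys

# Child program (hypothetical): takes its rank from a command line argument
# and the world size from an environment variable set by the launcher.
CHILD = (
    "import os, sys\n"
    "rank = int(sys.argv[sys.argv.index('--toy-rank') + 1])\n"
    "size = int(os.environ['TOY_WORLD_SIZE'])\n"
    "print(f'rank {rank} of {size}')\n"
)


def launch(nprocs):
    """Start nprocs children and return what each one printed, in rank order."""
    procs = []
    for rank in range(nprocs):
        # Environment variable approach: every child sees the same world size.
        env = dict(os.environ, TOY_WORLD_SIZE=str(nprocs))
        # Command line approach: every child gets its own rank as an argument.
        cmd = [sys.executable, "-c", CHILD, "--toy-rank", str(rank)]
        procs.append(
            subprocess.Popen(cmd, env=env, stdout=subprocess.PIPE, text=True)
        )
    return [p.communicate()[0].strip() for p in procs]


if __name__ == "__main__":
    for line in launch(3):
        print(line)
```

A real launcher passes considerably more than a rank and a size (typically contact information for the run-time, as in the Open MPI example), but the hand-off mechanism is the same: nothing is compiled into the executable, everything arrives at start-up.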

Open MPI, for example, sets environment variables and also writes some universe state to a disk location known to all processes that run on the same node. You can easily see the special variables that its run-time component ORTE (Open Run-Time Environment) uses by executing a command like mpiexec -np 1 printenv:

$ mpiexec -np 1 printenv | grep OMPI
... <many more> ...
OMPI_MCA_orte_hnp_uri=1660944384.0;tcp://x.y.z.t:43276;tcp://p.q.r.f:43276
OMPI_MCA_orte_local_daemon_uri=1660944384.1;tcp://x.y.z.t:36541
... <many more> ...

(IPs changed for security reasons)

Once a child process is launched remotely and calls MPI_Init() or MPI_Init_thread(), ORTE kicks in and reads those environment variables. It then connects to the network address given there, which belongs to the "home" mpirun/mpiexec process, and that process coordinates all spawned processes into establishing the MPI universe.
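The "connect back to the home process" step can be sketched with plain TCP sockets. Again, this is a toy under stated assumptions and not ORTE's actual wire protocol: TOY_HOME_URI and TOY_RANK are hypothetical variable names, and the "universe" here is nothing more than every child being told the world size once all of them have checked in.

```python
import os
import socket
import subprocess
import sys

# Child program (hypothetical): reads the home address from the environment,
# connects back, announces its rank, and waits for the home process to reply.
CHILD = r"""
import os, socket
host, port = os.environ["TOY_HOME_URI"].rsplit(":", 1)
rank = os.environ["TOY_RANK"]
with socket.create_connection((host, int(port))) as s:
    s.sendall(f"hello from rank {rank}\n".encode())
    print(s.makefile().readline().strip())
"""


def run_job(nprocs):
    """Act as the home process: spawn children, wait for all to check in."""
    with socket.socket() as server:
        server.bind(("127.0.0.1", 0))  # let the OS pick a free port
        server.listen(nprocs)
        server.settimeout(10)          # do not hang forever if a child fails
        host, port = server.getsockname()

        children = []
        for rank in range(nprocs):
            env = dict(os.environ,
                       TOY_HOME_URI=f"{host}:{port}",
                       TOY_RANK=str(rank))
            children.append(subprocess.Popen(
                [sys.executable, "-c", CHILD],
                env=env, stdout=subprocess.PIPE, text=True))

        # Wait until every child has connected back and introduced itself.
        conns, greetings = [], []
        for _ in range(nprocs):
            conn, _addr = server.accept()
            greetings.append(conn.makefile().readline().strip())
            conns.append(conn)

        # Only now does everyone learn the final shape of the job.
        for conn in conns:
            conn.sendall(f"world size is {nprocs}\n".encode())
            conn.close()

    replies = [c.communicate()[0].strip() for c in children]
    return sorted(greetings), replies


if __name__ == "__main__":
    print(run_job(3))
```

The point of the design is that the children need to know only one address at start-up; everything else about the job is learned from the home process after they have connected.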

Other MPI implementations work in a similar fashion.

answered Sep 22 '22 02:09

Hristo Iliev
