 

Specify the machines running program using MPI

Tags:

mpi

I am going to do some parallel computing and I'm a complete beginner in this area. I will use MPI with a master-slave model. I have four machines and want one of them to be the master node. However, I don't know how to specify the other machines that should run the program. Is there a way to do this, such as specifying the IP addresses of the slave nodes? And how do I launch my program? I'm using Ubuntu 12.10.

Yan Li asked Apr 09 '13 13:04





1 Answer

Setup

Make sure every node has the same directory and file structure. For example, the executable should be at /home/yan/my_program on every computer. One way to achieve this is to mount the same directory on every computer via NFS.

Set up SSH so that you can log in to every slave node from the master node like this:

yan@master:~/$ ssh slave1
yan@slave1:~/$

This means that the user yan has to exist on every computer. If you set up key-based SSH login, you don't have to enter a password. With password login, you have to enter it when starting the program.

Install OpenMPI using

sudo apt-get install openmpi-bin openmpi-doc libopenmpi-dev

You can install another MPI implementation, such as MPICH, instead.

Run program

Now, compile your program with mpicc myprogram.c -o myprogram (if you are using C; for C++, mpic++, etc.) and run it using

yan@masternode:~/$ mpirun -n 4 -H master,slave1,slave2,slave3 myprogram

Instead of machine names, you can also use IP addresses. -n specifies the number of processes. If you omit the option, one process will be started on each machine. You can also start several processes per machine by listing it more than once:

yan@masternode:~/$ mpirun -n 8 -H master,slave1,slave2,slave3,\
master,slave1,slave2,slave3 myprogram

Alternatively, you can write one computer name per line into a HOSTFILE and specify it like this:

yan@masternode:~/$ mpirun -hostfile HOSTFILE myprogram
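
A hostfile can also carry a slot count per host, which replaces listing a machine several times on the command line. The following is a minimal sketch, assuming Open MPI's slots=N hostfile syntax and the four hostnames used above; two slots per machine is an arbitrary choice for illustration.

```shell
# Write an Open MPI hostfile: one host per line, optional slots=N per host.
# Hostnames are the ones used in this answer; slots=2 is an assumed value.
cat > HOSTFILE <<'EOF'
# the master also runs processes in this example
master slots=2
slave1 slots=2
slave2 slots=2
slave3 slots=2
EOF

# Sum the slots (a host without slots=N counts as 1), skipping comments
# and blank lines, to get the number of processes the file provides for.
total=$(awk '!/^#/ && NF {
    n = 1
    for (i = 2; i <= NF; i++)
        if ($i ~ /^slots=/) { split($i, kv, "="); n = kv[2] }
    sum += n
} END { print sum }' HOSTFILE)

# Print the matching launch command instead of running it.
echo "mpirun -n $total -hostfile HOSTFILE myprogram"
```

The script only prints the mpirun command, so it can be tried on a machine without a cluster. Running the printed command on the master node should then start as many processes as the file has slots.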

These commands automatically connect to the slave computers via SSH, start the program, and set the communication parameters so that data distribution works automatically. Inside the program, MPI_Comm_rank returns the rank (index) of the current process and MPI_Comm_size returns the total number of processes.
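
Before compiling any MPI code, the launch itself can be checked with a small shell script. This sketch assumes Open MPI, which exports OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_SIZE to every process it starts; other implementations such as MPICH use different variable names. The script name whoami.sh in the note below is hypothetical. A C program would obtain the same two values from MPI_Comm_rank and MPI_Comm_size.

```shell
#!/bin/sh
# Rank and size as exported by Open MPI's mpirun. The defaults apply when
# the script is run directly, without mpirun.
rank=${OMPI_COMM_WORLD_RANK:-0}
size=${OMPI_COMM_WORLD_SIZE:-1}

# Master-slave split: rank 0 acts as the master, all other ranks as slaves.
if [ "$rank" -eq 0 ]; then
    role=master
else
    role=slave
fi

echo "process $rank of $size on $(uname -n): $role"
```

Saved as an executable file, for example whoami.sh, at the same path on every node, it could be launched with mpirun -n 4 -H master,slave1,slave2,slave3 ./whoami.sh. Each process should then report a different rank, with exactly one of them taking the master role.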

All of these options are documented in man mpirun.

zonksoft answered Sep 25 '22 20:09
