
OpenMPI MPI_Barrier problems

Tags: c, mpi, openmpi

I am having some synchronization issues using the OpenMPI implementation of MPI_Barrier:

int rank;
int nprocs;

int rc = MPI_Init(&argc, &argv);

if(rc != MPI_SUCCESS) {
    fprintf(stderr, "Unable to set up MPI");
    MPI_Abort(MPI_COMM_WORLD, rc);
}

MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);


printf("P%d\n", rank);
fflush(stdout);

MPI_Barrier(MPI_COMM_WORLD);

printf("P%d again\n", rank);

MPI_Finalize();

Running with mpirun -n 2 ./a.out,

the output should be: P0 P1 ...

but the output is sometimes: P0 P0 again P1 P1 again

What's going on?

hola asked Mar 03 '11 14:03


3 Answers

The order in which your printed lines appear on your terminal is not necessarily the order in which they were printed. All processes share one resource (stdout), so an ordering problem is always possible. (And fflush doesn't help here: it only flushes each process's own buffer, and it does not control the order in which output from different processes reaches the terminal.)

You could try to prefix your output with a timestamp and save all of this to different files, one per MPI process.

Then to inspect your log you could merge the two files together and sort according to the timestamp.

Your problem should disappear, then.

Jens Gustedt answered Oct 20 '22 01:10


Output ordering is not guaranteed in MPI programs.

This is not related to MPI_Barrier at all.

Also, I would not spend too much time worrying about output ordering in MPI programs.

The most elegant way to achieve this, if you really want to, is to let the processes send their messages to one rank, say, rank 0, and let rank 0 print the output in the order it received them or ordered by ranks.

Again, don't spend too much time trying to order the output from MPI programs. It is not practical and is of little use.

powerrox answered Oct 20 '22 01:10


There is nothing wrong with MPI_Barrier().

As Jens mentioned, you are not seeing the output you expected because stdout is buffered on each process. There is no guarantee that prints from multiple processes will be displayed on the calling process in order. (If stdout from each process were transferred to the main process for printing in real time, that would lead to lots of unnecessary communication!)

If you want to convince yourself that the barrier works, you could try writing to a file instead. Having multiple processes write to a single file may lead to extra complications, so you could have each process write to its own file, then after the barrier, swap the files they write to. For example:

    Proc-0           Proc-1
      |                 |
 f0.write(..)     f1.write(...) 
      |                 |
      x  ~~ barrier ~~  x
      |                 |
 f1.write(..)     f0.write(...) 
      |                 |
     END               END

Sample implementation:

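#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    char filename[20];
    int rank, size;
    FILE *fp;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank < 2) { /* proc 0 and 1 only: each creates its own file */
        snprintf(filename, sizeof filename, "file_%d.out", rank);
        fp = fopen(filename, "w");
        if (fp == NULL) { /* give up on the whole job if the file cannot be opened */
            perror(filename);
            MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
        }
        fprintf(fp, "P%d: before Barrier\n", rank);
        fclose(fp);
    }

    /* no process continues until every process has reached this point */
    MPI_Barrier(MPI_COMM_WORLD);

    if (rank < 2) { /* proc 0 and 1 only: append to the other proc's file */
        snprintf(filename, sizeof filename, "file_%d.out", (rank == 0) ? 1 : 0);
        fp = fopen(filename, "a");
        if (fp == NULL) {
            perror(filename);
            MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
        }
        fprintf(fp, "P%d: after Barrier\n", rank);
        fclose(fp);
    }

    MPI_Finalize();
    return 0;
}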

After running the code with at least two processes (for example, mpirun -n 2 ./a.out), you should get the following results:

[me@home]$ cat file_0.out
P0: before Barrier
P1: after Barrier

[me@home]$ cat file_1.out
P1: before Barrier
P0: after Barrier

In every file, the "after Barrier" line will always appear after the "before Barrier" line.

Shawn Chin answered Oct 20 '22 00:10