MPI_ERR_TRUNCATE: On Broadcast

I have an int that I intend to broadcast from the root (rank == FIELD, where FIELD is 0).

int winner;

if (rank == FIELD) {
    winner = something;
}

MPI_Barrier(MPI_COMM_WORLD);
MPI_Bcast(&winner, 1, MPI_INT, FIELD, MPI_COMM_WORLD);
MPI_Barrier(MPI_COMM_WORLD);
if (rank != FIELD) {
    cout << rank << " informed that winner is " << winner << endl;
}

But it appears I get

[JM:6892] *** An error occurred in MPI_Bcast
[JM:6892] *** on communicator MPI_COMM_WORLD
[JM:6892] *** MPI_ERR_TRUNCATE: message truncated
[JM:6892] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort

I found that I can get past the error by increasing the count passed to MPI_Bcast:

MPI_Bcast(&winner, NUMPROCS, MPI_INT, FIELD, MPI_COMM_WORLD);

where NUMPROCS is the number of running processes (actually, it seems I only need it to be 2). Then it runs, but gives unexpected output:

1 informed that winner is 103
2 informed that winner is 103
3 informed that winner is 103
5 informed that winner is 103
4 informed that winner is 103

When I print winner with cout, it should be -1.

Jiew Meng asked Nov 08 '12 14:11


1 Answer

There is an error early in your code:

if (rank == FIELD) {
   // randomly place ball, then broadcast to players
   ballPos[0] = rand() % 128;
   ballPos[1] = rand() % 64;
   MPI_Bcast(ballPos, 2, MPI_INT, FIELD, MPI_COMM_WORLD);
}

This is a very common mistake. MPI_Bcast is a collective operation, so it must be called by all processes in the communicator in order to complete. In your case this broadcast is called only by the root, not by all processes in MPI_COMM_WORLD, and hence it interferes with the next broadcast operation, namely the one inside the loop. The second broadcast actually receives the message sent by the first one (two int elements) into a buffer for just one int, hence the truncation error message.

In Open MPI each broadcast internally uses the same message tag values, so different broadcasts can interfere with each other if not issued in sequence. This is compliant with the (old) MPI standard: one cannot have more than one outstanding collective operation in MPI-2.2 (in MPI-3.0 one can have several outstanding non-blocking collective operations). You should rewrite the code as:

if (rank == FIELD) {
   // randomly place ball, then broadcast to players
   ballPos[0] = rand() % 128;
   ballPos[1] = rand() % 64;
}
MPI_Bcast(ballPos, 2, MPI_INT, FIELD, MPI_COMM_WORLD);

Hristo Iliev answered Oct 02 '22 23:10