 

MPI Spawn: root process does not communicate to child processes

Tags: c, mpi

(Beginner question) I'm trying to spawn processes dynamically using MPI_Comm_spawn and then broadcast a message to the child processes, but the program hangs in the broadcast from the root process to the children. I'm following the documentation from http://www.mpi-forum.org/docs/docs.html but I can't make it work. Can anybody help me, please?

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    MPI_Comm parentcomm;

    MPI_Comm_get_parent( &parentcomm );

    if (parentcomm == MPI_COMM_NULL) {
        MPI_Comm intercomm;
        MPI_Status status;
        char msg_rec[1024];
        char msg_send[1024];
        int size, i;

        int np = (argc > 0) ? atoi(argv[1]) : 3;

        printf("Spawner will spawn %d processes\n", np);
        MPI_Comm_spawn( argv[0], MPI_ARGV_NULL, np, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE );
        MPI_Comm_size(intercomm, &size);

        sprintf(msg_send, "Hello!");
        printf("Spawner will broadcast '%s'\n", msg_send);
        MPI_Bcast( (void*)msg_send, 1024, MPI_CHAR, 0, intercomm);

        printf("Spawner will receive answers\n");
        for (i=0; i < size; i++) {
            MPI_Recv( (void*)msg_rec, 1024, MPI_CHAR, i, MPI_ANY_TAG, intercomm, &status);
            printf("Spawner received '%s' from rank %d\n", msg_rec, i);
        };       

    } else {
        int rank, size;
        char msg_rec[1024];
        char msg_send[1024];

        MPI_Comm_rank(parentcomm, &rank);
        MPI_Comm_size(parentcomm, &size);

        printf("  Rank %d ready\n", rank);

        MPI_Bcast( (void*)msg_rec, 1024, MPI_CHAR, 0, parentcomm);

        printf("  Rank %d received '%s' from broadcast!\n", rank, msg_rec);
        sprintf(msg_send, "Hi there from rank %d!\n", rank);
        MPI_Send( (void*)msg_send, 1024, MPI_CHAR, 0, rank, parentcomm);
    };
    MPI_Finalize();
    return 0;
};

I don't know if it matters, but I'm using Ubuntu 11.10 and the Hydra process manager.

Flamínio Maranhão asked Apr 02 '12 03:04



2 Answers

As @suszterpatt pointed out, you are working with an "intercommunicator" (not an "intracommunicator"). Knowing this and looking at the documentation for MPI_Bcast, we see:

If comm is an intercommunicator, then the call involves all processes in the intercommunicator, but with one group (group A) defining the root process. All processes in the other group (group B) pass the same value in argument root, which is the rank of the root in group A. The root passes the value MPI_ROOT in root. All other processes in group A pass the value MPI_PROC_NULL in root. Data is broadcast from the root to all processes in group B. The receive buffer arguments of the processes in group B must be consistent with the send buffer argument of the root.

This means that you need only replace the broadcast call in the parent with:

MPI_Bcast( (void*)msg_send, 1024, MPI_CHAR, MPI_ROOT, intercomm);

A few other bugs:

  • The check on the number of arguments should be argc > 1.
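  • MPI_Comm_size(intercomm, &size) will set size to 1, because on an intercommunicator it reports the size of the local group (here, the spawner alone). To get the number of children, use MPI_Comm_remote_size(intercomm, &size) instead.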
bfroehle answered Oct 20 '22 01:10



If you don't want to deal with an intercommunicator after you've spawned your child processes, you can use MPI_Intercomm_merge to create an intracommunicator from your intercommunicator. Essentially, it would look like this:

Spawner:

MPI_Comm_spawn( argv[0], MPI_ARGV_NULL, np, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE );
MPI_Intercomm_merge(intercomm, 0, &intracomm);

Spawnee:

MPI_Intercomm_merge(parentcomm, 1, &intracomm);

After that, you can continue to use intracomm (or whatever you want to call it) as if it were a regular intracommunicator. In this instance, the spawning process gets the low-order ranks and the new processes get the higher ranks, but you can change that ordering with the second argument (high) of MPI_Intercomm_merge.

Wesley Bland answered Oct 19 '22 23:10
