 

How to modify MPI blocking send and receive to non-blocking

I am trying to understand the difference between blocking and non-blocking message passing mechanisms in parallel processing using MPI. Suppose we have the following blocking code:

#include <stdio.h> 
#include <string.h> 
#include "mpi.h"

int main (int argc, char* argv[]) {
    const int maximum_message_length = 100;
    const int rank_0= 0;
    char message[maximum_message_length+1]; 
    MPI_Status status; /* Info about receive status */ 
    int my_rank; /* This process ID */
    int num_procs; /* Number of processes in run */ 
    int source; /* Process ID to receive from */
    int destination; /* Process ID to send to */
    int tag = 0; /* Message ID */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); 
    MPI_Comm_size(MPI_COMM_WORLD, &num_procs);

    /* clients processes */
    if (my_rank != rank_0) {
        sprintf(message, "Hello world from process# %d", my_rank);
        MPI_Send(message, strlen(message) + 1, MPI_CHAR, rank_0, tag, MPI_COMM_WORLD);
    } else {    
    /* rank 0 process */ 
        for (source = 0; source < num_procs; source++) { 
            if (source != rank_0) {
                MPI_Recv(message, maximum_message_length + 1, MPI_CHAR, source, tag, 
                MPI_COMM_WORLD,&status);
                fprintf(stderr, "%s\n", message); 
            } 
        } 
    } 
    MPI_Finalize();
}

Each process executes its task and sends it back to rank_0 (the receiver). rank_0 runs a loop over the 1 to n-1 processes and prints the messages sequentially (step i of the loop may not proceed if the current client hasn't sent its task yet). How do I modify this code to achieve the non-blocking mechanism using MPI_Isend and MPI_Irecv? Do I need to remove the loop in the receiver part (rank_0) and explicitly state MPI_Irecv(..) for each client, i.e.

MPI_Irecv(message, maximum_message_length + 1, MPI_CHAR, source, tag, 
                    MPI_COMM_WORLD,&status);

Thank you.

asked Sep 06 '15 by Mike H.


People also ask

What are the differences between blocking and non-blocking send in MPI?

A nonblocking send will return as soon as possible, whereas a blocking send will return after the data has been copied out of the sender memory. The use of nonblocking sends is advantageous in these cases only if data copying can be concurrent with computation.

What are blocking and non-blocking message passing method in MPI?

Blocking communication is used when it is sufficient, since it is somewhat easier to use. Non-blocking communication is used when necessary, for example, you may call MPI_Isend() , do some computations, then do MPI_Wait() .

What is non-blocking communication in MPI?

Non-blocking MPI send calls: an MPI_Isend creates a send request and returns a request object. It may or may not have sent the message, or buffered it. The caller is responsible for not changing the buffer until after waiting upon the resulting request object.
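
For illustration, a minimal sketch of that rule (the buffer, destination rank and tag below are made-up placeholders, not taken from the question's code):

double buf[100];   /* send buffer; must not be modified until the wait */
MPI_Request req;
/* ... fill buf ... */
MPI_Isend(buf, 100, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD, &req);
/* ... computation that does not modify buf ... */
MPI_Wait(&req, MPI_STATUS_IGNORE);
/* only now may buf safely be reused */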

Is MPI RECV blocking?

This basic receive operation, MPI_Recv, is blocking: it returns only after the receive buffer contains the newly received message. A receive can complete before the matching send has completed (of course, it can complete only after the matching send has started).


1 Answer

With non-blocking communication you post the communication and then immediately continue with your program to do other work, which might well be posting more communication. In particular, you can post all receives at once and wait for them to complete only later on. This is what you would typically do in your scenario here.

Note however, that this specific setup is a bad example, as it basically just reimplements an MPI_Gather!

Here is how you would typically go about non-blocking communication in your setup. First, you need some storage for all the messages to end up in, as well as a list of request handles to keep track of the non-blocking communication requests, so the first part of the code needs to change accordingly:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main (int argc, char* argv[]) {
    const int maximum_message_length = 100;
    const int server_rank = 0;
    char message[maximum_message_length+1];
    char *allmessages;
    MPI_Status *status; /* Info about receive status */
    MPI_Request *req; /* Non-Blocking Requests */
    int my_rank; /* This process ID */
    int num_procs; /* Number of processes in run */
    int source; /* Process ID to receive from */
    int tag = 0; /* Message ID */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &num_procs);

    /* clients processes */
    if (my_rank != server_rank) {
        sprintf(message, "Hello world from process# %d", my_rank);
        MPI_Send(message, maximum_message_length + 1, MPI_CHAR, server_rank,
                 tag, MPI_COMM_WORLD);
    } else {

No need for non-blocking sends here. Now we go on and receive all these messages on server_rank. We need to loop over all of them and store a request handle for each of them:

    /* rank 0 process */
        allmessages = malloc((maximum_message_length+1)*num_procs);
        status = malloc(sizeof(MPI_Status)*num_procs);
        req = malloc(sizeof(MPI_Request)*num_procs);

        for (source = 0; source < num_procs; source++) {
            req[source] = MPI_REQUEST_NULL;
            if (source != server_rank) {
                /* Post non-blocking receive for source */
                MPI_Irecv(allmessages+(source*(maximum_message_length+1)),
                          maximum_message_length + 1, MPI_CHAR, source, tag,
                          MPI_COMM_WORLD, req+source);
                /* Proceed without waiting on the receive;
                   further receives can be posted right away */
            }
        }
        /* Wait on all communications to complete */
        MPI_Waitall(num_procs, req, status);
        /* Print the messages in order to the screen */
        for (source = 0; source < num_procs; source++) {
            if (source != server_rank) {
                fprintf(stderr, "%s\n",
                        allmessages+(source*(maximum_message_length+1)));
            }
        }
    }
    MPI_Finalize();
}

After posting the non-blocking receives, we need to wait on all of them to complete in order to print the messages in the correct order. To do this, an MPI_Waitall is used, which blocks until all request handles are satisfied. Note that I include the server_rank here for simplicity, but set its request to MPI_REQUEST_NULL initially, so it will be ignored. If you do not care about the order, you could process the communications as soon as they become available by looping over the requests and employing MPI_Waitany. That returns as soon as any communication is completed, and you can act on the corresponding data.
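
Since the question also asks about MPI_Isend: the clients could use a non-blocking send as well, although it buys nothing in this toy example. A minimal sketch of the client branch (same buffers as above, purely illustrative) would be:

    if (my_rank != server_rank) {
        MPI_Request send_req;
        sprintf(message, "Hello world from process# %d", my_rank);
        MPI_Isend(message, maximum_message_length + 1, MPI_CHAR, server_rank,
                  tag, MPI_COMM_WORLD, &send_req);
        /* ... other work that does not modify 'message' ... */
        MPI_Wait(&send_req, MPI_STATUS_IGNORE);
    }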

With MPI_Gather that code would look like this:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main (int argc, char* argv[]) {
    const int maximum_message_length = 100;
    const int server_rank = 0;
    char message[maximum_message_length+1];
    char *allmessages = NULL; /* only allocated on server_rank */
    int my_rank; /* This process ID */
    int num_procs; /* Number of processes in run */
    int source; /* Process ID to receive from */
    int tag = 0; /* Message ID */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &num_procs);

    if (my_rank == server_rank) {
        allmessages = malloc((maximum_message_length+1)*num_procs);
    }
    sprintf(message, "Hello world from process# %d", my_rank);
    MPI_Gather(message, (maximum_message_length+1), MPI_CHAR,
               allmessages, (maximum_message_length+1), MPI_CHAR,
               server_rank, MPI_COMM_WORLD);

    if (my_rank == server_rank) {
        /* Print the messages in order to the screen */
        for (source = 0; source < num_procs; source++) {
            if (source != server_rank) {
                fprintf(stderr, "%s\n",
                        allmessages+(source*(maximum_message_length+1)));
            }
        }
    }
    MPI_Finalize();
}

And with MPI-3 you can even use a non-blocking MPI_Igather.
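
As a rough sketch (assuming the same buffers as in the MPI_Gather version and an MPI-3 library), the non-blocking variant only adds a request and a wait:

    MPI_Request req;
    MPI_Igather(message, (maximum_message_length+1), MPI_CHAR,
                allmessages, (maximum_message_length+1), MPI_CHAR,
                server_rank, MPI_COMM_WORLD, &req);
    /* ... overlap other work here ... */
    MPI_Wait(&req, MPI_STATUS_IGNORE);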

If you don't care about the ordering, the last part (starting with MPI_Waitall) could be done with MPI_Waitany like this:

    int i; /* count of completed receives */
    for (i = 0; i < num_procs-1; i++) {
        /* Wait on any next communication to complete */
        MPI_Waitany(num_procs, req, &source, status);
        fprintf(stderr, "%s\n",
                allmessages+(source*(maximum_message_length+1)));
    }
answered Sep 24 '22 by haraldkl