
How to write an MPI wrapper for dynamic loading

Since MPI doesn't offer binary compatibility, only source compatibility, we're forced to ship our solver source code to customers for them to use our solver with their preferred version of MPI. Well, we reached the point where we cannot offer source code anymore.

As a result, I'm looking into ways to create a wrapper around MPI calls. The idea is for us to provide a header of stub functions, and the user would write the implementation, create a dynamic library out of it, and then our solver would load it at runtime.

But the solutions I have come up with aren't elegant and are error-prone. Because there are struct arguments (say, MPI_Request) whose definitions may differ from one MPI implementation to another, we need to accept void* for many of our stub arguments. Also, if the number of arguments can differ from one MPI implementation to another (I'm not sure it is guaranteed never to happen), then the only way around that is to use variadic arguments (varargs).

//header (provided by us)
int my_stub_mpi_send(const void *buf, int count, void* datatype,
        int dest, int tag, void* comm);

//*.c (provided by user)
#include <my_stub_mpi.h>
#include <mpi.h>
int my_stub_mpi_send(const void *buf, int count, void* datatype,
        int dest, int tag, void* comm)
{
    // datatype and comm point to the caller's MPI_Datatype and MPI_Comm
    return MPI_Send(buf, count, *((MPI_Datatype *) datatype),
            dest, tag, *((MPI_Comm *) comm));
}
//Notes: (1) Most likely the interface will be C, not C++,
//           unless I can make a convincing case for C++;
//       (2) The goal here is to avoid void* pointers, if possible;

Does anyone know of a way around these issues?

blue scorpion asked Jul 18 '16 16:07


1 Answer

If you are only targeting platforms that support the PMPI profiling interface, then there is a generic solution that requires minimal to no changes in the original source code. The basic idea is to (ab-)use the PMPI interface for the wrapper. It is probably in some non-OO sense an implementation of the bridge pattern.

First, several observations. The MPI standard defines a single structure type, MPI_Status. It has only three publicly visible fields: MPI_SOURCE, MPI_TAG, and MPI_ERROR. No MPI function takes MPI_Status by value. The standard defines the following opaque types: MPI_Aint, MPI_Count, MPI_Offset, and MPI_Status (plus several Fortran interoperability types, omitted here for clarity). The first three are integral. Then there are 10 handle types, from MPI_Comm to MPI_Win. Handles can be implemented either as special integer values or as pointers to internal data structures. MPICH and other implementations based on it take the first approach, while Open MPI takes the second. Being either a pointer or an integer, a handle of any kind fits within a single C datatype, namely intptr_t.
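
To illustrate the last point, here is a minimal sketch that passes both handle styles through intptr_t and back. It does not use a real MPI library: int_handle_t, ptr_handle_t, struct fake_comm and the wrap/unwrap helpers are hypothetical stand-ins, and the sketch assumes a platform where intptr_t is at least as wide as int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two handle styles; not real MPI types. */
typedef int int_handle_t;                /* integer-valued handle (MPICH style) */
struct fake_comm { int size; };          /* pretend internal data structure */
typedef struct fake_comm *ptr_handle_t;  /* pointer-valued handle (Open MPI style) */

/* The single type exposed across the wrapper boundary. */
typedef intptr_t my_handle_t;

my_handle_t  wrap_int_handle(int_handle_t h)  { return (my_handle_t)h; }
int_handle_t unwrap_int_handle(my_handle_t h) { return (int_handle_t)h; }
my_handle_t  wrap_ptr_handle(ptr_handle_t h)  { return (my_handle_t)h; }
ptr_handle_t unwrap_ptr_handle(my_handle_t h) { return (ptr_handle_t)h; }

/* Checks that both handle styles survive the round trip unchanged. */
void check_handle_round_trip(void)
{
    static struct fake_comm world = { 4 };
    int_handle_t ih = 0x44000000;        /* arbitrary integer handle value */
    ptr_handle_t ph = &world;

    assert(unwrap_int_handle(wrap_int_handle(ih)) == ih);
    assert(unwrap_ptr_handle(wrap_ptr_handle(ph)) == ph);
}
```

The wrapper library is the only place that knows which of the two conversions applies; the code on the other side of the boundary only ever sees my_handle_t.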

The basic idea is to override all MPI functions and redefine their arguments to be of an intptr_t type, then have the user-compiled code do the transition to the proper type and make the actual MPI call:

In mytypes.h:

#include <stdint.h>  // for intptr_t

typedef intptr_t my_MPI_Datatype;
typedef intptr_t my_MPI_Comm;

In mympi.h:

#include "mytypes.h"

// Redefine all MPI handle types
#define MPI_Datatype my_MPI_Datatype
#define MPI_Comm     my_MPI_Comm

// Those hold the actual values of some MPI constants
extern MPI_Comm     my_MPI_COMM_WORLD;
extern MPI_Datatype my_MPI_INT;

// Redefine the MPI constants to use our symbols
#define MPI_COMM_WORLD my_MPI_COMM_WORLD
#define MPI_INT        my_MPI_INT

// Redeclare the MPI interface
extern int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);

In mpiwrap.c:

#include <mpi.h>
#include "mytypes.h"

my_MPI_Comm my_MPI_COMM_WORLD;
my_MPI_Datatype my_MPI_INT;

int MPI_Init(int *argc, char ***argv)
{
   // Initialise the actual MPI implementation
   int res = PMPI_Init(argc, argv);
   my_MPI_COMM_WORLD = (intptr_t)MPI_COMM_WORLD;
   my_MPI_INT = (intptr_t)MPI_INT;
   return res;
}

int MPI_Send(void *buf, int count, intptr_t datatype, int dest, int tag, intptr_t comm)
{
   return PMPI_Send(buf, count, (MPI_Datatype)datatype, dest, tag, (MPI_Comm)comm);
}

In your code:

#include "mympi.h" // instead of mpi.h

...
MPI_Init(NULL, NULL);
...
MPI_Send(buf, 10, MPI_INT, 1, 10, MPI_COMM_WORLD);
...

The MPI wrapper can either be linked statically or preloaded dynamically. Both ways work as long as the MPI implementation uses weak symbols for the PMPI interface. You can extend the above code example to cover all the MPI functions and constants used. All constants should be saved in the wrapper of MPI_Init / MPI_Init_thread.

Handling MPI_Status is somewhat convoluted. Although the standard defines the public fields, it doesn't say anything about their order or their placement within the structure. And once again, MPICH and Open MPI differ significantly:

// MPICH (Intel MPI)
typedef struct MPI_Status {
    int count_lo;
    int count_hi_and_cancelled;
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
} MPI_Status;

// Open MPI
struct ompi_status_public_t {
    /* These fields are publicly defined in the MPI specification.
       User applications may freely read from these fields. */
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
    /* The following two fields are internal to the Open MPI
       implementation and should not be accessed by MPI applications.
       They are subject to change at any time.  These are not the
       droids you're looking for. */
    int _cancelled;
    size_t _ucount;
};
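
To make the layout difference concrete, the following sketch transcribes the two definitions quoted above under hypothetical names (mpich_style_status, ompi_style_status) so that both fit in one file, and copies the public fields into a hypothetical portable struct public_status. It assumes the quoted layouts are accurate for the versions in question and does not include any real MPI header.

```c
#include <assert.h>
#include <stddef.h>

/* Layouts transcribed from the definitions quoted above; the struct names
   are hypothetical so that both can live in one translation unit. */
struct mpich_style_status {
    int count_lo;
    int count_hi_and_cancelled;
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
};

struct ompi_style_status {
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
    int _cancelled;
    size_t _ucount;
};

/* Portable subset: only the three fields the MPI standard makes public. */
struct public_status {
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
};

/* Copy field by field; a memcpy would be wrong because the offsets differ. */
struct public_status from_mpich_style(const struct mpich_style_status *s)
{
    struct public_status p = { s->MPI_SOURCE, s->MPI_TAG, s->MPI_ERROR };
    return p;
}

struct public_status from_ompi_style(const struct ompi_style_status *s)
{
    struct public_status p = { s->MPI_SOURCE, s->MPI_TAG, s->MPI_ERROR };
    return p;
}

/* The same public field sits at a different byte offset in each layout. */
void check_layouts(void)
{
    assert(offsetof(struct mpich_style_status, MPI_SOURCE) == 2 * sizeof(int));
    assert(offsetof(struct ompi_style_status, MPI_SOURCE) == 0);
}
```

Because code compiled against one layout reads MPI_SOURCE from the wrong offset in the other, a status object cannot be passed across the wrapper boundary as raw bytes.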

If you only use MPI_Status to get information out of calls such as MPI_Recv, then it is trivial to copy the three public fields into a user-defined static structure containing only those fields. But that won't suffice if you are also using MPI functions that read the non-public ones, e.g. MPI_Get_count. In that case, a dumb non-OO approach is to simply embed the original status structure:

In mytypes.h:

// 64 bytes should cover most MPI implementations
#define MY_MAX_STATUS_SIZE 64

typedef struct my_MPI_Status
{
   int MPI_SOURCE;
   int MPI_TAG;
   int MPI_ERROR;
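   // Union rather than a bare char array, so that the storage is
   // suitably aligned for the real MPI_Status
   union { char bytes[MY_MAX_STATUS_SIZE]; long long ll; void *p; } _original;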
} my_MPI_Status;

In mympi.h:

#define MPI_Status        my_MPI_Status
#define MPI_STATUS_IGNORE ((my_MPI_Status*)NULL)

extern int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status);
extern int MPI_Get_count(MPI_Status *status, MPI_Datatype datatype, int *count);

In mpiwrap.c:

int MPI_Recv(void *buf, int count, my_MPI_Datatype datatype, int source, int tag, my_MPI_Comm comm, my_MPI_Status *status)
{
   MPI_Status *real_status = (status != NULL) ? (MPI_Status*)&status->_original : MPI_STATUS_IGNORE;
   int res = PMPI_Recv(buf, count, (MPI_Datatype)datatype, dest, tag, (MPI_Comm)comm, real_status);
   if (status != NULL)
   {
      status->MPI_SOURCE = real_status->MPI_SOURCE;
      status->MPI_TAG = real_status->MPI_TAG;
      status->MPI_ERROR = real_status->MPI_ERROR;
   }
   return res;
}

int MPI_Get_count(my_MPI_Status *status, my_MPI_Datatype datatype, int *count)
{
   MPI_Status *real_status = (status != NULL) ? (MPI_Status*)&status->_original : MPI_STATUS_IGNORE;
   return PMPI_Get_count(real_status, (MPI_Datatype)datatype, count);
}

In your code:

#include "mympi.h"

...
MPI_Status status;
int count;

MPI_Recv(buf, 100, MPI_INT, 0, 10, MPI_COMM_WORLD, &status);
MPI_Get_count(&status, MPI_INT, &count);
...

Your build system should then check if sizeof(MPI_Status) of the actual MPI implementation is less than or equal to MY_MAX_STATUS_SIZE.
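
One way to do this check without a separate configure step is a compile-time assertion inside the wrapper source itself. The sketch below is only an illustration: MY_HAVE_MPI_H is a hypothetical macro that the build system would define when compiling against a real MPI, real_status_t and my_status_headroom are names made up for this example, and the stand-in structure exists only so the sketch compiles without an MPI installation.

```c
#include <assert.h>
#include <stddef.h>

/* Same limit as in mytypes.h */
#define MY_MAX_STATUS_SIZE 64

#ifdef MY_HAVE_MPI_H   /* hypothetical macro set by the build system */
#include <mpi.h>
typedef MPI_Status real_status_t;
#else
/* Stand-in so the sketch compiles without MPI; not a real MPI type. */
typedef struct {
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
    int _cancelled;
    size_t _ucount;
} real_status_t;
#endif

/* Compilation stops here if the real status does not fit. */
_Static_assert(sizeof(real_status_t) <= MY_MAX_STATUS_SIZE,
               "MPI_Status is larger than MY_MAX_STATUS_SIZE");

/* Run-time form of the same check, for a configure-style test program:
   returns the number of spare bytes left in the embedded buffer. */
size_t my_status_headroom(void)
{
    assert(sizeof(real_status_t) <= MY_MAX_STATUS_SIZE);
    return MY_MAX_STATUS_SIZE - sizeof(real_status_t);
}
```

With the compile-time form, a user building the wrapper against an MPI whose status is too large gets an error at build time instead of silent memory corruption at run time.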

The above is just a quick and dirty idea. I haven't tested it, and some const qualifiers or casts might be missing here and there. It should work in practice and be reasonably maintainable.

Hristo Iliev answered Sep 19 '22 10:09