MPI datatype for 2D array

Tags: arrays, types, 2d, mpi

I need to pass an array of integer arrays (basically a 2D array) from the root to all the processes. I am using MPI in C. How do I declare an MPI datatype for a 2D array, and how should I send the message (should I use broadcast or scatter)?

asked Apr 16 '10 by dks


1 Answer

You'll need to use Broadcast, because you want to send a copy of the same message to every process. Scatter breaks up a message and distributes the chunks between processes.
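To make the distinction concrete, here is a minimal sketch of both calls (assuming a communicator comm, a root rank root, and exactly 4 processes for the scatter):

int data[4] = {1, 2, 3, 4};

// Broadcast: every process ends up with a full copy of {1, 2, 3, 4}
MPI_Bcast(data, 4, MPI_INT, root, comm);

// Scatter: with 4 processes, each one receives exactly one element of data
int mine;
MPI_Scatter(data, 1, MPI_INT, &mine, 1, MPI_INT, root, comm);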

As for how to send the data: the HIndexed datatype is for you.

Suppose your 2D array is defined like this:

int N;            // number of arrays (first dimension)
int sizes[N];     // number of elements in each array (second dimension)
int* arrays[N];   // pointers to the start of each array

First you have to calculate the displacement of each array's starting address relative to the starting address of the datatype; for convenience, that base can be the starting address of the first array:

MPI_Aint base;
MPI_Get_address(arrays[0], &base);          // MPI_Get_address replaces the deprecated MPI_Address
MPI_Aint* displacements = new MPI_Aint[N];  // note: MPI_Aint, not int
for (int i = 0; i < N; ++i)
{
    MPI_Get_address(arrays[i], &displacements[i]);
    displacements[i] -= base;
}

Then the definition for your type would be:

MPI_Datatype newType;
MPI_Type_create_hindexed(N, sizes, displacements, MPI_INT, &newType);  // replaces the deprecated MPI_Type_hindexed; MPI_INT describes C ints
MPI_Type_commit(&newType);

This definition creates a datatype that covers all of your arrays, wherever they happen to live in memory. Once this is done, you just send your data as a single object of this type:

MPI_Bcast(arrays[0], 1, newType, root, comm);   // buffer is arrays[0], the base the displacements are relative to; 'root' and 'comm' are whatever you need

However, you're not done yet. The receiving processes need to know the sizes of the arrays you're sending: if that knowledge isn't available at compile time, you'll have to send a separate message with that data first (a simple array of ints). If N, sizes and arrays are defined similarly to the above on the receiving processes, with enough space allocated to hold the arrays, then all the receivers need to do is define the same datatype (with the exact same code as the sender) and receive the sender's message as a single instance of that type:

MPI_Bcast(arrays[0], 1, newType, root, comm);    // 'root' and 'comm' must have the same values as in the sender's code

And voilà! All processes now have a copy of your arrays.
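For completeness, here is one possible sketch of that preliminary size message. It has to happen before the receivers build the datatype, and it assumes that rank holds each process's rank and that N, sizes and arrays are plain pointers on the receivers:

MPI_Bcast(&N, 1, MPI_INT, root, comm);          // number of arrays
if (rank != root)
    sizes = new int[N];
MPI_Bcast(sizes, N, MPI_INT, root, comm);       // length of each array
if (rank != root)                               // receivers allocate before the main broadcast
{
    arrays = new int*[N];
    for (int i = 0; i < N; ++i)
        arrays[i] = new int[sizes[i]];
}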

Of course, things get a lot easier if the second dimension of your 2D array is fixed to some value M. In that case, the easiest solution is to simply store everything in a single int[N*M] array: C and C++ guarantee that it's contiguous memory, so you can broadcast it without defining a custom datatype, like this:

MPI_Bcast(arrays, N*M, MPI_INT, root, comm);
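As a quick sketch of that flat layout (the name array here is hypothetical; element (i, j) lives at index i*M + j):

int* array = new int[N * M];
for (int i = 0; i < N; ++i)
    for (int j = 0; j < M; ++j)
        array[i*M + j] = 0;                     // element (i, j) of the logical 2D array
MPI_Bcast(array, N*M, MPI_INT, root, comm);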

Note: you might get away with using the Indexed type instead of HIndexed. The difference is that in Indexed, the displacements array is given in number of elements, while in HIndexed it's given in bytes (the H stands for heterogeneous). If you were to use Indexed, the values in displacements would have to be divided by sizeof(int). However, I'm not sure that integer arrays allocated at arbitrary positions on the heap are guaranteed to be aligned on int boundaries, and in any case the HIndexed version has (marginally) less code and produces the same result.
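If you do want to try the Indexed variant, a minimal sketch of the conversion might look like this (elemDispls is a hypothetical name; this assumes every byte displacement is an exact multiple of sizeof(int)):

int* elemDispls = new int[N];
for (int i = 0; i < N; ++i)
    elemDispls[i] = (int)(displacements[i] / sizeof(int));  // convert byte displacements to element counts

MPI_Datatype indexedType;
MPI_Type_indexed(N, sizes, elemDispls, MPI_INT, &indexedType);
MPI_Type_commit(&indexedType);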

answered Sep 27 '22 by suszterpatt