
Understanding dimensions, coordinates and ordering of processes in a 2D MPI grid

Tags:

c

mpi

I have 3 questions, all related to MPI (in C). I think the first 2 have the same answer, but I'm not positive.


Question 1

When using MPI_Dims_create to create a 2D grid, which dimension of the returned dimensions is X and which is Y?

For example:

int numProcs = -1, myProcID = -1;
MPI_Comm_rank(MPI_COMM_WORLD, &myProcID);
MPI_Comm_size(MPI_COMM_WORLD, &numProcs);

int size[2];
size[0] = size[1] = 0;
MPI_Dims_create(numProcs, 2, size);
int numProcs_y = -1, numProcs_x = -1;

Should it be:

numProcs_y = size[0];
numProcs_x = size[1];

Or this:

numProcs_x = size[0];
numProcs_y = size[1];

I tried running this with numbers of processes that seemed like they would give me the answer (e.g. 6). With 6 processes, MPI_Dims_create should either create a grid with 3 rows and 2 columns or 2 rows and 3 columns (unless I've misunderstood the documentation). For process 3 (of 6), I found that size[0] = 1 and size[1] = 0 (I believe this corresponds to x = 0, y = 1). This seems to indicate to me that MPI_Dims_create is creating a grid with 3 rows and 2 columns (because if it were 2 rows and 3 columns, process 2 (of 6) should have x = 2, y = 0). Any confirmation someone could provide on this would be greatly appreciated.
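
(A quick check, added editorially and not part of the original post, continuing from the snippet above: MPI_Dims_create returns the same dimensions on every process, so printing the array on one rank shows the decomposition directly.)

// editorial sketch: print the decomposition computed above
if (myProcID == 0)
    printf("MPI_Dims_create returned { %d, %d }\n", size[0], size[1]);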


Question 2

When using MPI_Cart_coords on a 2D Cartesian grid, which of the returned dimensions is X and which is Y?

For example:

int periodic[2];
periodic[0] = periodic[1] = 0; // no wrap around
MPI_Comm cart_comm;
int coords[2];

// using size from question 1
MPI_Cart_create(MPI_COMM_WORLD, 2, size, periodic, 1, &cart_comm);
MPI_Cart_coords(cart_comm, myProcID, 2, coords);

Similar to question 1, my question is: should it be like this

myProcID_y = coords[0];
myProcID_x = coords[1];

or like this

myProcID_x = coords[0];
myProcID_y = coords[1];

I've been searching through the documentation and previous questions on here, but I can't seem to find a direct answer to this question.

The documentation seems to indicate that the first of the two approaches is the correct one, but it doesn't definitively state it.
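
(Another quick empirical check, added editorially and not part of the original post: with 6 processes and the { 3, 2 } dimensions from question 1, one slot of coords should take the values 0..2 and the other 0..1, which reveals which slot belongs to which grid dimension. Continuing from the snippet above:)

// editorial sketch: print this process's coordinates in the grid
printf("Process %d: coords[0] = %d, coords[1] = %d\n", myProcID, coords[0], coords[1]);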


Question 3

The motivation behind the first 2 questions is that I'm trying to split the 2D grid into rows and columns. However, when I use MPI_Comm_rank to check the row and column IDs of each process after doing so, the processes appear to be ordered in a way that doesn't match what I think the answers to the above questions are.

Based on the above, I expect the processes to be ordered like this:

P0 P1
P2 P3
P4 P5

However, using this code (which comes after the above code in my program, so all of the above code is in scope; I've separated it out here to make it easier to isolate my questions):

MPI_Comm row_comm, col_comm;
int row_id = -1, col_id = -1;
// ** NOTE: My use of myProcID_y and myProcID_x here is based
// on my understanding of the previous 2 questions ... if my
// understanding of one/both of those is wrong, then obviously
// the assignments here are wrong too.
// create row and column communicators based on my location in grid
MPI_Comm_split(cart_comm, myProcID_y, myProcID_x, &row_comm);
MPI_Comm_split(cart_comm, myProcID_x, myProcID_y, &col_comm);

// get row and column ID for each process
MPI_Comm_rank(row_comm, &row_id);
MPI_Comm_rank(col_comm, &col_id);
printf("Process: %d\trowID: %d\tcolID: %d\n", myProcID, row_id, col_id);

I'm seeing the following printed:

Process: 0    rowID: 0    colID: 0
Process: 1    rowID: 1    colID: 0
Process: 2    rowID: 0    colID: 1
Process: 3    rowID: 1    colID: 1
Process: 4    rowID: 0    colID: 2
Process: 5    rowID: 1    colID: 2

which seems to correspond to the following order of processes:

P0 P2 P4
P1 P3 P5

which has the opposite dimensions (2 rows, 3 columns) to what I was expecting from MPI_Dims_create.

Assuming my understanding of the first 2 questions is correct (i.e. that Y is the first dimension returned by those functions), why are the processes (seemingly) being ordered differently at this step?

Asked Dec 08 '12 by Matt Sinclair

1 Answer

Q1 and Q2. There are no such things as dimension X and dimension Y in MPI - these are labels that you give to the abstract dimensions used in the standard. MPI works with numbered dimensions and follows the C row-major numbering of the ranks, i.e. in a 2x3 Cartesian topology (0,0) maps to rank 0, (0,1) maps to rank 1, (0,2) maps to rank 2, (1,0) maps to rank 3, and so on.

Note that dimension 0 corresponds to the leftmost element of the coordinate tuple and varies slowest as the rank increases, just like the first index of a C array; the rightmost coordinate varies fastest. The numbering of the dimensions is nevertheless a frequent source of confusion. To create the 2x3 Cartesian topology above, the size array would have to be initialised as:

int size[2] = { 2, 3 };

It is up to you to map the abstract numbered dimensions to your problem. You may choose dimension 0 to be X, or you may choose dimension 1 - it doesn't matter.
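
To make the numbering concrete, here is a minimal self-contained sketch (an editorial addition, not part of the original answer) that builds the 2x3 topology above and prints its coordinate-to-rank mapping; run it with exactly 6 processes:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int size[2] = { 2, 3 };        // dims[0] = 2, dims[1] = 3
    int periodic[2] = { 0, 0 };
    MPI_Comm cart_comm;
    MPI_Cart_create(MPI_COMM_WORLD, 2, size, periodic, 0, &cart_comm);

    if (cart_comm != MPI_COMM_NULL)
    {
        int cartID;
        MPI_Comm_rank(cart_comm, &cartID);
        if (cartID == 0)
        {
            // Row-major numbering: rank = coords[0]*3 + coords[1]
            for (int i = 0; i < size[0]; i++)
            {
                for (int j = 0; j < size[1]; j++)
                {
                    int coords[2] = { i, j }, rank;
                    MPI_Cart_rank(cart_comm, coords, &rank);
                    printf("(%d,%d)=%d ", i, j, rank);
                }
                printf("\n");
            }
        }
        MPI_Comm_free(&cart_comm);
    }

    MPI_Finalize();
    return 0;
}

The output reproduces the mapping shown above: (0,0)=0 (0,1)=1 (0,2)=2 on the first line and (1,0)=3 (1,1)=4 (1,2)=5 on the second.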

As for MPI_DIMS_CREATE, the standard says:

The dimensions are set to be as close to each other as possible, using an appropriate divisibility algorithm.

For dims[i] set by the call, dims[i] will be ordered in non-increasing order.

This operation simply returns an array of elements dims[i] that have the following properties (unless the size of one or more dimensions is fixed by setting non-zero values in dims[] before calling MPI_DIMS_CREATE):

  • dims[0] >= dims[1] >= dims[2] >= ...
  • dims[0] * dims[1] * dims[2] == nprocs, where nprocs is the number of processes passed to MPI_DIMS_CREATE.

This means that as MPI_DIMS_CREATE decomposes the set of nprocs processes into a multidimensional grid, it assigns the biggest factor to the size of dimension 0, the next one to the size of dimension 1, and so on. Since 6 factors as 3*2, MPI_DIMS_CREATE returns { 3, 2 }. If you call MPI_CART_CREATE directly with the result from MPI_DIMS_CREATE, it creates a 3x2 topology with coordinates and ranks:

(0,0)=0 (0,1)=1
(1,0)=2 (1,1)=3
(2,0)=4 (2,1)=5

which is exactly the 3-rows-by-2-columns layout you were expecting.

Q3. MPI provides a special routine for partitioning Cartesian topologies - MPI_CART_SUB. It takes an array of logical flags (integers in C), named remain_dims in the standard. A non-zero remain_dims[i] means that dimension i should be retained in the resulting partitioning, while a separate subcommunicator is created for each possible combination of coordinates along the non-retained dimensions. For example, given the 3x2 topology above:

  • remain_dims[] = { 1, 0 } would retain dimension 0 and result in 2 non-overlapping 1-d communicators with 3 processes each;
  • remain_dims[] = { 0, 1 } would retain dimension 1 and result in 3 non-overlapping 1-d communicators with 2 processes each;
  • remain_dims[] = { 0, 0 } would retain neither dimension and result in 6 non-overlapping zero-dimensional communicators with a single process in each.

Which partitioning you would call row-wise and which one column-wise is up to you and your labelling of the Cartesian dimensions.
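
Applied to the question, a sketch of how this could look (an editorial illustration that reuses cart_comm from the question and assumes the 3x2 topology returned by MPI_Dims_create, with dimension 0 read as the row index and dimension 1 as the column index):

// editorial sketch: split the 3x2 Cartesian grid with MPI_Cart_sub
int remain_row[2] = { 0, 1 };   // keep dimension 1 -> processes sharing my row
int remain_col[2] = { 1, 0 };   // keep dimension 0 -> processes sharing my column

MPI_Comm row_comm, col_comm;
MPI_Cart_sub(cart_comm, remain_row, &row_comm);
MPI_Cart_sub(cart_comm, remain_col, &col_comm);

int row_id = -1, col_id = -1;
MPI_Comm_rank(row_comm, &row_id);   // my position within my row (the column coordinate)
MPI_Comm_rank(col_comm, &col_id);   // my position within my column (the row coordinate)

Within each subcommunicator the ranks follow the retained dimension, so row_id equals the column coordinate and col_id equals the row coordinate.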


One thing to note is that one often runs MPI codes on systems that consist of networked SMP or NUMA multicore nodes. In this case a suitable 2D grid is nodes x cores. If the number of cores per node is known, one can easily fix it in the call to MPI_DIMS_CREATE, here as the size of dimension 1 so that consecutive ranks (which are normally placed together on a node) run along that dimension:

int size[2] = { 0, ncores };

MPI_Dims_create(numProcs, 2, size);

This is more convenient than dividing numProcs by ncores and checking the divisibility yourself, as MPI_DIMS_CREATE would signal an error if ncores does not divide numProcs. Then, assuming that consecutive ranks are indeed placed on the same node (the usual block placement), a partitioning that keeps dimension 1, i.e. one with

int remain_dims[2] = { 0, 1 };

would create subcommunicators that contain processes on the same node, while

int remain_dims[2] = { 1, 0 };

would create subcommunicators that contain no two processes from the same node.
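
Putting the pieces together, a possible sketch (an editorial addition; ncores is a hypothetical per-node core count, numProcs is reused from the question, and block placement of consecutive ranks onto nodes is assumed):

// editorial sketch: per-node and cross-node subcommunicators
int ncores = 16;                        // hypothetical cores per node; must divide numProcs
int size[2] = { 0, ncores };            // fix dimension 1, let MPI compute the node count
MPI_Dims_create(numProcs, 2, size);     // size[0] becomes numProcs / ncores

int periodic[2] = { 0, 0 };
MPI_Comm grid_comm, node_comm, internode_comm;
MPI_Cart_create(MPI_COMM_WORLD, 2, size, periodic, 0, &grid_comm);

int keep_cores[2] = { 0, 1 };           // same node: all cores of my node
int keep_nodes[2] = { 1, 0 };           // across nodes: one process per node
MPI_Cart_sub(grid_comm, keep_cores, &node_comm);
MPI_Cart_sub(grid_comm, keep_nodes, &internode_comm);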


Note that in your code you have passed a value of 1 (true) for the reorder parameter of MPI_CART_CREATE. This might lead to processes having different ranks in MPI_COMM_WORLD and in the Cartesian communicator. Hence there is no guarantee that the following lines of code do what you expect them to:

MPI_Cart_coords(cart_comm, myProcID, 2, coords);
                           ^^^^^^^^
...
printf("Process: %d\trowID: %d\tcolID: %d\n", myProcID, row_id, col_id);
                                              ^^^^^^^^

myProcID was obtained from MPI_COMM_WORLD and might actually differ from the new rank in cart_comm, hence it should not be used to obtain the process coordinates or to perform the splits.
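
A minimal fix (an editorial sketch, not from the original answer, reusing cart_comm, coords, row_id and col_id from the question) is to either pass 0 for reorder, or to query the rank inside cart_comm and use it consistently:

int cartID = -1;
MPI_Comm_rank(cart_comm, &cartID);              // rank within the Cartesian communicator
MPI_Cart_coords(cart_comm, cartID, 2, coords);  // coordinates of this process in the grid
printf("Cart process: %d\trowID: %d\tcolID: %d\n", cartID, row_id, col_id);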

Answered Oct 03 '22 by Hristo Iliev