
OpenMPI Reduce using MINLOC

I'm currently working on some MPI code for a graph theory problem in which each of a number of nodes can hold an answer and the length of that answer. To get everything back to the master node I'm doing an MPI_Gather for the answers, and I'm attempting an MPI_Reduce with the MPI_MINLOC operation to figure out which node had the shortest solution. Right now the datatype that stores the length and node ID is defined as follows (per examples shown on numerous sites like http://www.open-mpi.org/doc/v1.4/man3/MPI_Reduce.3.php):

struct minType
{
    float len;
    int index;
};

On each node I'm initializing the local copies of this struct in the following manner:

int commRank;
MPI_Comm_rank (MPI_COMM_WORLD, &commRank);
minType solutionLen;
solutionLen.len = 1e37;
solutionLen.index = commRank;

At the end of the execution I have an MPI_Gather call that successfully pulls down all of the solutions (I've printed them out from memory to verify them), followed by this call:

MPI_Reduce (&solutionLen, &solutionLen, 1, MPI_FLOAT_INT, MPI_MINLOC, 0, MPI_COMM_WORLD);

It's my understanding that the arguments are supposed to be:

  1. The data source
  2. The target for the result (only significant on the designated root node)
  3. The number of items sent by each node
  4. The datatype (MPI_FLOAT_INT appears to be defined based on the above link)
  5. The operation (MPI_MINLOC appears to be defined as well)
  6. The root node's ID in the specified comm group
  7. The communications group to wait on.

When my code makes it to the reduce operation I get this error:

[compute-2-19.local:9754] *** An error occurred in MPI_Reduce
[compute-2-19.local:9754] *** on communicator MPI_COMM_WORLD
[compute-2-19.local:9754] *** MPI_ERR_ARG: invalid argument of some other kind
[compute-2-19.local:9754] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 9754 on
node compute-2-19.local exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

I'll admit to being completely stumped on this. In case it matters I'm compiling using OpenMPI 1.5.3 (built using gcc 4.4) on a Rocks cluster based on CentOS 5.5.

Asked by jthecie on Nov 23 '11 at 21:11



1 Answer

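I think the problem is that you are not allowed to pass the same buffer as both input and output (the first two arguments). Your call passes &solutionLen for both, and the error is reported on rank 0, which is your root. The man page describes the supported way to reduce in place: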

When the communicator is an intracommunicator, you can perform a reduce operation in-place (the output buffer is used as the input buffer). Use the variable MPI_IN_PLACE as the value of the root process sendbuf. In this case, the input data is taken at the root from the receive buffer, where it will be replaced by the output data.

Answered by Walter on Sep 27 '22 at 20:09