The Message Passing Interface APIs always use int as the type for count variables. For instance, the prototype for MPI_Send is:
int MPI_Send(const void* buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);
This may be a problem if the number of elements to be sent or received grows near or even beyond INT_MAX.
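Of course the issue may be solved by lowering the value of count, either by splitting a single call into multiple calls or by defining an (otherwise unnecessary) aggregate MPI_Datatype.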
Both approaches are, however, more of a hack than a real solution, especially if implemented with simple heuristics. What I would like to ask is therefore:
Is there a better idiom for handling this kind of case with standard MPI calls? If not, does anybody know of a (solid) wrapper library built around MPI that overcomes this limitation?
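To make the "lower the count" workaround concrete, here is a minimal sketch of the arithmetic involved when one logical transfer is split into pieces that each fit in an int. The names `chunk_plan` and `plan_chunks` are hypothetical and not part of MPI, and the loop over MPI_Send appears only as a comment, so treat this as an illustration rather than a tested MPI routine.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper type: how a large element count breaks down into
 * pieces that each fit in an int. */
typedef struct {
    size_t full_chunks; /* pieces holding exactly max_chunk elements */
    int    remainder;   /* elements in the final, shorter piece (may be 0) */
} chunk_plan;

/* Split `count` elements into pieces of at most `max_chunk` elements.
 * A caller would normally pass INT_MAX as max_chunk. */
chunk_plan plan_chunks(size_t count, int max_chunk)
{
    assert(max_chunk > 0);
    chunk_plan plan;
    plan.full_chunks = count / (size_t)max_chunk;
    plan.remainder   = (int)(count % (size_t)max_chunk);
    return plan;
}

/* Intended use with MPI (shown as a comment, not compiled here):
 *
 *   chunk_plan p = plan_chunks(n, INT_MAX);
 *   const char *ptr = buf;
 *   for (size_t i = 0; i < p.full_chunks; ++i) {
 *       MPI_Send(ptr, INT_MAX, type, dest, tag, comm);
 *       ptr += (size_t)INT_MAX * elem_size;
 *   }
 *   if (p.remainder > 0)
 *       MPI_Send(ptr, p.remainder, type, dest, tag, comm);
 */
```

The same two numbers could also describe an aggregate datatype (some number of INT_MAX-sized blocks plus a remainder), which is why both workarounds end up relying on the same kind of heuristic.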
I am the lead developer of BigMPI and co-authored a paper entitled To INT_MAX... and beyond!: exploring large-count support in MPI that discusses this exact topic in far more detail than space permits here.
If you cannot access the ACM DL freely, you can download the Argonne preprint or check out the paper source repo.
Here are the key highlights from this effort:
BigMPI is a relatively high-quality interface to MPI that supports 64-bit integer counts (the type is technically MPI_Count, but MPI_Aint is used internally). Ironically, it does not make use of the MPI-3 large-count features. This is because BigMPI is not completely general, but rather aims to support the most common usage models.
BigMPI was designed in part to be educational. It employs the ultra-permissive MIT License to make it possible for anyone to copy code from it into another project, possibly with changes to meet an unforeseen need.
Exceeding INT_MAX in the MPI-3 interface isn't just a minor problem. It's invalid ISO C code. The rollover behavior of signed integers is, unlike that of unsigned integers, undefined. So the primary problem isn't with MPI; it's with the fact that a C int cannot hold numbers larger than INT_MAX. It is a matter of debate whether it is a problem with MPI that the count argument is specified to be the C int type, as opposed to size_t, for example. Before saying it's obvious that MPI should have switched to size_t, you need to understand the history of MPI and the importance of ABI compatibility to a subset of MPI users.
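Since an int simply cannot represent such a count, the safest thing a caller can do is check before converting. Below is a small sketch of that guard; `count_to_int` is a hypothetical helper name, not something provided by MPI or BigMPI.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: store n in *out only if it is representable as int.
 * Returns false (and leaves *out untouched) when n exceeds INT_MAX. */
bool count_to_int(size_t n, int *out)
{
    assert(out != NULL);
    if (n > (size_t)INT_MAX)
        return false; /* does not fit; the caller must split or use a datatype */
    *out = (int)n;
    return true;
}
```

A caller would test the return value before passing the converted count to something like MPI_Send, and fall back to a large-count strategy when it is false.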
Even with BigMPI or similar datatype-based methods, implementations may have bugs. This means that doing the standard-compliant thing will not necessarily work, because internally an MPI implementation might improperly store something like count*sizeof(type) into a 32-bit value, which can overflow for a valid count like one billion if sizeof(type) is eight, for example. As noted in the aforementioned paper, in addition to these bugs, which appear to be absent in recent versions of MPICH and Open MPI, there are bugs in POSIX functions that must be mitigated.
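To illustrate the internal overflow described above, here is a sketch of a check for whether count*sizeof(type) still fits in a signed 32-bit value. The function name `byte_size_fits_int32` is hypothetical; this is not how any particular MPI implementation is written, just the arithmetic behind the bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: does count * elem_size fit in a signed 32-bit value?
 * Uses division instead of multiplication so the check itself cannot wrap. */
bool byte_size_fits_int32(int count, size_t elem_size)
{
    assert(count >= 0);
    if (count == 0 || elem_size == 0)
        return true; /* zero bytes always fit */
    return (uint64_t)elem_size <= (uint64_t)INT32_MAX / (uint64_t)count;
}
```

With a count of one billion and an element size of eight, the byte size is eight billion, so the check returns false even though the count itself is a perfectly valid int.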
The situation with Fortran is more complicated. The size of Fortran's default integer is not specified, and MPI implementations should, in theory, respect whatever the compiler uses. However, this is often not the case in practice. I believe many MPI implementations are broken for counts above INT_MAX due to the use of C int internally. BigMPI does not have a Fortran interface, although I have some desire to write one some day. Until then, please pester MPI implementers to do the right thing w.r.t. casting Fortran INTEGER to C types internally.
Anyway, I do not wish to transcribe the entire contents of our paper into this post, particularly since it is freely available, as is the source code. If you feel this post is inadequate, please comment and I'll try to add more later.
Finally, BigMPI is research code and I would not say it is finished (however, you should not hit the unfinished code). Users are strongly encouraged to perform their own correctness testing of BigMPI and the MPI implementation prior to use in production.