I've got an MPI program consisting of one master process that hands off commands to a bunch of slave processes. Upon receiving a command, a slave just calls system() to do it. While the slaves are waiting for a command, they are consuming 100% of their respective CPUs. It appears that Probe() is sitting in a tight loop, but that's only a guess. What do you think might be causing this, and what could I do to fix it?
Here's the code in the slave process that waits for a command. Watching the log and the top command at the same time suggests that when the slaves are consuming their CPUs, they are inside this function.
MpiMessage
Mpi::BlockingRecv() {
  LOG(8, "BlockingRecv");

  MpiMessage result;
  MPI::Status status;

  // Block until any message is available, then record its envelope.
  MPI::COMM_WORLD.Probe(MPI::ANY_SOURCE, MPI::ANY_TAG, status);
  result.source = status.Get_source();
  result.tag = status.Get_tag();

  // Size the buffer from the probed message, leaving room for a terminator.
  int num_elems = status.Get_count(MPI::CHAR);
  char buf[num_elems + 1];
  MPI::COMM_WORLD.Recv(buf, num_elems, MPI::CHAR, result.source, result.tag);
  buf[num_elems] = '\0';  // Recv() does not null-terminate the payload

  result.data = buf;

  LOG(7, "BlockingRecv about to return (%d, %d)", result.source, result.tag);
  return result;
}
It sounds like the way to wait for an MPI message without pegging the CPU is to call MPI_Iprobe() in a loop, with a sleep call between polls, and only call the blocking receive once a message is actually pending (see the sketch below). I find a 100ms sleep is responsive enough for my tasks and still keeps CPU usage minimal when a worker is idle. I did some searching and found that a busy wait is what you want only if you are not sharing your processors with other tasks.
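Here's a minimal sketch of what that polling loop could look like inside the BlockingRecv() from the question; the 100ms interval and the usleep() call are my choices, while Iprobe() is the standard non-blocking probe in the MPI C++ bindings:

#include <unistd.h>  // usleep()

// Inside Mpi::BlockingRecv(), replace the blocking Probe() call with
// a polling loop that yields the CPU between checks:
MPI::Status status;
while (!MPI::COMM_WORLD.Iprobe(MPI::ANY_SOURCE, MPI::ANY_TAG, status)) {
  usleep(100 * 1000);  // sleep 100ms between polls; tune for responsiveness
}
// status is now filled in exactly as Probe() would have filled it, so the
// Get_source()/Get_tag()/Recv() sequence that follows works unchanged.

The trade-off is latency: a message that arrives just after a poll waits up to one full sleep interval before it is received, which is why the interval is worth tuning per workload.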
Yes; most MPI implementations busy-wait on blocking operations for the sake of performance. The assumption is that the MPI job is the only thing we care about on the processor, and if a task is blocked waiting for communications, the best thing to do is to poll continually so that there is virtually no delay between when the message arrives and when it's handed off to the MPI task. This typically means the CPU is pegged at 100% even when nothing "real" is being done.
That's probably the best default behaviour for most MPI users, but it isn't always what you want. Most MPI implementations let you turn it off; with Open MPI, you can disable busy-waiting with an MCA parameter:
mpirun -np N --mca mpi_yield_when_idle 1 ./a.out
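If I recall the Open MPI convention correctly, MCA parameters can also be set through the environment rather than on the mpirun command line, which is convenient in batch scripts:

export OMPI_MCA_mpi_yield_when_idle=1   # same effect as the --mca flag above
mpirun -np N ./a.out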