I have a problem with Boost Asio on OS X, where the io_service destructor sometimes hangs indefinitely. I have a relatively simple repro case:
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <ctime>       // std::time_t, std::tm, gmtime_r
#include <sys/time.h>  // timeval, gettimeofday

int main(int argc, char* argv[]) {
    timeval tv;
    gettimeofday(&tv, 0);
    std::time_t t = tv.tv_sec;
    std::tm curr;
    // The call to gmtime_r _seems_ innocent, but I cannot reproduce without this
    std::tm* curr_ptr = gmtime_r(&t, &curr);
    {
        boost::asio::io_service ioService;
        boost::asio::deadline_timer timer(ioService);
        ioService.post([&](){
            // This will also call gmtime_r, but just calling that is not enough
            timer.expires_from_now(boost::posix_time::milliseconds(1));
            timer.async_wait([](const boost::system::error_code &) {});
        });
        ioService.post([&](){
            ioService.post([&](){});
        });
        // Run some threads
        boost::thread_group workers;
        for (auto i=0; i<3; ++i) {
            workers.create_thread([&](){ ioService.run(); });
        }
        workers.join_all();
    } // hangs here in the io_service destructor
    return 0;
}
Basically, this just posts two handlers on the queue, one of which schedules a timer and the other just posts another handler. Sometimes this simple program causes the io_service destructor to hang indefinitely, in particular in the pipe_select_interrupter destructor during the kqueue_reactor destruction. This blocks in the system call close() on the pipe read descriptor.
To trigger the error I invoke the program in a loop using a shell script (but it is possible to trigger using a loop in the example above as well):
#!/bin/csh
set yname="foo"
while ( $yname != "" )
date
./hangtest
end
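The script above keeps going until someone notices that a run never returns. A small watchdog can detect that automatically. The following is only a sketch under stated assumptions: it uses plain POSIX calls (fork, execvp, waitpid, kill), the names RunResult, run_with_timeout and first_hanging_run are made up for illustration, and the timeout is an arbitrary choice. Note also that a process stuck in the close() call may ignore SIGKILL, so the sketch never blocks waiting for the child to die.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <chrono>
#include <string>
#include <thread>
#include <vector>

// Outcome of one watched run (illustrative names, not from the question).
enum class RunResult { Exited, TimedOut, SpawnFailed };

// Starts args[0] with the given arguments and polls for its exit
// until the timeout expires.
RunResult run_with_timeout(const std::vector<std::string>& args,
                           std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    if (args.empty()) return RunResult::SpawnFailed;

    // Build a null-terminated argv array for execvp.
    std::vector<char*> argv;
    for (const auto& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return RunResult::SpawnFailed;
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // only reached if exec failed; the parent sees a normal exit
    }

    int status = 0;
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) return RunResult::Exited;
        if (r < 0 && errno != EINTR) return RunResult::SpawnFailed;
        if (std::chrono::steady_clock::now() >= deadline) break;
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }

    // Timed out: request termination, then make one non-blocking reap attempt.
    // A child stuck inside a system call may not go away at all.
    kill(pid, SIGKILL);
    waitpid(pid, &status, WNOHANG);
    return RunResult::TimedOut;
}

// Repeats the run until one times out or max_runs is reached.
// Returns the 1-based index of the run that hung, or 0 if none did.
int first_hanging_run(const std::vector<std::string>& args,
                      std::chrono::milliseconds timeout, int max_runs) {
    for (int i = 1; i <= max_runs; ++i) {
        if (run_with_timeout(args, timeout) == RunResult::TimedOut) return i;
    }
    return 0;
}
```

Calling first_hanging_run({"./hangtest"}, std::chrono::milliseconds(5000), 1000) from a small main() would then report which iteration, if any, failed to finish within five seconds. The five-second limit is a guess. A normal run of the repro should finish in far less time.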
I am no longer able to reproduce if I:
- Remove the call to gmtime_r() in the beginning (!). Edit: This only appears to apply if I run using the script. If I instead add a loop in the program itself I can reproduce it without that call as well, as per the comment by ruslo.
- Remove the call to async_wait() on the timer in the handler or move the timer setup outside of the handler.
- Remove the call to post() in the second handler.
The hang seems to be related to kqueue_reactor::interrupt(). This function is invoked from both the async_wait() and the post(), and calls kevent() with the read descriptor that is then not possible to close.
Am I doing something wrong in the above code?
I am running on OS X 10.8.5 with Boost 1.54 and compiling with clang -stdlib=libc++ -std=c++11. I can also reproduce with Boost Asio from Boost 1.55 (with the rest of Boost 1.54 kept as-is).
Edit: I can reproduce on OS X 10.9.1 as well (using the same executable).
The fix for this was committed to the Asio master branch on April 29th, 2014, with this commit message:
Fix occasional close() system call hang on MacOS.
Repeated re-registration of kqueue event filters seems to behave as though there is some kind of "leak" on MacOS, culminating in a suspended close() system call and an unkillable process. To avoid this, we will register a descriptor's kqueue event filters once only i.e. when the descriptor is first created.