Boost's ASIO dispatcher seems to have a serious problem, and I can't find a workaround. The symptom is that the only thread waiting to dispatch is left in pthread_cond_wait, even though there are I/O operations pending that require it to block in epoll_wait.
I can most easily replicate this issue by having one thread call poll_one in a loop until it returns zero. This can leave the thread calling run stuck in pthread_cond_wait while the thread calling poll_one breaks out of the loop. Presumably the io_service is expecting that thread to return and block in epoll_wait, but it is under no obligation to do so, and that expectation seems fatal.
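For concreteness, here is a minimal sketch of that pattern (the timer, the helper names, and the timings are illustrative only; because it depends on a race, it does not hang deterministically):

#include <boost/asio/io_service.hpp>
#include <boost/asio/steady_timer.hpp>
#include <boost/bind.hpp>
#include <boost/chrono.hpp>
#include <boost/thread.hpp>

void on_timeout() {} // no-op completion handler

// Drain whatever handlers are currently ready, then leave the io_service
// for good.  This thread never returns to the io_service.
void poll_until_idle(boost::asio::io_service& io_service)
{
  while (io_service.poll_one() > 0)
    ;
}

int main()
{
  boost::asio::io_service io_service;

  // Keep asynchronous work pending so run() needs the reactor (epoll_wait).
  boost::asio::steady_timer timer(io_service);
  timer.expires_from_now(boost::chrono::seconds(5));
  timer.async_wait(boost::bind(&on_timeout));

  // One thread calls poll_one() in a loop until it returns zero, then
  // moves on to other work.
  boost::thread poller(boost::bind(&poll_until_idle, boost::ref(io_service)));

  // The only remaining thread calls run(); with the behaviour described
  // above it can end up parked in pthread_cond_wait instead of epoll_wait.
  io_service.run();

  poller.join();
}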
Is there a requirement that threads be statically associated with io_services?
Here's a backtrace showing the deadlock. This is the only thread still handling this io_service, because the others have moved on, and there are definitely socket operations pending:
#0 pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 boost::asio::detail::posix_event::wait<boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex> > (...) at /usr/include/boost/asio/detail/posix_event.hpp:80
#2 boost::asio::detail::task_io_service::do_run_one (...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:405
#3 boost::asio::detail::task_io_service::run (...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:146
I believe the bug is as follows: if the thread servicing the I/O queue is also the thread that owns the I/O socket readiness check (the reactor) and it returns from a dispatch function, then it must signal any other thread blocked on the io_service so that one of them can take over the readiness check. Currently it only signals when there are handlers ready to run at that time, which can leave no thread checking for socket readiness.
This is a bug. I have been able to duplicate it by adding a delay into the non-critical section of task_io_service::do_poll_one. Here is a snippet of the modified task_io_service::do_poll_one() in boost/asio/detail/impl/task_io_service.ipp. The only line added is the sleep.
std::size_t task_io_service::do_poll_one(mutex::scoped_lock& lock,
    task_io_service::thread_info& this_thread,
    const boost::system::error_code& ec)
{
  if (stopped_)
    return 0;

  operation* o = op_queue_.front();
  if (o == &task_operation_)
  {
    op_queue_.pop();
    lock.unlock();

    {
      task_cleanup c = { this, &lock, &this_thread };
      (void)c;

      // Run the task. May throw an exception. Only block if the operation
      // queue is empty and we're not polling, otherwise we want to return
      // as soon as possible.
      task_->run(false, this_thread.private_op_queue);
      boost::this_thread::sleep_for(boost::chrono::seconds(3));
    }

    o = op_queue_.front();
    if (o == &task_operation_)
      return 0;
  }
...
My test driver is fairly basic: add an asynchronous work loop to the io_service. Spawn a thread to poll the io_service, and have main call io_service::run() while the poll thread sleeps in task_io_service::do_poll_one().

Test code:
#include <iostream>

#include <boost/asio/io_service.hpp>
#include <boost/asio/steady_timer.hpp>
#include <boost/bind.hpp>
#include <boost/chrono.hpp>
#include <boost/thread.hpp>

boost::asio::io_service io_service;
boost::asio::steady_timer timer(io_service);

void arm_timer()
{
  std::cout << ".";
  std::cout.flush();
  timer.expires_from_now(boost::chrono::seconds(3));
  timer.async_wait(boost::bind(&arm_timer));
}

int main()
{
  // Add asynchronous work loop.
  arm_timer();

  // Spawn poll thread.
  boost::thread poll_thread(
      boost::bind(&boost::asio::io_service::poll, boost::ref(io_service)));

  // Give time for the poll thread to service the reactor.
  boost::this_thread::sleep_for(boost::chrono::seconds(1));

  io_service.run();
}
And the debug:
[twsansbury@localhost bug]$ gdb a.out
...
(gdb) r
Starting program: /home/twsansbury/dev/bug/a.out
[Thread debugging using libthread_db enabled]
.[New Thread 0xb7feeb90 (LWP 31892)]
[Thread 0xb7feeb90 (LWP 31892) exited]
At this point, arm_timer() has printed "." once (when it was initially armed). The poll thread serviced the reactor in a non-blocking manner and then slept for 3 seconds while op_queue_ was empty (task_operation_ is added back to op_queue_ when task_cleanup c exits scope). While op_queue_ was empty, the main thread called io_service::run(), saw that op_queue_ was empty, made itself the first_idle_thread_, and waited on its wakeup_event. The poll thread then finished sleeping and returned 0, leaving the main thread waiting on wakeup_event.
After waiting ~10 seconds, plenty of time for the arm_timer() timer to become ready, I interrupt the debugger:
Program received signal SIGINT, Interrupt.
0x00919402 in __kernel_vsyscall ()
(gdb) bt
#0  0x00919402 in __kernel_vsyscall ()
#1  0x0081bbc5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x00763b3d in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libc.so.6
#3  0x08059dc2 in void boost::asio::detail::posix_event::wait<boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex> >(boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>&) ()
#4  0x0805a009 in boost::asio::detail::task_io_service::do_run_one(boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>&, boost::asio::detail::task_io_service_thread_info&, boost::system::error_code const&) ()
#5  0x0805a11c in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
#6  0x0805a1e2 in boost::asio::io_service::run() ()
#7  0x0804db78 in main ()
The side-by-side timeline is as follows:
poll thread                             | main thread
----------------------------------------+---------------------------------------
lock()                                  |
do_poll_one()                           |
|-- pop task_operation_ from            |
|   op_queue_                           |
|-- unlock()                            | lock()
|-- create task_cleanup                 | do_run_one()
|-- service reactor (non-block)         | `-- op_queue_ is empty
|-- ~task_cleanup()                     |     |-- set thread as idle
|   |-- lock()                          |     `-- unlock()
|   `-- op_queue_.push(                 |
|       task_operation_)                |
`-- task_operation_ is                  |
    op_queue_.front()                   |
`-- return 0                            | // still waiting on wakeup_event
unlock()                                |
As best as I can tell, there are no side effects from patching:
if (o == &task_operation_)
  return 0;

to:

if (o == &task_operation_)
{
  if (!one_thread_)
    wake_one_thread_and_unlock(lock);
  return 0;
}
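For anyone who cannot rebuild Boost with the patch, one possible user-level mitigation (a sketch only, not part of the submitted fix; the names leave_io_service and no_op are hypothetical) is to post a no-op handler from the polling thread once it is finished with the io_service. Posting wakes an idle thread if one is waiting, and after running the no-op that thread resumes ownership of the reactor:

#include <boost/asio/io_service.hpp>
#include <boost/bind.hpp>

namespace {
void no_op() {} // hypothetical empty completion handler
}

// Hypothetical helper: call from the thread that has been draining the
// io_service with poll()/poll_one() and is about to stop servicing it.
void leave_io_service(boost::asio::io_service& io_service)
{
  // Drain whatever handlers are currently ready.
  while (io_service.poll_one() > 0)
    ;

  // Post a no-op handler. post() wakes an idle thread if one is waiting on
  // its wakeup_event; that thread runs the no-op, then picks up
  // task_operation_ again and ends up blocked in epoll_wait rather than
  // pthread_cond_wait.
  io_service.post(boost::bind(&no_op));
}

This only papers over the race in the scenario described above; the wake_one_thread_and_unlock() change remains the proper fix.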
Regardless, I have submitted a bug report and a fix. Consider keeping an eye on the ticket for an official response.