I'm trying to cancel a boost::asio::basic_waitable_timer<std::chrono::steady_clock> safely.
According to this answer, the following code should do the job:
timer.get_io_service().post([&]{timer.cancel();});
I'm afraid it doesn't work for me.
Am I doing something wrong?
This is my code:
#include <iostream>
#include "boost/asio.hpp"
#include <atomic>
#include <chrono>
#include <thread>
#include <random>

boost::asio::io_service io_service;
boost::asio::basic_waitable_timer<std::chrono::steady_clock> timer(io_service);
std::atomic<bool> started;

void handle_timeout(const boost::system::error_code& ec)
{
    if (!ec) {
        started = true;
        std::cerr << "tid: " << std::this_thread::get_id() << ", handle_timeout\n";
        timer.expires_from_now(std::chrono::milliseconds(10));
        timer.async_wait(&handle_timeout);
    } else if (ec == boost::asio::error::operation_aborted) {
        std::cerr << "tid: " << std::this_thread::get_id() << ", handle_timeout aborted\n";
    } else {
        std::cerr << "tid: " << std::this_thread::get_id() << ", handle_timeout another error\n";
    }
}

int main() {
    std::cout << "tid: " << std::this_thread::get_id() << ", Hello, World!" << std::endl;
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dis(1, 100);
    for (auto i = 0; i < 1000; i++) {
        started = false;
        std::thread t([&](){
            timer.expires_from_now(std::chrono::milliseconds(0));
            timer.async_wait(&handle_timeout);
            io_service.run();
        });
        while (!started) {};
        auto sleep = dis(gen);
        std::cout << "tid: " << std::this_thread::get_id() << ", i: " << i << ", sleeps for " << sleep << " [ms]" << std::endl;
        std::this_thread::sleep_for(std::chrono::milliseconds(sleep));
        timer.get_io_service().post([](){
            std::cerr << "tid: " << std::this_thread::get_id() << ", cancelling in post\n";
            timer.cancel();
        });
        // timer.cancel();
        std::cout << "tid: " << std::this_thread::get_id() << ", i: " << i << ", waiting for thread to join()" << std::endl;
        t.join();
        io_service.reset();
    }
    return 0;
}
This is the output:
...
tid: 140737335076608, handle_timeout
tid: 140737335076608, handle_timeout
tid: 140737353967488, i: 2, waiting for thread to join()
tid: 140737335076608, cancelling in post
tid: 140737335076608, handle_timeout aborted
tid: 140737353967488, i: 3, sleeps for 21 [ms]
tid: 140737335076608, handle_timeout
tid: 140737353967488, i: 3, waiting for thread to join()
tid: 140737335076608, handle_timeout
tid: 140737335076608, cancelling in post
tid: 140737335076608, handle_timeout
tid: 140737335076608, handle_timeout
tid: 140737335076608, handle_timeout
tid: 140737335076608, handle_timeout
tid: 140737335076608, handle_timeout
...
(continues forever...)
As you can see, timer.cancel() is being called from the appropriate thread:
tid: 140737335076608, cancelling in post
BUT there's no
tid: 140737335076608, handle_timeout aborted
afterwards. Main waits forever.
The cancellation is safe; it's just not robust. You didn't account for the case where the timer has no wait pending at the moment cancel() runs. In that case the cancel has no effect, and the completion handler simply starts a new async wait once it is invoked.
What follows are the detailed steps I took to trace the issue.
SUMMARY TL;DR
Cancelling a timer only cancels asynchronous operations in flight.
If you want to shut down an asynchronous call chain, you'll have to use additional logic for that. An example is given below.
Enabling handler tracking (the define must appear before the first Asio header is included) with
#define BOOST_ASIO_ENABLE_HANDLER_TRACKING 1
This produces output that can be visualized with boost/libs/asio/tools/handlerviz.pl:
As you can see, the async_wait is in-flight when the cancellation happens.
(truncated because it would run infinitely)
Note how the completion handler sees cc=system:0, not cc=system:125 (for operation_aborted). This is a symptom of the fact that the posted cancel did not actually "take". The only logical explanation (not visible in the diagram) is that the timer had already expired before the cancel gets invoked.
Let's compare the raw traces¹
¹ removing the noisy difference
So, we have a lead. Can we detect it?
timer.get_io_service().post([](){
    std::cerr << "tid: " << std::this_thread::get_id() << ", cancelling in post\n";
    if (timer.expires_from_now() >= std::chrono::steady_clock::duration(0)) {
        timer.cancel();
    } else {
        std::cout << "PANIC\n";
        timer.cancel();
    }
});
Prints:
tid: 140113177143232, i: 0, waiting for thread to join()
tid: 140113177143232, i: 1, waiting for thread to join()
tid: 140113177143232, i: 2, waiting for thread to join()
tid: 140113177143232, i: 3, waiting for thread to join()
tid: 140113177143232, i: 4, waiting for thread to join()
tid: 140113177143232, i: 5, waiting for thread to join()
tid: 140113177143232, i: 6, waiting for thread to join()
tid: 140113177143232, i: 7, waiting for thread to join()
tid: 140113177143232, i: 8, waiting for thread to join()
tid: 140113177143232, i: 9, waiting for thread to join()
tid: 140113177143232, i: 10, waiting for thread to join()
tid: 140113177143232, i: 11, waiting for thread to join()
tid: 140113177143232, i: 12, waiting for thread to join()
tid: 140113177143232, i: 13, waiting for thread to join()
tid: 140113177143232, i: 14, waiting for thread to join()
tid: 140113177143232, i: 15, waiting for thread to join()
tid: 140113177143232, i: 16, waiting for thread to join()
tid: 140113177143232, i: 17, waiting for thread to join()
tid: 140113177143232, i: 18, waiting for thread to join()
tid: 140113177143232, i: 19, waiting for thread to join()
tid: 140113177143232, i: 20, waiting for thread to join()
tid: 140113177143232, i: 21, waiting for thread to join()
tid: 140113177143232, i: 22, waiting for thread to join()
tid: 140113177143232, i: 23, waiting for thread to join()
tid: 140113177143232, i: 24, waiting for thread to join()
tid: 140113177143232, i: 25, waiting for thread to join()
tid: 140113177143232, i: 26, waiting for thread to join()
PANIC
Could we communicate the "super-cancellation" in another, clearer way? We have ... just the timer object to work with, of course:
The timer object doesn't have a lot of properties to work with. There's no close() or similar, like on a socket, that can be used to put the timer in some kind of invalid state.
However, there's the expiry timepoint, and we can use a special domain value to signal "invalid" for our application:
timer.get_io_service().post([](){
    std::cerr << "tid: " << std::this_thread::get_id() << ", cancelling in post\n";
    // Setting a new expiry also cancels any pending wait.
    // (Timer is assumed here to be an alias for the timer's type.)
    timer.expires_at(Timer::time_point::min());
});
This "special value" is easy to handle in the completion handler:
void handle_timeout(const boost::system::error_code& ec)
{
    if (!ec) {
        started = true;
        if (timer.expires_at() != Timer::time_point::min()) {
            timer.expires_from_now(std::chrono::milliseconds(10));
            timer.async_wait(&handle_timeout);
        } else {
            std::cerr << "handle_timeout: detected shutdown\n";
        }
    }
    else if (ec != boost::asio::error::operation_aborted) {
        std::cerr << "tid: " << std::this_thread::get_id() << ", handle_timeout error " << ec.message() << "\n";
    }
}