Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can ASIO timer `cancel()` call a spurious "success"?

The ASIO documentation for basic_deadline_timer::cancel() has the following remarks section:

If the timer has already expired when cancel() is called, then the handlers for asynchronous wait operations will:

  • have already been invoked; or
  • have been queued for invocation in the near future.

These handlers can no longer be cancelled, and therefore are passed an error code that indicates the successful completion of the wait operation.

The emphasis has been added by me. Normally when you call cancel() on a timer, the callback is run with an error code of "operation cancelled by user". But this says there is a small chance it could actually be called with a success error code. I think it is trying to say that the following could happen:

  1. Thread A calls async_wait(myTimerHandler) on a timer, where myTimerHandler() is a user callback function.
  2. Thread B calls io_context::post(cancelMyTimer) where cancelMyTimer() is a user callback function. This is now queued up to be called in thread A.
  3. The timer deadline expires, so ASIO queues up the timer callback handler, with a success error code. It doesn't call it yet, but it is queued up to be called in thread A.
  4. ASIO gets round to calling cancelMyTimer() in thread A, which calls cancel() on the timer. But the timer already fired, and ASIO doesn't check that the handler is still queued up and not executed, so this does nothing.
  5. ASIO now calls myTimerHandler, and doesn't check that cancel() was called in the meantime, and so it still passes success as the error code.

Bear in mind this example only has a single thread calling io_context::run(), deadline_timer::async_wait or deadline_timer::cancel(). The only thing that happened in another thread was a call to post(), which happened in an attempt to avoid any race conditions. Is this sequence of events possible? Or is it referring to some multithreading scenario (that seems unlikely given that timers are not thread safe)?

Context: If you have a timer that you wish to repeat periodically, then the obvious thing to do is check the error code in the callback, and set the timer again if the code is success. If the above race is possible, then it would be necessary to have a separate variable saying whether you cancelled the timer, which you update in addition to calling cancel().

like image 614
Arthur Tacca Avatar asked Oct 20 '25 17:10

Arthur Tacca


2 Answers

You don't even need a second thread to run into a situation where basic_waitable_timer::cancel() is invoked too late (because the timer's (completion) handler is already queued).

It's sufficient that your program executes some other asynchronous operations concurrently to the not yet resumed basic_waitable_timer::async_wait(). If you then only rely on basic_waitable_timer::cancel() for cancellation then the cancel() call from another asynchronous (completion) handler races with an already scheduled async_wait() handler:

If the timer has already expired when cancel() is called, then the handlers for asynchronous wait operations will:

  • have already been invoked; or
  • have been queued for invocation in the near future.

These handlers can no longer be cancelled, and therefore are passed an error code that indicates the successful completion of the wait operation.

(basic_waitable_timer::cancel(), emphasis mine, i.e. the race condition is due to the second case)

A real-world example that is single-threaded (i.e. the program doesn't explicitly start any threads and only invokes io_server.run() once) and contains the described race:

    void Fetch_Timer::resume()
    {
      timer_.expires_from_now(std::chrono::seconds(1));
      timer_.async_wait([this](const boost::system::error_code &ec)
          {
            BOOST_LOG_FUNCTION();
            if (ec) {
              if (ec.value() == boost::asio::error::operation_aborted)
                return;
              THROW_ERROR(ec);
            } else {
              print();
              resume();
            }
          });
    }
    void Fetch_Timer::stop()
    {
      print();
      timer_.cancel();
    }

(Source: imapdl/copy/fetch_timer.cc)

In this example, the obvious fix (i.e. also querying a boolean flag) doesn't even need to use any synchronization primitives (such as atomics), because the program is single-threaded. That means it executes (asynchronous) operations concurrently but not in parallel.

(FWIW, in the above example, the bug manifested itself only every 2 years or so, even under daily usage)

like image 92
maxschlepzig Avatar answered Oct 25 '25 09:10

maxschlepzig


Everything you stated is correct. So in your situation you could need a separate variable to indicate you don’t want to continue the loop. I normally used a atomic_bool and I don’t bother posting a cancel routine, I just set the bool & call cancel from whatever thread I am on.

UPDATE:

The source for my answer is mainly experience in using ASIO for years and for understanding the asio codebase enough to fix problems and extend parts of it when required.

Yes the documentation says that the it's not thread safe between shared instances of the deadline_timer, but the documentation is not the best (what documentation is...). If you look at the source for how the "cancel" works we can see:

Boost Asio version 1.69: boost\asio\detail\impl\win_iocp_io_context.hpp

template <typename Time_Traits>
std::size_t win_iocp_io_context::cancel_timer(timer_queue<Time_Traits>& queue,
    typename timer_queue<Time_Traits>::per_timer_data& timer,
    std::size_t max_cancelled)
{
  // If the service has been shut down we silently ignore the cancellation.
  if (::InterlockedExchangeAdd(&shutdown_, 0) != 0)
    return 0;

  mutex::scoped_lock lock(dispatch_mutex_);
  op_queue<win_iocp_operation> ops;
  std::size_t n = queue.cancel_timer(timer, ops, max_cancelled);
  post_deferred_completions(ops);
  return n;
}

You can see that the cancel operation is guarded by a mutex lock so the "cancel" operation is thread safe.

Calling most of the other operations on deadline timer is not (in regards to calling them at the same time from multiple threads).

Also I think you are correct about the restarting of timers in quick order. I don't normally have a use case for stopping and starting timers in that sort of fashion, so I've never needed to do that.

like image 21
Shane Powell Avatar answered Oct 25 '25 09:10

Shane Powell