Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

std::atomic_bool for cancellation flag: is std::memory_order_relaxed the correct memory order?

I have a thread that reads from a socket and generates data. After every operation, the thread checks a std::atomic_bool flag to see if it must exit early.

In order to cancel the operation, I set the cancellation flag to true, then call join() on the worker thread object.

The code of the thread and the cancellation function looks something like this:

std::thread work_thread;
std::atomic_bool cancel_requested{false};

void thread_func()
{
   while(! cancel_requested.load(std::memory_order_relaxed))
      process_next_element();

}

void cancel()
{
    cancel_requested.store(true, std::memory_order_relaxed);
    work_thread.join();
}

Is std::memory_order_relaxed the correct memory order for this use of an atomic variable?

like image 843
Paul Belanger Avatar asked Dec 06 '18 14:12

Paul Belanger


2 Answers

As long as there is no dependency between cancel_requested flag and anything else, you should be safe.

The code as shown looks OK, assuming you use cancel_requested only to expedite the shutdown, but also have a provision for an orderly shutdown, such as a sentinel entry in the queue (and of course that the queue itself is synchronized).

Which means your code actually looks like this:

std::thread work_thread;
std::atomic_bool cancel_requested{false};
std::mutex work_queue_mutex;
std::condition_variable work_queue_filled_cond;
std::queue work_queue;

void thread_func()
{
    while(! cancel_requested.load(std::memory_order_relaxed))
    {
        std::unique_lock<std::mutex> lock(work_queue_mutex);
        work_queue_filled_cond.wait(lock, []{ return !work_queue.empty(); });
        auto element = work_queue.front();
        work_queue.pop();
        lock.unlock();
        if (element == exit_sentinel)
            break;
        process_next_element(element);
    }
}

void cancel()
{
    std::unique_lock<std::mutex> lock(work_queue_mutex);
    work_queue.push_back(exit_sentinel);
    work_queue_filled_cond.notify_one();
    lock.unlock();
    cancel_requested.store(true, std::memory_order_relaxed);
    work_thread.join();
}

And if we're that far, then cancel_requested may just as well become a regular variable, the code even becomes simpler.

std::thread work_thread;
bool cancel_requested = false;
std::mutex work_queue_mutex;
std::condition_variable work_queue_filled_cond;
std::queue work_queue;

void thread_func()
{
    while(true)
    {
        std::unique_lock<std::mutex> lock(work_queue_mutex);
        work_queue_filled_cond.wait(lock, []{ return cancel_requested || !work_queue.empty(); });
        if (cancel_requested)
            break;
        auto element = work_queue.front();
        work_queue.pop();
        lock.unlock();
        process_next_element(element);
    }
}

void cancel()
{
    std::unique_lock<std::mutex> lock(work_queue_mutex);
    cancel_requested = true;
    work_queue_filled_cond.notify_one();
    lock.unlock();
    work_thread.join();
}

memory_order_relaxed is generally hard to reason about, because it blurs the general notion of sequentially executing code. So the usefulness of it is very, very limited as Herb explains in his atomic weapons talk.

Note std::thread::join() by itself acts as a memory barrier between the two threads.

like image 153
rustyx Avatar answered Sep 28 '22 07:09

rustyx


Whether this code is correct depends on a lot of things. Most of all it depends on what exactly you mean by "correct". As far as I can tell, the bits of code that you show don't invoke undefined behavior (assuming your work_thread and cancel_requested are not actually initialized in the order your snippet above suggests as you would then have the thread potentially reading the uninitialized value of the atomic). If all you need to do is change the value of that flag and have the thread eventually see the new value at some point independent of whatever else may be going on, then std::memory_order_relaxed is sufficient.

However, I see that your worker thread calls a process_next_element() function. That suggests that there is some mechanism through which the worker thread receives elements to process. I don't see any way for the thread to exit when all elements have been processed. What does process_next_element() do when there's no next element available right away? Does it just return immediately? In that case you've got yourself a busy wait for more input or cancellation, which will work but is probably not ideal. Or does process_next_element() internally call some function that blocks until an element becomes available!? If that is the case, then cancelling the thread would have to involve first setting the cancellation flag and then doing whatever is needed to make sure the next element call your thread is potentially blocking on returns. In this case, it's potentially essential that the thread can never see the cancellation flag after the blocking call returns. Otherwise, you could potentially have the call return, go back into the loop, still read the old cancellation flag and then go call process_next_element() again. If process_next_element() is guaranteed to just return again, then you're fine. If that is not the case, you have a deadlock. So I believe it technically depends on what exactly process_next_element() does. One could imagine an implementation of process_next_element() where you would potentially need more than relaxed memory order. However, if you already have a mechanism for fetching new elements to process, why even use a separate cancellation flag? You could simply handle cancellation through that same mechanism, e.g., by having it return a next element with a special value or return no element at all to signal cancellation of processing and cause the thread to return instead of relying on a separate flag…

like image 31
Michael Kenzel Avatar answered Sep 28 '22 08:09

Michael Kenzel