Is std::atomic_compare_exchange_weak thread-unsafe by design?

Tags: c++, c++11, atomic

It was brought up on the cppreference atomic_compare_exchange Talk page that the existing implementations of std::atomic_compare_exchange_weak compute the boolean result of the CAS with a non-atomic compare instruction, e.g.

    lock
    cmpxchgq   %rcx, (%rsp)
    cmpq       %rdx, %rax

which (Edit: apologies for the red herring) breaks CAS loops such as the one in Concurrency in Action's listing 7.2:

    while(!head.compare_exchange_weak(new_node->next, new_node));

The specification (29.6.5[atomics.types.operations.req]/21-22) seems to imply that the result of the comparison must be a part of the atomic operation:

Effects: atomically compares ...

Returns: the result of the comparison

but is it actually implementable? Should we file bug reports to the vendors or to the LWG?

asked Feb 19 '14 by Cubbi

2 Answers

TL;DR: atomic_compare_exchange_weak is safe by design, but actual implementations are buggy.

Here's the code that Clang actually generates for this little snippet:

#include <atomic>

struct node {
  int data;
  node* next;
};

std::atomic<node*> head;

void push(int data) {
  node* new_node = new node{data};
  new_node->next = head.load(std::memory_order_relaxed);
  while (!head.compare_exchange_weak(new_node->next, new_node,
      std::memory_order_release, std::memory_order_relaxed)) {}
}

Result:

  movl  %edi, %ebx
  # Allocate memory
  movl  $16, %edi
  callq _Znwm
  movq  %rax, %rcx
  # Initialize with data and 0
  movl  %ebx, (%rcx)
  movq  $0, 8(%rcx) ; dead store, should have been optimized away
  # Overwrite next with head.load
  movq  head(%rip), %rdx
  movq  %rdx, 8(%rcx)
  .align  16, 0x90
.LBB0_1:                                # %while.cond
                                        # =>This Inner Loop Header: Depth=1
  # put value of head into comparand/result position
  movq  %rdx, %rax
  # atomic operation here, compares second argument to %rax, stores first argument
  # in second if same, and second in %rax otherwise
  lock
  cmpxchgq  %rcx, head(%rip)
  # unconditionally write old value back to next - wait, what?
  movq  %rax, 8(%rcx)
  # check if cmpxchg modified the result position
  cmpq  %rdx, %rax
  movq  %rax, %rdx
  jne .LBB0_1

The comparison is perfectly safe: it's just comparing registers. However, the whole operation is not safe.

The critical point is this: the description of compare_exchange_(weak|strong) says:

Atomically [...] if true, replaces the contents of the memory pointed to by this with that in desired, and if false, updates the contents of the memory in expected with the contents of the memory pointed to by this

Or in pseudo-code:

if (*this == expected)
  *this = desired;
else
  expected = *this;

Note that expected is only written to if the comparison is false, and *this is only written to if the comparison is true. The abstract model of C++ does not allow an execution where both are written to. This is important for the correctness of push above: if the write to head happens (i.e. the CAS succeeds), new_node suddenly points to a location that is visible to other threads, which means other threads can start reading next (by accessing head->next). If the write to expected (which aliases new_node->next) also happens, that's a data race.

And Clang writes to new_node->next unconditionally. In the case where the comparison is true, that's an invented write.
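
To see the race concretely, consider a hypothetical reader thread (my illustration, not from the question; it assumes the node and head definitions above) that would be perfectly legal against a correct push:

    void reader() {
        // Legal concurrent traversal: every node reachable from head has
        // already been published by a successful CAS in push().
        for (node* p = head.load(std::memory_order_acquire); p != nullptr;
             p = p->next) {      // plain read of p->next, no synchronization
            // ... use p->data ...
        }
    }

If push's CAS succeeds and the generated code then re-writes new_node->next, that store races with this reader's load of p->next. The stored value happens to be unchanged, but it is still a data race, and therefore undefined behavior.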

This is a bug in Clang. I don't know whether GCC does the same thing.

In addition, the wording of the standard is suboptimal. It claims that the entire operation must happen atomically, but that is impossible: expected is not an atomic object, so the write to it cannot happen atomically. What the standard should say is that the comparison and the write to *this happen atomically, while the write to expected does not. But this isn't that bad, because no one really expects that write to be atomic anyway.
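
To pin down the intended semantics, here is a minimal mutex-based reference model (mine, purely illustrative; a real implementation uses a hardware CAS, not a lock). The comparison and the store to the atomic object form one indivisible step; the store to expected happens outside that step, and only on failure:

    #include <mutex>

    template <class T>
    struct ref_atomic {
        T value;
        std::mutex m;  // models the indivisibility of the hardware CAS

        bool compare_exchange(T& expected, T desired) {
            T observed;
            {
                std::lock_guard<std::mutex> lock(m);  // the atomic step
                if (value == expected) {
                    value = desired;  // success: only *this is written,
                    return true;      // expected is never touched
                }
                observed = value;
            }
            expected = observed;  // failure: only expected is written,
            return false;         // non-atomically, after the atomic step
        }
    };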

So there should be a bug report for Clang (and possibly GCC), and a defect report for the standard.
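
In the meantime, a user-level workaround (my suggestion, not part of the answer) is to keep expected in a local variable that nothing else can alias, and copy it into the shared field only while the node is still unpublished:

    void push_workaround(int data) {
        node* new_node = new node{data};
        node* expected = head.load(std::memory_order_relaxed);
        do {
            new_node->next = expected;  // written only by this thread, and
                                        // only while the node is unpublished
        } while (!head.compare_exchange_weak(expected, new_node,
                     std::memory_order_release, std::memory_order_relaxed));
        // Any spurious write-back after a successful CAS now hits the local
        // 'expected', which no other thread can observe.
    }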

answered Oct 18 '22 by Sebastian Redl

I was the one who originally found this bug. For the last few days I have been e-mailing Anthony Williams regarding this issue and vendor implementations. I didn't realize Cubbi had raised a Stack Overflow question. It's not just Clang or GCC: every vendor that matters is broken. Anthony Williams, also the author of Just::Thread (a C++11 thread and atomics library), confirmed that his library is implemented correctly (the only known correct implementation).

Anthony has raised a GCC bug report http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60272

Simple example:

    #include <atomic>

    struct Node { Node* next; };

    void Push(std::atomic<Node*>& head, Node* node)
    {
        node->next = head.load();
        while(!head.compare_exchange_weak(node->next, node))
            ;
    }

g++ 4.8 [assembler]

       mov    rdx, rdi
       mov    rax, QWORD PTR [rdi]
       mov    QWORD PTR [rsi], rax
   .L3:
       mov    rax, QWORD PTR [rsi]
       lock cmpxchg    QWORD PTR [rdx], rsi
       mov    QWORD PTR [rsi], rax    ; !!! unconditional write back to node->next
       jne    .L3
       rep; ret

clang 3.3 [assembler]

       movq    (%rdi), %rcx
       movq    %rcx, (%rsi)
   .LBB0_1:
       movq    %rcx, %rax
       lock
       cmpxchgq    %rsi, (%rdi)
       movq    %rax, (%rsi)    ; !!! unconditional write back to node->next
       cmpq    %rcx, %rax      ; !!! result recomputed with a separate compare
       movq    %rax, %rcx
       jne    .LBB0_1
       ret

icc 13.0.1 [assembler]

       movl      %edx, %ecx
       movl      (%rsi), %r8d
       movl      %r8d, %eax
       lock
       cmpxchg   %ecx, (%rdi)
       movl      %eax, (%rsi) !!!!!!!!!!!!!!!!!!!!!!!
       cmpl      %eax, %r8d !!!!!!!!!!!!!!!!!!!!!!!
       je        ..B1.7
   ..B1.4:
       movl      %edx, %ecx
       movl      %eax, %r8d
       lock
       cmpxchg   %ecx, (%rdi)
       movl      %eax, (%rsi) !!!!!!!!!!!!!!!!!!!!!!!
       cmpl      %eax, %r8d !!!!!!!!!!!!!!!!!!!!!!!
       jne       ..B1.4
   ..B1.7:
       ret

Visual Studio 2012 [No need to check assembler, MS uses _InterlockedCompareExchange !!!]

    inline int _Compare_exchange_seq_cst_4(volatile _Uint4_t *_Tgt, _Uint4_t *_Exp, _Uint4_t _Value)
    {   /* compare and exchange values atomically with
           sequentially consistent memory order */
        int _Res;
        _Uint4_t _Prev = _InterlockedCompareExchange((volatile long *)_Tgt, _Value, *_Exp);
        if (_Prev == *_Exp) /* !!! second, non-atomic read of *_Exp after the CAS */
            _Res = 1;
        else
        {   /* copy old value */
            _Res = 0;
            *_Exp = _Prev;
        }
        return (_Res);
    }
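
The fix is straightforward: read *_Exp exactly once and derive the result from that single read. A sketch of a corrected version (my own, not Microsoft's actual fix):

    inline int _Compare_exchange_seq_cst_4_fixed(volatile _Uint4_t *_Tgt,
                                                 _Uint4_t *_Exp, _Uint4_t _Value)
    {   /* compare and exchange values atomically with
           sequentially consistent memory order */
        _Uint4_t _Old = *_Exp;  /* single read of the expected value */
        _Uint4_t _Prev = _InterlockedCompareExchange((volatile long *)_Tgt,
                                                     _Value, _Old);
        if (_Prev == _Old)
            return 1;           /* success: *_Exp is not written */
        *_Exp = _Prev;          /* failure: copy old value */
        return 0;
    }
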
answered Oct 18 '22 by duncan.forster