Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does statement re-ordering apply to conditional/control statements?

As described in other posts, without any sort of volatile or std::atomic qualification, the compiler and/or processor is at liberty to re-order the sequence of statements (e.g. assignments):

// this code
int a = 2;
int b = 3;
int c = a;
b = c;

// could be re-ordered/re-written as the following by the compiler/processor
int c = 2;
int a = c;
int b = a;

However, are conditionals and control statements (e.g. if, while, for, switch, goto) also allowed to be re-ordered, or are they considered essentially a "memory fence"?

int* a = &previously_defined_int;
int b = *a;

if (a == some_predefined_ptr)
{
   a = some_other_predefined_ptr; // is this guaranteed to occur after "int b = *a"?
}

If the above statements could be re-ordered (e.g. store a in a temporary register, update a, then populate b by de-reference the "old" a in the temporary register), which I guess they could be while still satisfying the same "abstract machine" behavior in a single-threaded environment, then why aren't there problems when using locks/mutexes?

bool volatile locked = false; // assume on given implementation, "locked" reads/writes occur in 1 CPU instruction
                              // volatile so that compiler doesn't optimize out

void thread1(void)
{
    while (locked) {}
    locked = true;
    // do thread1 things // couldn't these be re-ordered in front of "locked = true"?
    locked = false;
}

void thread2(void)
{
    while (locked) {}
    locked = true;
    // do thread2 things // couldn't these be re-ordered in front of "locked = true"?
    locked = false;
}

Even if std::atomic was used, non-atomic statements can still be re-ordered around atomic statements, so that wouldn't help ensure that "critical section" statements (i.e. the "do threadX things") were contained within their intended critical section (i.e. between the locking/unlocking).


Edit: Actually, I realize the lock example doesn't actually have anything to do with the conditional/control statement question I asked. It would still be nice to have clarification on both of the questions asked though:

  • re-ordering within and around conditional/control statements
  • are locks/mutexes of the form given above robust?
    • so far the answer from the comments is, "no, because there is a race condition between the while() check and claiming the lock", but apart from that, I am also wondering about the placement of thread function code outside of the "critical section"
like image 259
abc Avatar asked Jun 14 '26 07:06

abc


2 Answers

The "as-if" rule ([intro.abstract]) is important to note here:

The semantic descriptions in this document define a parameterized nondeterministic abstract machine. This document places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below

Anything* can be reordered so long as the implementation can guarantee that the observable behavior of the resulting program is unchanged.

Thread synchronization constructs in general cannot be implemented properly without fences and preventing reordering. For example, the standard guarantees that lock/unlock operations on mutexes shall behave as atomic operations. Atomics also explicitly introduce fences, especially with respect to the memory_order specified. This means that statements depending on an (unrelaxed) atomic operation cannot be reordered, otherwise the observable behavior of the program may change.

[intro.races] talks a great deal about data races and ordering.

like image 139
AndyG Avatar answered Jun 16 '26 21:06

AndyG


Branches are very much the opposite of a memory fence in assembly. Speculative execution + out-of-order exec means that control dependencies are not data dependencies, so for example if(x) tmp = y; can load y without waiting for a cache miss on x. See memory model, how load acquire semantic actually works?

Of course in C++, this just means that no, if() doesn't help. The definitions of (essentially deprecated) memory_order_consume might even specify that if isn't a data dependency. Real compilers promote it to acquire because it's too hard to implement as originally specified, though.

So TL:DR: you still need atomics with mo_acquire and mo_release (or stronger) if you want to establish a happens-before between two threads. Using a relaxed variable in an if() doesn't help at all, and in fact makes reordering in practice on real CPUs easier.

And of course non-atomic variables aren't safe without synchronization. An if(data_ready.load(acquire)) is sufficient to protect access to a non-atomic variable, though. So is a mutex; mutex lock/unlock count as acquire and release operations on the mutex object, according to C++ definitions. (Many practical implementations involve full barriers, but formally C++ only guarantees acq and rel for mutexes)

like image 32
Peter Cordes Avatar answered Jun 16 '26 20:06

Peter Cordes