
Is function call a memory barrier?

Consider this C code:

#include <stddef.h>   /* size_t */
#include <string.h>   /* memcpy() */

extern volatile int hardware_reg;

void f(const void *src, size_t len)
{
    void *dst = <something>;

    hardware_reg = 1;
    memcpy(dst, src, len);
    hardware_reg = 0;
}

The memcpy() call must occur between the two assignments. In general, since the compiler probably doesn't know what the called function will do, it can't move the call before or after the assignments. In this case, however, the compiler knows what memcpy() does (it could even substitute an inline built-in) and can deduce that memcpy() never accesses hardware_reg. So it appears to me that the compiler would see no obstacle to moving the memcpy() call if it wanted to.

So, the question: is a function call alone enough to act as a memory barrier that prevents reordering, or is an explicit memory barrier needed before and after the call to memcpy() in this case?

Please correct me if I am misunderstanding things.
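
For concreteness, the explicit-barrier variant being asked about might look roughly like the sketch below. It assumes a C11 compiler with <stdatomic.h>. The names copy_guarded and the locally defined hardware_reg are hypothetical stand-ins (a real register would be the extern volatile object from the code above). atomic_signal_fence() is used here as a compiler-only barrier: GCC and Clang treat it that way in practice, although the standard only defines it in terms of signal handlers, so take this as an illustration rather than a guarantee.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the memory-mapped register (assumption for this sketch);
   on real hardware this would be the extern volatile int from the question. */
volatile int hardware_reg;

/* Hypothetical variant of f() with an explicit barrier on each side of
   the copy. The destination is passed in instead of being computed. */
void copy_guarded(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    hardware_reg = 1;
    /* Compiler-level barrier: asks the compiler not to move memory
       accesses across this point; emits no CPU fence instruction. */
    atomic_signal_fence(memory_order_seq_cst);

    memcpy(dst, src, len);

    atomic_signal_fence(memory_order_seq_cst);
    hardware_reg = 0;
}
```

If ordering must also be enforced at the CPU or bus level, atomic_thread_fence() or a platform-specific barrier would take the place of atomic_signal_fence().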

Andrey Vihrov asked Apr 17 '11 11:04



1 Answer

The compiler cannot move the memcpy() operation before hardware_reg = 1 or after hardware_reg = 0. That is what volatile ensures, at least as far as the instruction stream the compiler emits is concerned. A function call is not necessarily a 'memory barrier', but it is a sequence point.

The C99 standard says this about volatile (5.1.2.3/5 "Program execution"):

At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred.

So at the sequence point represented by the memcpy() call, the volatile access that writes 1 has to have occurred, and the volatile access that writes 0 cannot have occurred yet.

However, there are 2 things I'd like to point out:

  1. Depending on what <something> is, if nothing else is done with the destination buffer, the compiler might be able to remove the memcpy() operation completely. This is the reason Microsoft came up with the SecureZeroMemory() function. SecureZeroMemory() operates on volatile-qualified pointers to prevent the writes from being optimized away.

  2. volatile doesn't necessarily imply a memory barrier (which is a hardware thing, not just a code ordering thing), so if you're running on a multi-proc machine or certain types of hardware you may need to explicitly invoke a memory barrier (perhaps wmb() on Linux).

    Starting with MSVC 8 (VS 2005), Microsoft documents that the volatile keyword implies the appropriate memory barrier, so a separate specific memory barrier call may not be necessary:

    • http://msdn.microsoft.com/en-us/library/12a04hfd.aspx

    Also, when optimizing, the compiler must maintain ordering among references to volatile objects as well as references to other global objects. In particular,

    • A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary.

    • A read of a volatile object (volatile read) has Acquire semantics; a reference to a global or static object that occurs after a read of volatile memory in the instruction sequence will occur after that volatile read in the compiled binary.
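
Both caveats can be sketched in portable C11 without relying on the MSVC-specific volatile semantics quoted above. This is a minimal sketch under assumptions, not anyone's actual implementation: the names wipe, publish, try_consume, shared_buf and ready are hypothetical. The volatile-pointer loop only mirrors the idea described for SecureZeroMemory() in point 1, and the release/acquire pair uses <stdatomic.h> to express for ordinary shared memory what MSVC attaches to volatile.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Point 1: store through a volatile-qualified pointer so that, in
   practice, the compiler treats each write as a volatile access and
   does not discard the loop as dead stores. */
void wipe(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}

/* Point 2: a hypothetical buffer shared between threads, plus a flag. */
static char shared_buf[64];
static atomic_int ready;   /* static storage, so it starts at 0 */

/* Producer: the release store keeps the preceding memcpy() from being
   ordered after the flag write, by the compiler or by the CPU. */
void publish(const void *src, size_t len)
{
    assert(len <= sizeof shared_buf);
    memcpy(shared_buf, src, len);
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Consumer: the acquire load keeps the following memcpy() from being
   ordered before the flag read. Returns 1 if data was copied, 0 if
   nothing has been published yet. */
int try_consume(void *dst, size_t len)
{
    assert(len <= sizeof shared_buf);
    if (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        return 0;
    memcpy(dst, shared_buf, len);
    return 1;
}
```

The release/acquire pair covers ordinary memory shared between threads. A memory-mapped device register may still need a platform-specific barrier, such as the wmb() mentioned in point 2.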

Michael Burr answered Sep 25 '22 08:09