Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Atomicity in C++ : Myth or Reality

I have been reading an article about Lockless Programming in MSDN. It says :

On all modern processors, you can assume that reads and writes of naturally aligned native types are atomic. As long as the memory bus is at least as wide as the type being read or written, the CPU reads and writes these types in a single bus transaction, making it impossible for other threads to see them in a half-completed state.

And it gives some examples:

// This write is not atomic because it is not natively aligned. DWORD* pData = (DWORD*)(pChar + 1); *pData = 0;  // This is not atomic because it is three separate operations. ++g_globalCounter;  // This write is atomic. g_alignedGlobal = 0;  // This read is atomic. DWORD local = g_alignedGlobal; 

I read lots of answers and comments saying, nothing is guaranteed to be atomic in C++ and it is not even mentioned in standarts, in SO and now I am a bit confused. Am I misinterpreting the article? Or does the article writer talk about things that are non-standart and specific to MSVC++ compiler?

So according to the article the below assignments must be atomic, right?

struct Data {     char ID;     char pad1[3];     short Number;     char pad2[2];     char Name[5];     char pad3[3];     int Number2;     double Value; } DataVal;  DataVal.ID = 0; DataVal.Number = 1000; DataVal.Number2 = 0xFFFFFF; DataVal.Value = 1.2; 

If it is true, does replacing Name[5] and pad3[3] with std::string Name; make any difference in memory-alignment ? Will the assignments to Number2 and Value variables be still atomic?

Can someone please explain?

like image 829
ali_bahoo Avatar asked Feb 15 '11 09:02

ali_bahoo


2 Answers

This recommendation is architecture-specific. It is true for x86 & x86_64 (in a low-level programming). You should also check that compiler don't reorder your code. You can use "compiler memory barrier" for that.

Low-level atomic read and writes for x86 is described in Intel Reference manuals "The Intel® 64 and IA-32 Architectures Software Developer’s Manual" Volume 3A ( http://www.intel.com/Assets/PDF/manual/253668.pdf) , section 8.1.1

8.1.1 Guaranteed Atomic Operations

The Intel486 processor (and newer processors since) guarantees that the following basic memory operations will always be carried out atomically:

  • Reading or writing a byte
  • Reading or writing a word aligned on a 16-bit boundary
  • Reading or writing a doubleword aligned on a 32-bit boundary

The Pentium processor (and newer processors since) guarantees that the following additional memory operations will always be carried out atomically:

  • Reading or writing a quadword aligned on a 64-bit boundary
  • 16-bit accesses to uncached memory locations that fit within a 32-bit data bus

The P6 family processors (and newer processors since) guarantee that the following additional memory operation will always be carried out atomically:

  • Unaligned 16-, 32-, and 64-bit accesses to cached memory that fit within a cache line

This document also have more description of atomically for newer processors like Core2. Not all unaligned operations will be atomic.

Other intel manual recommends this white paper:

http://software.intel.com/en-us/articles/developing-multithreaded-applications-a-platform-consistent-approach/

like image 88
osgx Avatar answered Sep 20 '22 12:09

osgx


I think you are misinterpreting the quote.

Atomicity can be guaranteed on a given architecture, using specific instructions (proper to this architecture). The MSDN article explains that read and writes on C++ built-in types can be expected to be atomic on x86 architecture.

However the C++ standard does not presume what the architecture is, therefore the Standard cannot make such guarantees. Indeed C++ is used in embedded software where the hardware support is much more limited.

C++0x defines the std::atomic template class, which allows to turn reads and writes into atomic operations, whatever the type. The compiler will select the best way to obtain atomicity based on the type characteristics and the architecture targeted in a standard compliant manner.

The new standard also defines a whole lot of operations similar to MSVC InterlockExchange that is also compiled to the fastest (yet safe) available primitives offered by the hardware.

like image 22
Matthieu M. Avatar answered Sep 17 '22 12:09

Matthieu M.