
Do memory fences slow down all CPU cores?

I once read about memory fences (barriers). The claim was that a memory fence causes cache synchronisation between several CPU cores.

So my questions are:

  1. How does the OS (or CPU itself) know which cores need to be synchronised?

  2. Does it synchronise cache of all CPU cores?

  3. If the answer to (2) is 'yes', and assuming that sync operations are not cheap, does using memory fences slow down cores that are not used by my application? For example, if I have a single-threaded app running on my 8-core CPU, will it slow down the other 7 cores because some cache lines must be synced with all of them?

  4. Are the questions above totally ignorant, and do fences work completely differently?

GreenScape asked Sep 13 '14 09:09



2 Answers

  1. The OS does not need to know, and each CPU core simply does what it is told: a core that executes a memory fence must complete certain of its own memory operations before the fence, or hold others back until after it, and that's all. A core isn't synchronizing "with" other cores; it's ordering memory accesses relative to itself.
  2. A fence in one core does not mean other cores are synchronized with it, so typically you would have two (or more) fences: one in the writer and one in the reader. A fence executed on one core does not need to impact any other cores. Of course there is no guarantee about this in general, just a hope that sane architectures will not unduly serialize multi-core execution.
John Zwinck answered Oct 19 '22 11:10



Generally, memory fences are used for ordering local operations. Take for instance this pseudo-assembler code:

load A
load B

Many CPUs do not guarantee that B is actually loaded after A: B may sit in a cache line that was brought into the cache earlier by some other memory load. If you introduce a fence,

load A
readFence
load B

you have the guarantee that the load of B is performed after the load of A. The CPU may not satisfy the load of B with a value it obtained before A was loaded; if it had fetched B early, it would have to load it again.

The situation with stores is the same the other way around. With

store A
store B

some CPUs may decide to write B to memory before they write A. Again, a fence between the two instructions may be needed to enforce ordering of the operations. Whether a memory fence is required always depends on the architecture.
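As a rough C++ analogue of the two pseudo-assembler snippets above, here is a minimal sketch using `std::atomic_thread_fence`. Mapping the "read fence" to an acquire fence and the fence between the two stores to a release fence is my assumption, not something the pseudo-code specifies, and the names `A`, `B`, `Snapshot`, `store_a_then_b`, `load_a_then_b` and `smoke_check` are made up for illustration. Note that each fence only constrains the thread that executes it.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared variables standing in for A and B in the pseudo-code.
std::atomic<int> A{0};
std::atomic<int> B{0};

struct Snapshot {
    int a;
    int b;
};

// "store A; fence; store B": the release fence keeps the store to A from
// being reordered after the store to B, as seen from the calling thread.
void store_a_then_b(int a, int b) {
    A.store(a, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
    B.store(b, std::memory_order_relaxed);
}

// "load A; readFence; load B": the acquire fence keeps the load of B from
// being performed before the load of A.
Snapshot load_a_then_b() {
    Snapshot s;
    s.a = A.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    s.b = B.load(std::memory_order_relaxed);
    return s;
}

// Single-threaded smoke check: a fence never changes what a thread
// observes of its own earlier stores.
void smoke_check() {
    store_a_then_b(1, 2);
    Snapshot s = load_a_then_b();
    assert(s.a == 1 && s.b == 2);
}
```

On a strongly ordered architecture the compiler may emit no fence instruction at all for these calls; on a weakly ordered one it typically emits a barrier instruction, which is the "depends on the architecture" part.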


Generally, you use memory fences in pairs:

  • If one thread wants to publish an object, it first constructs the object, then it performs a write fence before it writes the pointer to the object into a publicly known location.

  • The thread that wants to receive the object reads the pointer from the publicly known memory location, then executes a read fence to ensure that all further reads based on that pointer actually yield the values the publishing thread intended.

If either fence is missing, the reader may read the value of one or more data members of the object before it was initialized. Madness ensues.
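The publish/receive pairing described above can be sketched in C++ roughly as follows. This is an illustration under assumptions, not a definitive implementation: `Message`, `g_published`, `publisher`, `consumer` and `run_publish_demo` are hypothetical names, and I am using a release fence for the write fence and an acquire fence for the read fence.

```cpp
#include <atomic>
#include <thread>

// Hypothetical payload type for the object being published.
struct Message {
    int id;
    int value;
};

// The "publicly known location" that holds the pointer.
std::atomic<Message*> g_published{nullptr};

// Publishing thread: fill in the object, fence, then write the pointer.
void publisher(Message* m) {
    m->id = 7;
    m->value = 42;
    std::atomic_thread_fence(std::memory_order_release);  // write fence
    g_published.store(m, std::memory_order_relaxed);
}

// Receiving thread: read the pointer, fence, then read through it.
int consumer() {
    Message* m = nullptr;
    while ((m = g_published.load(std::memory_order_relaxed)) == nullptr) {
        std::this_thread::yield();  // wait until something is published
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // read fence
    return m->id + m->value;  // sees the values written before the write fence
}

// Runs one publisher and one consumer and returns what the consumer read.
int run_publish_demo() {
    Message msg{0, 0};
    g_published.store(nullptr, std::memory_order_relaxed);
    int result = 0;
    std::thread reader([&] { result = consumer(); });
    std::thread writer([&] { publisher(&msg); });
    writer.join();
    reader.join();
    return result;
}
```

Calling `run_publish_demo()` returns 49 (7 + 42). Dropping either fence turns the reads of `m->id` and `m->value` into a data race, which is the "madness" case: the reader could see the pointer but stale field values.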

cmaster - reinstate monica answered Oct 19 '22 12:10
