
Does pthread_mutex_lock contain a memory fence instruction? [duplicate]

Do the pthread_mutex_lock and pthread_mutex_unlock functions execute memory fence/barrier instructions? Or do lower-level instructions such as compare_and_swap implicitly act as memory barriers?

asked Jun 10 '14 09:06 by MetallicPriest




2 Answers

Do pthread_mutex_lock and pthread_mutex_unlock functions call memory fence/barrier instructions?

They do, and so does thread creation.

Note, however, there are two types of memory barriers: compiler and hardware.

Compiler barriers only prevent the compiler from reordering reads and writes and speculating variable values, but don't prevent the CPU from reordering.
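As a rough illustration of a compiler-only barrier, here is a minimal sketch using C11's atomic_signal_fence, which constrains the compiler but does not emit a CPU fence instruction. The names payload, ready and publish_compiler_only are hypothetical, and the sketch relies on compilers treating this fence as a general reordering barrier for plain stores, which is common practice rather than something the C standard promises for inter-thread use.

```c
#include <stdatomic.h>

/* Hypothetical variables, for illustration only. */
static int payload;
static int ready;

/* Stores payload, then ready, with a compiler-only barrier in between.
 * The fence stops the compiler from moving the two stores across it,
 * but no hardware fence is issued, so another CPU may still observe
 * the stores in a different order on weakly ordered hardware. */
void publish_compiler_only(int value)
{
    payload = value;
    atomic_signal_fence(memory_order_seq_cst); /* compiler barrier only */
    ready = 1;
}
```

On its own this is not enough to publish data to another thread; it only shows what the compiler half of the problem looks like.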

Hardware barriers prevent the CPU from reordering reads and writes. A full memory fence is usually the slowest such instruction; most of the time you only need operations with acquire and release semantics (for example, to implement spinlocks and mutexes).
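To make acquire and release semantics concrete, here is a minimal spinlock sketch using C11 atomics with pthreads. It is not how any particular pthread implementation is written; spin_lock, spin_unlock, worker and run_spinlock_demo are hypothetical names chosen for the example.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;
static long counter; /* protected by lock_flag */

/* Acquire: accesses inside the critical section cannot be
 * reordered before the lock is taken. */
static void spin_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&lock_flag,
                                             memory_order_acquire)) {
        /* busy-wait until the flag is cleared */
    }
}

/* Release: accesses inside the critical section cannot be
 * reordered after the lock is dropped. */
static void spin_unlock(void)
{
    atomic_flag_clear_explicit(&lock_flag, memory_order_release);
}

static void *worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        spin_lock();
        counter++;
        spin_unlock();
    }
    return NULL;
}

/* Runs two threads that each increment the counter `iterations` times.
 * Returns the final counter value, or -1 if a thread could not start. */
long run_spinlock_demo(long iterations)
{
    pthread_t a, b;
    counter = 0;
    if (pthread_create(&a, NULL, worker, &iterations) != 0)
        return -1;
    if (pthread_create(&b, NULL, worker, &iterations) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

Neither operation is a full fence, yet the lock is still correct: the release in one thread pairs with the acquire in the next thread to take the lock.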

With multi-threading you need both barriers most of the time.

A call to any function whose definition is not available in the current translation unit (and which is not an intrinsic) acts as a compiler memory barrier. pthread_mutex_lock, pthread_mutex_unlock, and pthread_create additionally issue a hardware memory barrier to prevent the CPU from reordering reads and writes.

From Programming with POSIX Threads by David R. Butenhof:

Pthreads provides a few basic rules about memory visibility. You can count on all implementations of the standard to follow these rules:

  1. Whatever memory values a thread can see when it calls pthread_create can also be seen by the new thread when it starts. Any data written to memory after the call to pthread_create may not necessarily be seen by the new thread, even if the write occurs before the thread starts.

  2. Whatever memory values a thread can see when it unlocks a mutex, either directly or by waiting on a condition variable, can also be seen by any thread that later locks the same mutex. Again, data written after the mutex is unlocked may not necessarily be seen by the thread that locks the mutex, even if the write occurs before the lock.

  3. Whatever memory values a thread can see when it terminates, either by cancellation, returning from its start function, or by calling pthread_exit, can also be seen by the thread that joins with the terminated thread by calling pthread_join. And, of course, data written after the thread terminates may not necessarily be seen by the thread that joins, even if the write occurs before the join.

  4. Whatever memory values a thread can see when it signals or broadcasts a condition variable can also be seen by any thread that is awakened by that signal or broadcast. And, one more time, data written after the signal or broadcast may not necessarily be seen by the thread that wakes up, even if the write occurs before it awakens.
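Rules 1 to 3 above can be sketched with plain (non-atomic) variables. This is only an illustration; the variable and function names are hypothetical, and rule 4 (condition variables) is left out to keep it short.

```c
#include <pthread.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static int before_create; /* written before pthread_create (rule 1) */
static int under_mutex;   /* only modified with mtx held (rule 2) */
static int before_exit;   /* written before the thread returns (rule 3) */

static void *child(void *arg)
{
    int *seen = arg;

    /* Rule 1: the value stored before pthread_create is visible here. */
    *seen = before_create;

    /* Rule 2: each thread sees the other's update made under the mutex. */
    pthread_mutex_lock(&mtx);
    under_mutex += 1;
    pthread_mutex_unlock(&mtx);

    before_exit = 7;
    return NULL;
}

/* Returns seen + under_mutex + before_exit, or -1 on a pthread error. */
int run_visibility_demo(void)
{
    pthread_t t;
    int seen = 0;

    before_create = 5;
    under_mutex = 0;
    before_exit = 0;

    if (pthread_create(&t, NULL, child, &seen) != 0)
        return -1;

    pthread_mutex_lock(&mtx);
    under_mutex += 1;
    pthread_mutex_unlock(&mtx);

    if (pthread_join(t, NULL) != 0)
        return -1;

    /* Rule 3: after pthread_join, everything the child wrote is visible,
     * so reading these without the mutex is safe at this point. */
    return seen + under_mutex + before_exit; /* 5 + 2 + 7 = 14 */
}
```

No explicit fence appears anywhere; the ordering comes entirely from pthread_create, the mutex, and pthread_join.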

Also see C++ and Beyond 2012: Herb Sutter - atomic<> Weapons for more details.

answered Oct 01 '22 16:10 by Maxim Egorushkin


Please take a look at section 4.12 of the POSIX specification.

Applications shall ensure that access to any memory location by more than one thread of control (threads or processes) is restricted such that no thread of control can read or modify a memory location while another thread of control may be modifying it. Such access is restricted using functions that synchronize thread execution and also synchronize memory with respect to other threads. [emphasis mine]

Then a list of functions is given which synchronize memory, plus a few additional notes.

If that requires memory barrier instructions on some architecture, then those must be used.
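In practice this means portable code just uses the synchronizing functions and leaves the barriers to the implementation. Here is a minimal sketch with a mutex and a condition variable, both of which are among the functions POSIX lists as synchronizing memory; producer, wait_for_message and the variables are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static bool ready;  /* protected by m */
static int message; /* protected by m */

static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&m);
    message = 42;
    ready = true;
    pthread_cond_signal(&cv);
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Starts the producer, waits until it has published a message, and
 * returns that message (or -1 if the thread could not be created). */
int wait_for_message(void)
{
    pthread_t t;
    int result;

    ready = false;
    message = 0;
    if (pthread_create(&t, NULL, producer, NULL) != 0)
        return -1;

    pthread_mutex_lock(&m);
    while (!ready)                   /* loop guards against spurious wakeups */
        pthread_cond_wait(&cv, &m);
    result = message;
    pthread_mutex_unlock(&m);

    pthread_join(t, NULL);
    return result;
}
```

The shared variables are plain bool and int; no volatile, atomics, or explicit fences are needed because every access happens with the mutex held.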

About compare_and_swap: that isn't in POSIX; check the documentation for whatever you are using. For instance, IBM defines a compare_and_swap function for AIX 5.3 which doesn't have full memory barrier semantics. The documentation notes:

If compare_and_swap is used as a locking primitive, insert an isync at the start of any critical sections.

From this documentation we can guess that IBM's compare_and_swap has release semantics, since the documentation does not require a barrier at the end of the critical section. The acquiring processor needs to issue an isync to make sure it is not reading stale data, but the publishing processor doesn't have to do anything.

At the instruction level, some processors have compare and swap with certain synchronizing guarantees, and some don't.
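C11 makes this choice explicit in source code: the compare-exchange functions in <stdatomic.h> take the memory order as an argument, and the compiler emits whatever the target needs. A minimal sketch follows; cas_try_lock, cas_unlock and lock_word are hypothetical names, and this is unrelated to IBM's compare_and_swap.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int lock_word; /* 0 = free, 1 = held */

/* Tries once to take the lock. On success the exchange has acquire
 * semantics; on failure only a relaxed load is requested. Passing
 * memory_order_relaxed for both orders would keep the operation
 * atomic but give no ordering guarantee for surrounding accesses. */
bool cas_try_lock(void)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &lock_word, &expected, 1,
        memory_order_acquire,  /* ordering if the swap succeeds */
        memory_order_relaxed); /* ordering if the swap fails */
}

/* Releases the lock with release semantics, so writes made inside
 * the critical section are visible to the next acquirer. */
void cas_unlock(void)
{
    atomic_store_explicit(&lock_word, 0, memory_order_release);
}
```

A caller would loop on cas_try_lock() until it returns true, touch the shared data, then call cas_unlock().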

answered Oct 01 '22 15:10 by Kaz