Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is a pthread mutex considered "slower" than a futex?

Why are POSIX mutexes considered heavier or slower than futexes? Where is the overhead coming from in the pthread mutex type? I've heard that pthread mutexes are based on futexes, and when uncontested, do not make any calls into the kernel. It seems then that a pthread mutex is merely a "wrapper" around a futex.

Is the overhead simply in the function-wrapper call and the need for the mutex function to "setup" the futex (i.e., basically the setup of the stack for the pthread mutex function call)? Or are there some extra memory barrier steps taking place with the pthread mutex?

like image 757
Jason Avatar asked Jun 15 '11 21:06

Jason


People also ask

What is the difference between mutex and futex?

Futex is not mutex Both result in kernel calls, except in one case: The Wait call takes two arguments: one is the address of a user-defined integer variable, the other is the expected value of that variable. If the values don't match, the call returns immediately.

Does pthread use futex?

Modern day user-mode pthread mutex uses the futex kernel syscall [1] to implement the lock, which avoids the syscall in the non-contended case, so it can be very fast, acquiring the lock in the user-mode code entirely.

Why futex?

DESCRIPTION top. The futex() system call provides a method for waiting until a certain condition becomes true. It is typically used as a blocking construct in the context of shared-memory synchronization. When using futexes, the majority of the synchronization operations are performed in user space.

How does a futex work?

Simply stated, a futex is a queue the kernel manages for userspace convenience. It lets userspace code ask the kernel to suspend until a certain condition is satisfied, and lets other userspace code signal that condition and wake up waiting processes.


1 Answers

Futexes were created to improve the performance of pthread mutexes. NPTL uses futexes, LinuxThreads predated futexes, which I think is where the "slower" consideration comes. NPTL mutexes may have some additional overhead, but it shouldn't be much.

Edit: The actual overhead basically consists on:

  • selecting the correct algorithm for the mutex type (normal, recursive, adaptive, error-checking; normal, robust, priority-inheritance, priority-protected), where the code heavily hints to the compiler that we are likely using a normal mutex (so it should convey that to the CPU's branch prediction logic),
  • and a write of the current owner of the mutex if we manage to take it which should normally be fast, since it resides in the same cache-line as the actual lock which we have just taken, unless the lock is heavily contended and some other CPU accessed the lock between the time we took it and when we attempted to write the owner (this write is unneeded for normal mutexes, but needed for error-checking and recursive mutexes).

So, a few cycles (typical case) to a few cycles + a branch misprediction + an additional cache miss (very worst case).

like image 75
ninjalj Avatar answered Sep 27 '22 00:09

ninjalj