
Spinlock implementation reasoning

I want to improve the performance of a program by replacing some of the mutexes with spinlocks. I have found a spinlock implementation in

  • http://www.boost.org/doc/libs/1_36_0/boost/detail/spinlock_sync.hpp

which I intend to reuse. I believe this implementation is safer than simpler implementations in which threads keep spinning forever, such as the one found here

  • http://www.boost.org/doc/libs/1_54_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.example_spinlock.implementation

But I need to clarify some things about the yield function found here

  • http://www.boost.org/doc/libs/1_36_0/boost/detail/yield_k.hpp

First of all, I assume that the numbers 4, 16 and 32 are arbitrary. I tested some other values and got the best performance in my case with different ones.

But can someone explain the reasoning behind the yield code? Specifically, why do we need all three:

  • BOOST_SMT_PAUSE
  • sched_yield and
  • nanosleep

sotiris asked Sep 02 '25 06:09


1 Answer

Yes, this concept is known as an "adaptive spinlock"; see e.g. https://lwn.net/Articles/271817/.

Usually the numbers are chosen for exponential back-off: https://geidav.wordpress.com/tag/exponential-back-off/

So the numbers aren't arbitrary. However, which numbers work best for your case depends on your application's access patterns, requirements and system resources.
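
To make the exponential back-off idea concrete, here is a minimal sketch in C++11. It is not the Boost implementation: `BackoffSpinlock`, `next_delay`, `run_counter_demo` and the cap of 64 are hypothetical names and values chosen only for illustration. After each failed attempt, the waiting thread backs off for twice as long as before, up to the cap.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: double the back-off delay, saturating at `cap`.
constexpr unsigned next_delay(unsigned delay, unsigned cap) {
    return delay >= cap / 2 ? cap : delay * 2;
}

// Hypothetical class (not part of Boost): a spinlock whose waiters
// back off exponentially between acquisition attempts.
class BackoffSpinlock {
public:
    void lock() {
        unsigned delay = 1;        // illustrative starting delay
        const unsigned cap = 64;   // illustrative upper bound, tune per workload
        for (;;) {
            // Try to take the lock; exchange returns the previous value.
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            // Back off for up to `delay` steps, leaving early if the lock looks free.
            for (unsigned i = 0; i < delay; ++i) {
                if (!locked_.load(std::memory_order_relaxed)) break;
                std::this_thread::yield();
            }
            delay = next_delay(delay, cap);
        }
    }

    void unlock() {
        assert(locked_.load(std::memory_order_relaxed) && "unlock() without lock()");
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical demo: several threads increment a shared counter under the lock.
long run_counter_demo(int threads, int iters) {
    BackoffSpinlock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, iters] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Whether a cap of 64 is too small or too large depends on how long the critical section is and how many threads contend for it, which is why tuning the values per application can pay off.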

The three ways of introducing "micro-delays" are designed explicitly to balance cost against potential gain:

  • a plain busy-spin adds no delay at all, but it keeps the CPU fully loaded, which means high power consumption and wasted cycles
  • a small "cheap" delay might be able to avoid the cost of a context switch while reducing the CPU load relative to a busy-spin
  • a simple yield might allow the OS to avoid a context switch, depending on other system load (e.g. if the number of threads < the number of logical cores)
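
The escalation the question asks about can be sketched as a single function that picks a more expensive wait as the attempt counter `k` grows. This is only an illustration under stated assumptions: the thresholds 4, 16 and 32 come from the question, but `WaitStep`, `classify`, `cpu_pause` and `tiered_wait` are hypothetical names, the branch structure is not copied from `yield_k.hpp`, and the 1 microsecond sleep is an arbitrary choice. `_mm_pause()` stands in for `BOOST_SMT_PAUSE` on x86-64, `std::this_thread::yield()` for `sched_yield`, and `std::this_thread::sleep_for` for `nanosleep`.

```cpp
#include <chrono>
#include <thread>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
// Hint to the CPU that this is a spin-wait loop.
inline void cpu_pause() { _mm_pause(); }
#else
// Assumption: no pause hint is available, so fall back to a no-op.
inline void cpu_pause() {}
#endif

// Hypothetical names for the four tiers of waiting.
enum class WaitStep { Spin, Pause, Yield, Sleep };

// Map the attempt counter to a tier (thresholds taken from the question).
constexpr WaitStep classify(unsigned k) {
    return k < 4  ? WaitStep::Spin
         : k < 16 ? WaitStep::Pause
         : k < 32 ? WaitStep::Yield
                  : WaitStep::Sleep;
}

// Wait a little longer, and more expensively, the more often we have failed.
inline void tiered_wait(unsigned k) {
    switch (classify(k)) {
    case WaitStep::Spin:   // retry immediately: cheapest, but burns CPU
        break;
    case WaitStep::Pause:  // brief CPU-level hint, no system call
        cpu_pause();
        break;
    case WaitStep::Yield:  // offer the core to another runnable thread
        std::this_thread::yield();
        break;
    case WaitStep::Sleep:  // give up the core for a short, fixed time
        std::this_thread::sleep_for(std::chrono::microseconds(1));
        break;
    }
}
```

A caller would typically loop along the lines of `for (unsigned k = 0; !try_lock(); ++k) tiered_wait(k);`, where `try_lock` is whatever non-blocking acquire the lock provides.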

These trade-offs matter for low-latency applications, where the effects of a context switch or of cache misses are significant.

TL;DR

All trade-offs try to find a balance between wasting CPU cycles and losing cache/thread efficiency.

sehe answered Sep 05 '25 01:09