
Purpose of memory barriers in the Linux kernel

Robert Love says that "set_task_state (task, state) sets the given task to the given state. If applicable, it also provides a memory barrier to force ordering on other processors (This is only needed on SMP systems). Otherwise it is equivalent to task->state = state."

My question is: How can a memory barrier force ordering on other processors?

What does Robert Love mean by this? Why is it required? What ordering is he talking about? Is he talking about scheduling queues here?

If so, does every processor in an SMP system have a different scheduling queue? I am confused.

S22 asked Jun 18 '15 11:06



2 Answers

To squeeze out extra performance, your CPU does out-of-order execution, which can run operations in a different order than they are given in the code. An optimizing compiler can also change the order of operations to make code faster. Compiler writers and kernel developers have to take care not to break the programmer's expectations (or at least to conform to the spec, so they can say your expectation isn't right).

Here's an example:

1: CPU1: task->state = someModifiedStuff
2: CPU1: changed = 1;
3: CPU2: if (changed)
4: CPU2:  ...

If we didn't have a barrier when setting the state, lines 1 and 2 could be reordered. Since neither references the other, a single-threaded program wouldn't see any difference. However, in an SMP situation, if lines 1 and 2 were reordered, line 3 could see changed set but not the state change. For example, if CPU1 ran line 2 (but not yet line 1) and then CPU2 ran lines 3 and 4, CPU2 would be running with the old state, and if it then cleared changed, the change that CPU1 just made would get lost.

A barrier between lines 1 and 2 tells the compiler and the CPU that the write in line 1 must be visible to other processors before the write in line 2, so the two cannot be reordered. For the guarantee to hold end to end, the reader generally needs a matching barrier between reading changed and reading the state.
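The four-line example can be sketched in user-space C++ as follows. This is only an illustration under stated assumptions, not kernel code: task_state, changed, cpu1_writer, cpu2_reader and run_demo are hypothetical names, and std::atomic_thread_fence stands in for the kernel's barrier primitives.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-ins for the names used in the example above.
int task_state = 0;              // plays the role of task->state
std::atomic<int> changed{0};     // the flag that CPU2 polls

void cpu1_writer(int new_state) {
    task_state = new_state;                                // line 1
    std::atomic_thread_fence(std::memory_order_release);   // barrier between 1 and 2
    changed.store(1, std::memory_order_relaxed);           // line 2
}

int cpu2_reader() {
    while (changed.load(std::memory_order_relaxed) == 0) { // line 3
        std::this_thread::yield();
    }
    // Pairs with the writer's fence: reads below cannot move above the flag read.
    std::atomic_thread_fence(std::memory_order_acquire);
    return task_state;                                     // line 4
}

// Runs one writer and one reader thread and returns the state the reader saw.
int run_demo(int new_state) {
    task_state = 0;
    changed.store(0);
    int seen = -1;
    std::thread reader([&] { seen = cpu2_reader(); });
    std::thread writer(cpu1_writer, new_state);
    writer.join();
    reader.join();
    assert(changed.load() == 1);
    return seen;
}
```

Calling run_demo(42) should return 42: once the reader has observed changed == 1, the two paired fences guarantee it also observes the new state. Without them, the C++ memory model would treat the access to task_state as a data race.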

Do a search on 'memory barrier' and you'll find some good posts, for example: Memory Barriers Are Like Source Control Operations

Paul Rubel answered Nov 03 '22 06:11


Memory barriers are required because current CPUs execute many instructions out of order: they fetch many instructions at a time and, where there are no dependencies among them, may execute them in an order different from program order.

The volatile keyword keeps the compiler from reordering or optimizing away accesses to the volatile object itself (speaking of C++ here), but it does not constrain the CPU, and it does not order those accesses against non-volatile ones. So a synchronization primitive (e.g. a lock) is implemented by combining a compiler-level guarantee such as volatile with some kind of assembler fence instruction (there are many of them, more or less strong: see section 7.5.5 in http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html)

You know what a lock is? Consider:
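As a rough illustration of the difference between restraining the compiler and restraining the CPU, standard C++ exposes both kinds of fence. The sketch below uses hypothetical names (data_word, ready, compiler_barrier, hardware_barrier, publish) and portable library calls rather than the raw assembler fences mentioned above.

```cpp
#include <atomic>
#include <cassert>

int data_word = 0;            // hypothetical shared payload
std::atomic<int> ready{0};    // hypothetical "payload is valid" flag

// Compiler-only barrier: the compiler may not move memory accesses across it,
// but no CPU fence instruction is emitted. On its own this is not enough
// to order accesses as seen by other processors on weakly ordered hardware.
inline void compiler_barrier() {
    std::atomic_signal_fence(std::memory_order_seq_cst);
}

// Full barrier: restrains the compiler and also emits whatever fence
// instruction the target CPU needs.
inline void hardware_barrier() {
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Writes the payload, then raises the flag, with a full barrier in between
// so that other processors cannot see the flag before the payload.
int publish(int value) {
    assert(ready.load(std::memory_order_relaxed) == 0 || value == data_word || true);
    data_word = value;
    hardware_barrier();
    ready.store(1, std::memory_order_relaxed);
    return ready.load(std::memory_order_relaxed);
}
```

A reader on another thread would need a matching fence after it loads ready and before it reads data_word.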

x = 0;

thread 1:                           thread 2:

a.lock();                           a.lock();
x++;                                x++;
a.unlock();                         a.unlock();

x will correctly end up as 2. Now suppose there is no guarantee about the order in which each thread's instructions execute. Since a and x are independent, out-of-order execution would be allowed to move x++ outside the critical section if lock() weren't properly implemented with memory barriers. The instructions actually executed could then be:

x = 0;

thread 1:                           thread 2:

x++;                                x++;
a.lock();                           a.lock();
a.unlock();                         a.unlock();

x can then end up equal to either 2 or 1.
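A minimal sketch of a lock whose lock() and unlock() carry the required ordering might look like this. SpinLock and run_counter_demo are hypothetical names, and a production lock would also need backoff and fairness, which are omitted here.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical spinlock: acquire ordering in lock() keeps the critical section
// from moving above it, release ordering in unlock() keeps it from moving below.
class SpinLock {
    std::atomic_flag locked = ATOMIC_FLAG_INIT;
public:
    void lock() {
        while (locked.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();   // someone else holds the lock; let them run
        }
    }
    void unlock() {
        locked.clear(std::memory_order_release);
    }
};

// Two threads each increment x the given number of times under the lock.
int run_counter_demo(int iterations_per_thread) {
    assert(iterations_per_thread >= 0);
    SpinLock a;
    int x = 0;
    auto worker = [&] {
        for (int i = 0; i < iterations_per_thread; ++i) {
            a.lock();
            x++;          // cannot be reordered outside the lock/unlock pair
            a.unlock();
        }
    };
    std::thread t1(worker);
    std::thread t2(worker);
    t1.join();
    t2.join();
    return x;
}
```

With one iteration per thread this matches the example above and run_counter_demo(1) returns 2. If the orderings were weakened to std::memory_order_relaxed, the increments would no longer be confined to the critical section and updates could be lost.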

Sigi answered Nov 03 '22 05:11