Robert Love says that "set_task_state(task, state) sets the given task to the given state. If applicable, it also provides a memory barrier to force ordering on other processors (this is only needed on SMP systems). Otherwise it is equivalent to task->state = state."
My question is: how can a memory barrier force ordering on other processors?
What does Robert Love mean by this, and why is it required? What is this ordering he is talking about? Is he talking about scheduling queues here?
If so, does every processor in an SMP system have its own scheduling queue? I am confused.
Memory barriers are typically used when implementing low-level machine code that operates on memory shared by multiple devices. Such code includes synchronization primitives and lock-free data structures on multiprocessor systems, and device drivers that communicate with computer hardware.
The ARM Data Memory Barrier (DMB) instruction prevents reordering of data access instructions across the DMB instruction.
smp_mb() is similar to mb(), but only guarantees ordering between cores/processors within an SMP system: all memory accesses before the smp_mb() will be visible to all cores within the SMP system before any accesses after the smp_mb().
Your CPU, to squeeze out extra performance, does out-of-order execution, which can run operations in a different order than they appear in the code. An optimizing compiler can likewise change the order of operations to make code faster. Compiler writers and kernel developers have to take care not to break programmers' expectations (or at least to conform to the spec, so they can say your expectation isn't right).
Here's an example:
1: CPU1: task->state = someModifiedStuff
2: CPU1: changed = 1;
3: CPU2: if (changed)
4: CPU2: ...
If we didn't have a barrier for setting the state, lines 1 and 2 could be reordered. Since neither references the other, a single-threaded implementation wouldn't see any difference. However, in an SMP situation, if 1 and 2 were reordered, line 3 could see changed set but not the state change. For example, if CPU1 ran line 2 (but not line 1) and then CPU2 ran lines 3 and 4, CPU2 would be running with the old state, and if it then cleared changed, the change that CPU1 just made would get lost.
A barrier tells the system that at some point between lines 1 and 2 it must make things consistent before moving on.
Do a search on 'memory barrier'; you'll find some good posts, e.g. Memory Barriers Are Like Source Control Operations.
Memory barriers are required because current CPUs perform a lot of out-of-order execution: they load many instructions at a time and execute them in a non-deterministic order if there are no dependencies among them.
In order to avoid reordering due to compiler optimization, the volatile keyword is sufficient (speaking of C++ here). So a synchronization primitive (e.g. a lock) is implemented both by properly using volatile and by some kind of assembler fence instruction (there are many of them, more or less strong: see section 7.5.5 in http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html).
Do you know what a lock is? With a properly implemented lock:

x = 0;

thread 1:      thread 2:
a.lock();      a.lock();
x++;           x++;
a.unlock();    a.unlock();

x will correctly end up being 2. Now suppose that there is no guarantee on the order of execution of the instructions of these two threads. Since a and x are independent, out-of-order execution would be allowed if lock() weren't properly implemented with memory barriers, and the executed instructions could effectively be:
x = 0;

thread 1:      thread 2:
x++;           x++;
a.lock();      a.lock();
a.unlock();    a.unlock();

and x can end up being equal to 2 or to 1.