LOCK prefix of Intel instruction. What is the point?

Question

I read the Intel manual and found that there is a LOCK prefix for instructions, which prevents processors from writing to the same memory location at the same time. I was quite excited about it and guessed it could be used as a hardware mutex, so I wrote a piece of code to try it out. The result was quite frustrating: LOCK does not support MOV or LEA. The manual says LOCK only supports ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. What is more, if the LOCK prefix is used with one of these instructions and the source operand is a memory operand, an undefined opcode exception (#UD) may be generated.

I wonder why there are so many limitations; so many restrictions make LOCK seem useless. I cannot use it to guarantee that a general write operation is free of dirty data or other problems caused by parallelism.

E.g. I wrote the code ++(*p) in C, where p is a pointer to shared memory. The corresponding assembly looks like:

movl    28(%esp), %eax
movl    (%eax), %eax
leal    1(%eax), %edx
movl    28(%esp), %eax
movl    %edx, (%eax)

I added "lock" before "movl" and "leal", but the processor complains "Invalid Instruction". :-( I guess the only way to make the write operations serialized is to use a software mutex, right?

NPE · Accepted Answer

I certainly would not call lock useless. lock cmpxchg is the standard way to perform compare-and-swap, which is the basic building block of many synchronization algorithms.
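
As a rough illustration of a compare-and-swap loop, here is a minimal sketch using C11 <stdatomic.h>. The helper name atomic_raise_to is made up for this example and is not from any library. On x86, GCC and Clang typically compile atomic_compare_exchange_weak to lock cmpxchg, but that is a property of the compiler and target, not something the C standard promises.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: atomically raise *target to candidate if candidate
 * is larger. Returns true if this call changed the stored value. */
bool atomic_raise_to(atomic_int *target, int candidate)
{
    assert(target != NULL);

    int seen = atomic_load(target);
    while (seen < candidate) {
        /* Succeeds only if *target still equals seen. On failure, seen is
         * overwritten with the current value and the loop re-checks it. */
        if (atomic_compare_exchange_weak(target, &seen, candidate))
            return true;
    }
    return false; /* the stored value was already >= candidate */
}
```

The same read, compute, compare-and-swap, retry pattern works for any update that fits in one machine word, which is why a single locked instruction is enough to build larger synchronization primitives.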

Also, see fetch-and-add.
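
For the ++(*p) case from the question, fetch-and-add is probably the closest fit. Below is a minimal sketch, assuming the shared counter can be declared as a C11 atomic_int; atomic_increment is a hypothetical name chosen for this example. On x86, compilers usually emit a single lock xadd (or lock add when the result is unused) for it, instead of the separate load, lea, and store shown in the question.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper: atomic counterpart of ++(*p).
 * Returns the value after the increment. */
int atomic_increment(atomic_int *p)
{
    assert(p != NULL);

    /* atomic_fetch_add returns the value held before the addition,
     * so add 1 to report the new value. */
    return atomic_fetch_add(p, 1) + 1;
}
```

Several threads can call atomic_increment on the same counter without losing updates, and no software mutex is involved.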

Ignacio Vazquez-Abrams · Answer

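The purpose of lock is to make an operation atomic, not to serialize it. The whole read-modify-write takes effect as a single indivisible step, so neither an interrupt nor another processor can observe or modify the location partway through.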

Jirka Hanika · Answer

The x86 processors are known for a hairy design with lots of features, lots of rules, and even more exceptions to all those rules. This is related to the long history of the family.

When compilers or people use LOCK, they always use it within its limitations, often on data introduced specifically to perform synchronization between threads, as opposed to the application data that the algorithms eventually manipulate. One then adapts the thread synchronization protocol to what LOCK can do, rather than vice versa.
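
To make the split between synchronization data and application data concrete, here is a minimal sketch of a spinlock built on C11 atomic_flag. All names (guarded_total, guarded_add, and so on) are hypothetical. Only the flag is touched by atomic operations; on x86, atomic_flag_test_and_set is typically compiled to xchg, which is implicitly locked when it has a memory operand. The total field is ordinary data that is read and written with plain instructions while the flag is held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    atomic_flag held; /* synchronization data: only touched atomically */
    long total;       /* application data: plain accesses under the lock */
} guarded_total;

#define GUARDED_TOTAL_INIT { ATOMIC_FLAG_INIT, 0 }

/* Try to take the lock once; returns true if it was free. */
bool guarded_try_lock(guarded_total *g)
{
    assert(g != NULL);
    /* test_and_set returns the previous state: false means we got it. */
    return !atomic_flag_test_and_set_explicit(&g->held, memory_order_acquire);
}

void guarded_lock(guarded_total *g)
{
    while (!guarded_try_lock(g)) {
        /* busy-wait until the current holder clears the flag */
    }
}

void guarded_unlock(guarded_total *g)
{
    assert(g != NULL);
    atomic_flag_clear_explicit(&g->held, memory_order_release);
}

/* An arbitrary, non-atomic update made safe by holding the lock. */
long guarded_add(guarded_total *g, long amount)
{
    guarded_lock(g);
    g->total += amount;
    long result = g->total;
    guarded_unlock(g);
    return result;
}
```

A busy-waiting lock like this is only a sketch; real code would normally use a library mutex, which is itself built on the same kind of locked instruction.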

The general kind of instruction you seem to be looking for is called a memory barrier. Indeed, x86 has several "modern" instructions from this family (MFENCE, LFENCE, SFENCE): a full fence, a load fence, and a store fence, respectively. However, their importance in the instruction set is largely limited to SSE, because Intel guarantees serialization of writes on the traditional part of the instruction set, and that is pretty much the reason why this aged architecture is quite an easy target for multithreaded programming.
