I saw some x86 assembly in Qt's source:
    q_atomic_increment:
        movl 4(%esp), %ecx
        lock incl (%ecx)
        mov $0,%eax
        setne %al
        ret
        .align 4,0x90
        .type q_atomic_increment,@function
        .size q_atomic_increment,.-q_atomic_increment
From Googling, I learned that the lock instruction causes the CPU to lock the bus, but I don't know when the CPU releases the bus.
Looking at the code above as a whole, I also don't understand how it implements the Add.
Causes the processor's LOCK# signal to be asserted during execution of the accompanying instruction (turns the instruction into an atomic instruction). In a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of any shared memory while the signal is asserted.
The CALL instruction performs two operations: It pushes the return address (address immediately after the CALL instruction) on the stack. It changes EIP to the call destination. This effectively transfers control to the call target and begins execution there.
x86 guarantees that aligned loads and stores up to 64 bits are atomic, but not wider accesses.
LOCK is not an instruction itself: it is an instruction prefix, which applies to the following instruction. That instruction must be something that does a read-modify-write on memory (INC, XCHG, CMPXCHG etc.). In this case it is the incl (%ecx) instruction, which increments the long word at the address held in the ecx register.
The LOCK prefix ensures that the CPU has exclusive ownership of the appropriate cache line for the duration of the operation, and provides certain additional ordering guarantees. This may be achieved by asserting a bus lock, but the CPU will avoid this where possible. If the bus is locked, then it is locked only for the duration of the locked instruction.
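For readers more used to C++ than to assembly, the read-modify-write instructions named above have rough counterparts in std::atomic. The sketch below is only an illustration: the helper names increment_with_cas and swap_value are hypothetical, and the exact instructions a compiler emits depend on the compiler and target, although x86 compilers typically implement these operations with locked instructions such as lock cmpxchg and xchg.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: increment through a compare-exchange retry loop,
// the pattern that CMPXCHG makes possible. Returns the new value.
unsigned increment_with_cas(std::atomic<unsigned>* value) {
    assert(value != nullptr);
    unsigned expected = value->load();
    // On failure, compare_exchange_weak stores the current value into
    // 'expected', so each retry works with a fresh reading.
    while (!value->compare_exchange_weak(expected, expected + 1)) {
    }
    return expected + 1;
}

// Hypothetical helper: store a new value and return the previous one,
// which is what XCHG does with a memory operand.
unsigned swap_value(std::atomic<unsigned>* value, unsigned new_value) {
    assert(value != nullptr);
    return value->exchange(new_value);
}
```

Starting from a value of 5, increment_with_cas returns 6, and a following swap_value with 10 returns 6 and leaves 10 stored.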
This code copies the address of the variable to be incremented off the stack into the ecx register, then it does lock incl (%ecx) to atomically increment that variable by 1. The next two instructions set the eax register (which holds the return value from the function) to 0 if the new value of the variable is 0, and 1 otherwise. The operation is an increment, not an add (hence the name).
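The same contract can be sketched in portable C++. This is not Qt's implementation, only an approximation of what the assembly above does; the name q_atomic_increment_sketch and the use of std::atomic<int> are assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical equivalent of q_atomic_increment: atomically add 1 to *ptr
// and return 0 if the new value is 0, otherwise 1.
int q_atomic_increment_sketch(std::atomic<int>* ptr) {
    assert(ptr != nullptr);
    // fetch_add returns the value held before the increment, so the new
    // value is zero exactly when the old value was -1.
    int old_value = ptr->fetch_add(1);
    // Mirrors "mov $0,%eax" followed by "setne %al".
    return old_value != -1 ? 1 : 0;
}
```

A return value like this is useful for reference counting, where the caller needs to know whether the counter has reached zero without reading it a second time.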
What you may be missing is that incrementing a value in memory requires the CPU to read the old value first, add one to it, and then write the result back.
The lock prefix forces the multiple micro-operations that actually occur to appear to execute atomically.
If you had 2 threads each trying to increment the same variable, and they both read the same original value at the same time, then they would both compute the same incremented value and both write out that same value.
Instead of the variable being incremented twice, which is the typical expectation, it ends up being incremented only once.
The lock prefix prevents this from happening.
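The lost update can be reproduced deterministically by spelling out the read, add, and write steps by hand. The sketch below is a single-threaded model, not real concurrent code: the Incrementer struct and both function names are hypothetical, and the "threads" are just two objects whose steps are interleaved manually.

```cpp
#include <cassert>

// Hypothetical model of one thread's non-atomic increment, split into the
// three steps the CPU performs.
struct Incrementer {
    int temp = 0;  // stands in for the thread's private register
    void read(const int& shared) { temp = shared; }
    void add() { temp += 1; }
    void write(int& shared) const { shared = temp; }
};

// Both threads read before either one writes, so one increment is lost.
int interleaved_increments() {
    int shared = 0;
    Incrementer a, b;
    a.read(shared);
    b.read(shared);            // both see the original value 0
    a.add();
    b.add();
    assert(a.temp == b.temp);  // both computed the same new value
    a.write(shared);
    b.write(shared);           // the second write repeats the first
    return shared;
}

// Each read-add-write finishes before the next one starts, which is the
// behaviour the lock prefix guarantees.
int serialized_increments() {
    int shared = 0;
    Incrementer a, b;
    a.read(shared); a.add(); a.write(shared);
    b.read(shared); b.add(); b.write(shared);
    return shared;
}
```

Here interleaved_increments returns 1 even though two increments ran, while serialized_increments returns 2.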