I tried to find out the speed difference between plain loops, loop loops and builtin rep loops. I wrote three programs to compare the behavior:
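_start: xor %ecx,%ecx
not %ecx # ecx = 0xffffffff, set once before the loop
0: dec %ecx # label sits after the not, so ecx is not re-inverted each pass
jnz 0b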
mov $1,%eax
xor %ebx,%ebx
int $0x80 # syscall 1: exit
_start: xor %ecx,%ecx
not %ecx
loop .
mov $1,%eax
xor %ebx,%ebx
int $0x80
_start: xor %ecx,%ecx
not %ecx
rep nop # Do nothing but decrement ecx
mov $1,%eax
xor %ebx,%ebx
int $0x80
It turned out that the third program doesn't work as expected, and some research tells me that rep nop (aka pause) does something completely unrelated.
What do the rep, repz and repnz prefixes do when the instruction following them is not a string instruction?
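REP is a prefix written before one of the string instructions. It repeats that instruction the number of times stored in the count register (CX, ECX or RCX, depending on the address size). After every iteration the count register is decremented, and the process continues until it reaches 0. With REPE/REPZ and REPNE/REPNZ the zero flag is also tested after each iteration, and the loop stops early if the condition no longer holds.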
The rep, repe, and repne prefixes are added to string instructions to make them repeat %ecx times (%cx times if the current address size is 16 bits).
Description. Use the rep (repeat), repe/repz (repeat while equal/zero) or repne/repnz (repeat while not equal/nonzero) prefixes in conjunction with string operations. Each prefix causes the associated string instruction to repeat until the count register (CX) reaches zero or, for repe and repne, until the zero flag (ZF) no longer matches the tested condition.
REP/REPE/REPZ/REPNE/REPNZ: Repeat String Operation Prefix. The REP (repeat), REPE (repeat while equal), REPNE (repeat while not equal), REPZ (repeat while zero), and REPNZ (repeat while not zero) mnemonics are prefixes that can be added to one of the string instructions.
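To make the descriptions above concrete, here is a rough C model of what rep movsb and repe cmpsb do. It is only a sketch: the struct cpu type and the function names are made up for illustration, the direction flag is assumed to be clear, and segment registers and interrupts are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the state a repeated string instruction uses. */
struct cpu {
    uint32_t ecx;        /* repeat count */
    const uint8_t *esi;  /* source pointer */
    uint8_t *edi;        /* destination pointer */
    int zf;              /* zero flag */
};

/* Models `rep movsb`: copy one byte per iteration until ecx reaches 0.
   The zero flag is neither read nor written. */
static void rep_movsb(struct cpu *c)
{
    assert(c != NULL);
    while (c->ecx != 0) {
        *c->edi++ = *c->esi++;  /* assumes the direction flag is clear */
        c->ecx--;
    }
}

/* Models `repe cmpsb` (same encoding as repz): compare bytes until ecx
   reaches 0 or a comparison clears ZF, i.e. the two bytes differ. */
static void repe_cmpsb(struct cpu *c)
{
    assert(c != NULL);
    while (c->ecx != 0) {
        c->zf = (*c->esi++ == *c->edi++);
        c->ecx--;
        if (!c->zf)
            break;  /* the repne variant would stop when ZF is set instead */
    }
}
```

Note that when ecx starts at 0 neither loop body runs at all, which matches the hardware: the count is checked before the first iteration.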
It depends. rep ret is sometimes used to avoid the bad performance of jumping directly to a ret on certain AMD processors. The rep (F3) and repne (F2) prefixes are also used as mandatory prefixes for many SSE instructions (for example, they change packed-single variants to scalar-single or scalar-double variants). pause (spin-loop hint) is an alias of rep nop. Some other newer instructions use a "fake rep prefix" as well (popcnt, crc32, vmxon, etc). The "fake" or mandatory prefix comes before the optional REX prefix, so it can't be said to be part of the opcode; it really is a prefix.
For other instructions the prefix is reserved: Intel documents the behavior as unpredictable. In practice it is usually ignored (which is what rep ret relies on), although some encodings generate a #UD when prefixed with a rep.
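As an illustration of how the same opcode bytes mean different things depending on an F3/F2 prefix, here is a small lookup sketch in C. The classify function is a made-up helper, not a real decoder: it only knows the handful of encodings mentioned above and ignores operands, REX and every other prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: name the instruction selected by an optional
   F3 (rep) or F2 (repne) prefix followed by a few well-known opcodes. */
static const char *classify(const unsigned char *b, size_t n)
{
    assert(b != NULL);
    unsigned char prefix = 0;
    if (n > 0 && (b[0] == 0xF3 || b[0] == 0xF2)) {
        prefix = b[0];  /* remember the prefix and skip it */
        b++;
        n--;
    }
    if (n >= 1 && b[0] == 0x90) {            /* 90 = nop */
        if (prefix == 0) return "nop";
        if (prefix == 0xF3) return "pause";  /* F3 90 */
    }
    if (n >= 1 && b[0] == 0xC3) {            /* C3 = ret */
        if (prefix == 0) return "ret";
        if (prefix == 0xF3) return "rep ret";
    }
    if (n >= 2 && b[0] == 0x0F && b[1] == 0x58) {
        if (prefix == 0xF3) return "addss";  /* scalar single */
        if (prefix == 0xF2) return "addsd";  /* scalar double */
        return "addps";                      /* packed single */
    }
    if (n >= 2 && b[0] == 0x0F && b[1] == 0xB8 && prefix == 0xF3)
        return "popcnt";                     /* F3 0F B8 */
    return "unknown";  /* everything this sketch does not cover */
}
```

For example, the bytes F3 90 classify as pause while a bare 90 is nop, and 0F 58 goes from addps to addss or addsd depending on the prefix.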