
What would be the benefit of moving a register to itself in x86-64?

I'm doing a project in x86-64 NASM and came across the instruction:

mov rdi, rdi

in the output of a compiler my professor wrote.

I have searched all over but can't find mention of why this would be needed. Does it affect the flags or is it something clever that I don't understand?

To give some context, it appears in a loop right before the same register is decremented with sub.

nrmad asked Mar 14 '19 18:03



1 Answer

The instruction mov rdi, rdi is just an inefficient 3-byte NOP, architecturally equivalent to an actual NOP instruction. When assembled, it produces the bytes

48 89 ff       mov rdi, rdi

It can be considered a NOP because it affects neither the flags nor the registers. Its only architectural effect is to advance the program counter to the next instruction.

It's common to use (multi-byte) NOPs to align the next instruction to a certain address, a popular example being an aligned jump target, especially at the top of a loop.
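As a rough illustration of intentional padding, here is a minimal NASM sketch. The function name count_down and the loop body are hypothetical, not the code from the question; the point is only that the align directive, which by default fills the gap with NOPs, places the loop top on a 16-byte boundary.

```nasm
; Hypothetical example, not taken from the question's compiler output.
section .text
global count_down

count_down:                     ; assumes RDI holds an iteration count > 0
        xor     eax, eax        ; running sum = 0
        align   16              ; pad with NOPs so .loop starts on a 16-byte boundary
.loop:
        add     rax, rdi        ; accumulate the counter
        sub     rdi, 1          ; decrement; sets ZF when the counter hits 0
        jnz     .loop           ; aligned jump target
        ret
```

With NASM's smartalign package (%use smartalign), the padding is emitted as multi-byte NOPs rather than a run of single-byte ones.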

But in this case, it appears it's just an artifact of code-generation from a non-optimizing compiler, not being used for intentional padding.


It's inefficient compared to a true nop because it isn't special-cased to run more cheaply; its microarchitectural effect is different on current CPUs. It adds a cycle of latency to the dependency chain through RDI and uses an ALU execution unit. (Neither Intel nor AMD CPUs can "eliminate" mov same,same and run it with zero latency in the register-rename stage; mov-elimination only works between different architectural registers. mov rax,rdi, for example, can be about as cheap as a nop on IvyBridge+ and Ryzen, if you don't mind clobbering RAX.)

In your case, you should just remove it, instead of replacing it with 66 66 90 (short NOP with redundant operand-size prefixes) or 0F 1F 00 (long NOP), because it's not being used for padding.
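The loop from the question isn't shown, so the following is only a sketch under assumptions: the label .loop and the jnz back-edge are hypothetical, and only the mov/sub pair comes from the question's description.

```nasm
; Before (as described in the question; surrounding code is assumed):
.loop:
        mov     rdi, rdi        ; 48 89 ff: no architectural effect
        sub     rdi, 1
        jnz     .loop

; After: the mov is simply dropped, with no NOP put in its place.
.loop:
        sub     rdi, 1
        jnz     .loop
```

Dropping the instruction shortens the loop by 3 bytes and takes one cycle of latency off the dependency chain through RDI on each iteration.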


32-bit mov on x86-64 is never a NOP

If a search took you to this Q&A but you have an instruction like mov edi, edi in 64-bit code, that's unrelated. You're actually looking for any of the following Q&As:

  • Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?
  • Is mov %esi, %esi a no-op or not on x86-64?
  • MSVC compiler generates mov ecx, ecx that looks useless

It's not rare to find instructions doing this at the start of a function that takes an int arg and uses it as an array index, even in optimized compiler output from mainstream compilers.

 mov  edi, edi           ; zero-extend EDI into RDI

It would be more efficient to pick a different destination register to allow mov-elimination to work on modern Intel and AMD CPUs, like mov eax, edi, but compilers often don't do this.
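A minimal sketch of that pattern, assuming the x86-64 System V calling convention and a hypothetical function load_elem that takes a 32-bit index in EDI and a pointer to an array of 4-byte elements in RSI:

```nasm
; Hypothetical function: uint32_t load_elem(uint32_t idx, const uint32_t *table)
section .text
global load_elem

load_elem:
        mov     eax, edi                ; zero-extend EDI into RAX (different register,
                                        ; so mov-elimination can apply)
        mov     eax, [rsi + rax*4]      ; index with the zero-extended value
        ret
```

Writing mov edi, edi here instead would give the same result, but with the same-register cost described above.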

zx485 answered Sep 22 '22 13:09
