Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do compilers duplicate some instructions?

Sometimes compilers generate code with weird instruction duplications that can safely be removed. Consider the following piece of code:

int gcd(unsigned x, unsigned y) {   return x == 0 ? y : gcd(y % x, x); } 

Here is the assembly code (generated by clang 5.0 with optimizations enabled):

gcd(unsigned int, unsigned int): # @gcd(unsigned int, unsigned int)   mov eax, esi   mov edx, edi   test edx, edx   je .LBB0_1 .LBB0_2: # =>This Inner Loop Header: Depth=1   mov ecx, edx   xor edx, edx   div ecx   test edx, edx   mov eax, ecx   jne .LBB0_2   mov eax, ecx   ret .LBB0_1:   ret 

In the following snippet:

  mov eax, ecx   jne .LBB0_2   mov eax, ecx 

If the jump doesn't happen, eax is reassigned for no obvious reason.

The other example is two ret's at the end of the function: one would perfectly work as well.

Is the compiler simply not intelligent enough or there's a reason to not remove the duplications?

like image 355
Ignat Loskutov Avatar asked Dec 28 '17 20:12

Ignat Loskutov


1 Answers

Compilers can perform optimisations that are not obvious to people and removing instructions does not always make things faster.

A small amount of searching shows that various AMD processors have branch prediction problems when a RET is immediately after a conditional branch. By filling that slot with what is essentially a no-op, the performance problem is avoided.

Update:

Example reference, section 6.2 of the "Software Optimization Guide for AMD64 Processors" (see http://support.amd.com/TechDocs/25112.PDF) says:

Specifically, avoid the following two situations:

  • Any kind of branch (either conditional or unconditional) that has the single-byte near-return RET instruction as its target. See “Examples.”

  • A conditional branch that occurs in the code directly before the single-byte near-return RET instruction.

It also goes into detail on why jump targets should have alignment which is also likely to explain the duplicate RETs at the end of the function.

like image 171
janm Avatar answered Sep 23 '22 07:09

janm