In the C++ example below, I define a function that computes the height of the leftmost path of a tree.
struct TreeNode {
int value{};
TreeNode *l = nullptr;
TreeNode *r = nullptr;
};
int getLeftHeight(TreeNode *root) {
int height = 0;
while (root != nullptr) {
root = root->l;
height++;
}
return height;
}
When compiling with GCC 9.3 or 10.1 (using -O3), I get the following x86.
getLeftHeight(TreeNode*):
xor eax, eax
test rdi, rdi
je .L4
.L3:
mov rdi, QWORD PTR [rdi+8]
add eax, 1
test rdi, rdi
jne .L3
ret
.L4:
ret
However, when compiling using Clang 10, as shown below, there is not a repeated ret
.
getLeftHeight(TreeNode*):
xor eax, eax
test rdi, rdi
je .LBB0_3
.LBB0_1:
mov rdi, qword ptr [rdi + 8]
add eax, 1
test rdi, rdi
jne .LBB0_1
.LBB0_3:
ret
Why is GCC doing this?
As far as I understand, the ret
will take one extra byte to be encoded.
Maybe this doesn't matter in this case because of alignment/padding of the emitted machine code, but if this is not the case and this duplication happens all over the place in a large codebase, many bytes of machine code could end up being just duplicate ret
s.
This particular case of duplication also seems fairly simple to spot and optimize away.
This seems to be a missed optimization in GCC.
There is a bug #71923 with a very similar example tagged "missed-optimization".
Having said that, the impact on performance is marginal - mainly on code size. In addition, when the second ret
is aligned (which will often be the case with modern x86 targets), eliminating the first ret
would often have no impact on code size, but if eliminated, falling through from the loop would incur decoding the additional nop
instruction before ret
, which can make that code path slower.
Anyway, a pull request to GCC is always welcome. You can see the list of optimization passes with -fdump-passes
and dump the RTL/GIMPLE tree using -fdump-tree-...
flags (or see it on Godbolt). The main question is if an existing optimization pass is supposed to take care of this or if a new one needs to be added (I'm not an expert on GCC, so can't advise one way or another).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With