When I compile and run this code
#include <stdio.h>
int collatz(int a) {
return a == 1 ? 1 : collatz((a%2) ? 3*a+1 : a/2);
}
int main() {
for (int a = 0; a < 10; ++a)
printf("%d: %d\n", a, collatz(a));
}
in clang
(10.0.0-4ubuntu1) it crashes, but when I add -O3
it runs fine. (This holds both when passing this as C or C++ code to clang
.)
I understand that it crashes without the optimization flag, since the call of collatz
with 0
will lead to an infinite recursion. However, with -O3
it returns 0: 1
, which is wrong. Is that a (known) bug in the optimization or am I missing something?
(gcc
/g++
(9.3.0) also crash without optimization, and the compilation hangs with -O3
.)
Edit: Just to add that I am not looking for the reason why the program hangs / fails (that is clear) but why the compiler apparently has the liberty to return some value whilst the program should not return (but enter an infinite loop).
Edit 2: Why haven't I picked a more minimal example? Well, here it is, this gives the same behavior
int divide(int i) {
return i == 1 ? 1 : divide(i/2);
}
The most-common cause of stack overflow is excessively deep or infinite recursion, in which a function calls itself so many times that the space needed to store the variables and information associated with each call is more than can fit on the stack.
By default, Fuchsia C/C++ code compiles with the flag -Wconversion , which prohibits implicit type conversions that may alter a value.
-Werror= Make the specified warning into an error. The specifier for a warning is appended; for example -Werror=switch turns the warnings controlled by -Wswitch into errors.
For GCC, copying from the full list of warnings provided by this tool for your compiler version appears to be the only way to ensure that all warnings are turned on, since (unlike Clang) GCC does not provide -Weverything . The tool appears to parse the actual c.
Is that a (known) bug in the optimization or am I missing something?
It's not a bug in optimisation. It's a bug in the program that is being compiled. The behaviour of the program is undefined in C++.
it returns 0: 1, which is wrong
There is no "wrong" behaviour when the behaviour of the program is undefined.
but the compilation hangs with -O3
This is probably a compiler bug.
how can the compiler just decide to return something, whereas, if you follow the program logic, it should not.
When the behaviour of the program is undefined, the compiler can decide to do anything or to do nothing. There's nothing in particular that it should do or should not do. An undefined program has no logic to it.
why the behaviour is undefined
The C++ standard says:
[intro.progress]
The implementation may assume that any thread will eventually do one of the following:
- terminate,
- make a call to a library I/O function,
- perform an access through a volatile glvalue, or
- perform a synchronization operation or an atomic operation.
In case the argument is 0, the thread would never do any of those things, and as such the [language] implementation is allowed to assume that the function is never called with the argument of 0. When you call the function with such argument, you contradict this assumption and behaviour is undefined.
Note: my answer is only true in C. As said in other answers in C++ the program has UB in it. This is one of the reason you shouldn't compile C code with a C++ compiler.
I tried to get the assembly generated by clang 10.0.0 through godbolt. I understand not everyone knows how to read assembly so I'll try to explain succintly what we can deduce from the listings. Here's what I generated:
With clang -S main.c
:
collatz(int):
push {r11, lr}
mov r11, sp
sub sp, sp, #16
str r0, [r11, #-4]
ldr r0, [r11, #-4]
cmp r0, #1
bne .LBB0_2
b .LBB0_1
.LBB0_1:
mov r0, #1
str r0, [sp, #8] @ 4-byte Spill
b .LBB0_6
.LBB0_2:
ldr r0, [r11, #-4]
add r1, r0, r0, lsr #31
bic r1, r1, #1
sub r0, r0, r1
cmp r0, #0
beq .LBB0_4
b .LBB0_3
.LBB0_3:
ldr r0, [r11, #-4]
add r0, r0, r0, lsl #1
add r0, r0, #1
str r0, [sp, #4] @ 4-byte Spill
b .LBB0_5
.LBB0_4:
ldr r0, [r11, #-4]
add r0, r0, r0, lsr #31
asr r0, r0, #1
str r0, [sp, #4] @ 4-byte Spill
b .LBB0_5
.LBB0_5:
ldr r0, [sp, #4] @ 4-byte Reload
bl collatz(int)
str r0, [sp, #8] @ 4-byte Spill
b .LBB0_6
.LBB0_6:
ldr r0, [sp, #8] @ 4-byte Reload
mov sp, r11
pop {r11, lr}
bx lr
main:
push {r11, lr}
mov r11, sp
sub sp, sp, #16
mov r0, #0
str r0, [r11, #-4]
str r0, [sp, #8]
b .LBB1_1
.LBB1_1: @ =>This Inner Loop Header: Depth=1
ldr r0, [sp, #8]
cmp r0, #9
bgt .LBB1_4
b .LBB1_2
.LBB1_2: @ in Loop: Header=BB1_1 Depth=1
ldr r0, [sp, #8]
str r0, [sp, #4] @ 4-byte Spill
bl collatz(int)
ldr r1, .LCPI1_0
str r0, [sp] @ 4-byte Spill
mov r0, r1
ldr r1, [sp, #4] @ 4-byte Reload
ldr r2, [sp] @ 4-byte Reload
bl printf
b .LBB1_3
.LBB1_3: @ in Loop: Header=BB1_1 Depth=1
ldr r0, [sp, #8]
add r0, r0, #1
str r0, [sp, #8]
b .LBB1_1
.LBB1_4:
ldr r0, [r11, #-4]
mov sp, r11
pop {r11, lr}
bx lr
.LCPI1_0:
.long .L.str
.L.str:
.asciz "%d: %d\n"
Here, things look pretty normal, the function is defined, its behavior looks okay, the loop is formed correctly in the main so as we can expect, when it calls collatz(0), the stack overflows.
With clang -S -O3 main.c
:
collatz(int):
mov r0, #1
bx lr
main:
push {r4, r10, r11, lr}
add r11, sp, #8
ldr r4, .LCPI1_0
mov r1, #0
mov r2, #1
mov r0, r4
bl printf
mov r0, r4
mov r1, #1
mov r2, #1
bl printf
mov r0, r4
mov r1, #2
mov r2, #1
bl printf
mov r0, r4
mov r1, #3
mov r2, #1
bl printf
mov r0, r4
mov r1, #4
mov r2, #1
bl printf
mov r0, r4
mov r1, #5
mov r2, #1
bl printf
mov r0, r4
mov r1, #6
mov r2, #1
bl printf
mov r0, r4
mov r1, #7
mov r2, #1
bl printf
mov r0, r4
mov r1, #8
mov r2, #1
bl printf
mov r0, r4
mov r1, #9
mov r2, #1
bl printf
mov r0, #0
pop {r4, r10, r11, lr}
bx lr
.LCPI1_0:
.long .L.str
.L.str:
.asciz "%d: %d\n"
clang is completely drunk here. It did optimize the loop as we would expect but it decided that collatz(0) = 1 for whatever reason and it compiled a very weird version of collatz only to never call it. You may say clang is a C++ compiler so it makes sense for it to do unexpected things but that's not true, clang advertise itself as "the Clang C, C++, and Objective-C compiler". To me, it's a compiler bug. Let's try with clang 12.0.1 just in case the bug was solved in a later version.
With the latest version clang -S -O3 main.c
:
.text
.file "main.c"
.globl collatz # -- Begin function collatz
.p2align 4, 0x90
.type collatz,@function
collatz: # @collatz
.cfi_startproc
# %bb.0:
# kill: def $edi killed $edi def $rdi
cmpl $1, %edi
jne .LBB0_2
jmp .LBB0_5
.p2align 4, 0x90
.LBB0_3: # in Loop: Header=BB0_2 Depth=1
leal (%rdi,%rdi,2), %edi
addl $1, %edi
cmpl $1, %edi
je .LBB0_5
.LBB0_2: # =>This Inner Loop Header: Depth=1
testb $1, %dil
jne .LBB0_3
# %bb.4: # in Loop: Header=BB0_2 Depth=1
movl %edi, %eax
shrl $31, %eax
addl %edi, %eax
sarl %eax
movl %eax, %edi
cmpl $1, %edi
jne .LBB0_2
.LBB0_5:
movl $1, %eax
retq
.Lfunc_end0:
.size collatz, .Lfunc_end0-collatz
.cfi_endproc
# -- End function
.globl main # -- Begin function main
.p2align 4, 0x90
.type main,@function
main: # @main
.cfi_startproc
# %bb.0:
.p2align 4, 0x90
.LBB1_1: # =>This Inner Loop Header: Depth=1
jmp .LBB1_1
.Lfunc_end1:
.size main, .Lfunc_end1-main
.cfi_endproc
# -- End function
.ident "clang version 12.0.1"
.section ".note.GNU-stack","",@progbits
.addrsig
Here, the main function consists of a while(1) loop, which is what we expect given the input.
To conclude, I think both gcc and clang 10.0.0 have a bug, though one is easier to see as a bug than the other
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With