While compiling a larger project with clang I stumbled upon an irritating bug.
Consider the following small example:
unsigned long int **fee();

void foo( unsigned long int q )
{
    unsigned long int i, j, k, e;
    unsigned long int pows[7];
    unsigned long int **table;

    e = 0;
    for (i = 1; i <= 256; i *= q)
        pows[e++] = i;
    e--;

    table = fee(); // need to set table to something unknown here,
                   // otherwise the compiler optimises parts of the
                   // loops below away (and no bug occurs)

    for (i = 0; i < q; i++)
        for (j = 0; j < e; j++)
            ((unsigned char*)(*table) + 5)[i*e + j] = 0; // bug here
}
To the best of my knowledge, this code does not violate the C standard in any way, although the last line looks awkward (in the actual project, code like this arises from heavy use of preprocessor macros).
Compiling this with clang (version 3.1 or higher) at optimisation level -O1 or higher produces code that writes to the wrong position in memory.
The crucial parts of the assembly file produced by clang/LLVM read as follows (this is GAS/AT&T syntax, so to those of you who are used to Intel syntax: beware!):
    [...]
    callq   _fee
    leaq    6(%rbx), %r8      ## at this point, %rbx == e-1
    xorl    %edx, %edx
LBB0_4:
    [...]
    movq    %r8, %rsi
    imulq   %rdx, %rsi
    incq    %rdx
LBB0_6:
    movq    (%rax), %rcx      ## %rax == fee()
    movb    $0, (%rcx,%rsi)
    incq    %rsi
    [conditional jumps back to LBB0_6 resp. LBB0_4]
    [...]
In other words, the instructions do
(*table)[i*(e+5) + j] = 0;
instead of the last line written above. The choice of + 5
is arbitrary; adding (or subtracting) other integers results in the same behaviour. So: is this a bug in LLVM's optimisation, or is there undefined behaviour going on here?
Edit: Note also that the bug disappears if I leave out the cast (unsigned char*)
in the last line. In general, the bug appears to be quite sensitive to any changes.
I am quite sure this is an optimizer bug. It reproduces with LLVM 2.7 and LLVM 3.1, the only versions I have access to.
I posted a bug to the LLVM Bugzilla.
The bug is demonstrated by this SSCCE:
#include <stdio.h>
#include <string.h>

unsigned long int *table;

void foo( unsigned long int q )
{
    unsigned long int i, j, e;

    e = 0;
    for (i = 1; i <= 256; i *= q)
        e++;
    e--;

    for (i = 0; i < q; i++)
        for (j = 0; j < e; j++)
            ((unsigned char*)(table) + 13)[i*e + j] = 0; // bug here
}

int main() {
    unsigned long int v[8];
    int i;
    memset(v, 1, sizeof(v));
    table = v;
    foo(2);
    for (i = 0; i < sizeof(v); i++) {
        printf("%d", ((unsigned char*)v)[i]);
    }
    puts("");
    return 0;
}
It should print
1111111111111000000000000000011111111111111111111111111111111111
under GCC and "clang -O0". The incorrect output observed with LLVM is
0000000011111111111110000000011111111111111111111111111111111111
Thanks for noticing this!