
gcc optimization skips initializing allocated memory

Using gcc 4.9.2 20150304 64 bit I bumped into this apparently strange behavior:

#include <stdint.h>
#include <stdlib.h>

double doit() {
    double *ptr = (double *)malloc(sizeof(double));
    ptr[0] = 3.14;
    return (double)((uintptr_t) ptr);
}

In the code I'm allocating a double on the heap, initializing it, and then returning another double initialized with the address of the first one converted to a uintptr_t. With optimization -O2, this generates the following assembly code in 32-bit mode:

sub    $0x28,%esp
push   $0x8                   ;; 8 bytes requested
call   8048300 <malloc@plt>   ;; malloc 'em
movl   $0x0,0x14(%esp)        ;; store zeros in upper 32bits
mov    %eax,0x10(%esp)        ;; store address in lower 32bits
fildll 0x10(%esp)             ;; convert a long long to double
add    $0x2c,%esp
ret    

and amazingly enough the initialization of the allocated double is completely gone.

When generating code with -O0 everything works as expected and the relevant code is instead:

push   %ebp
mov    %esp,%ebp
sub    $0x28,%esp
sub    $0xc,%esp
push   $0x8                    ;; 8 bytes requested
call   8048300 <malloc@plt>    ;; malloc 'em
add    $0x10,%esp
mov    %eax,-0xc(%ebp)
mov    -0xc(%ebp),%eax
fldl   0x8048578               ;; load 3.14 constant
fstpl  (%eax)                  ;; store in allocated memory
mov    -0xc(%ebp),%eax
mov    %eax,-0x28(%ebp)        ;; store address in low 32 bits
movl   $0x0,-0x24(%ebp)        ;; store 0 in high 32 bits
fildll -0x28(%ebp)             ;; convert the long-long to a double
fstpl  -0x20(%ebp)
fldl   -0x20(%ebp)
leave  
ret    

Question

Did I do anything invalid (I'm thinking specifically of aliasing rules, even though it seems to me that skipping the initialization has no justification), or is this just a gcc bug?

Note that the very same problem is present when compiling to 64-bit code. Formally, uintptr_t is 8 bytes in 64-bit mode, so a double might not be able to represent it exactly; in practice this doesn't happen, because on x86-64 only 48 of the 64 address bits are used and a double can represent all those values exactly.

6502 asked May 09 '15 08:05



3 Answers

Optimization is allowed to remove code in the presence of UB, but here it should not.

You have an unnecessary cast in Value *ptr = (Value *)malloc(sizeof(Value)); but this should be harmless.

The line res.d = (unsigned long long) ptr; would be better written as res.d = (intptr_t) ptr;, because intptr_t is explicitly allowed to receive pointers, and you can then store an integral value in a double variable: you can lose precision, but it should not be UB.

I cannot test it (because I don't have gcc 4.9), but if you have the same problem with this:

#include <stdint.h>

...

Value doit() {
    Value *ptr = malloc(sizeof(Value));
    ptr[0].u = 7;
    Value res; res.d = (double) ((intptr_t) ptr);
    return res;
}

I would conclude that it is a gcc bug.

I was able to compile the simplified version of the code with clang version 3.4.1 on FreeBSD 10.1.

cc -O3 -S doit.c gives (stripped down to the code part):

doit:                                   # @doit
# BB#0:
    pushl   %ebp
    movl    %esp, %ebp
    andl    $-8, %esp
    subl    $16, %esp
    movl    $8, (%esp)
    calll   malloc
    movl    $1074339512, 4(%eax)    # imm = 0x40091EB8
    movl    $1374389535, (%eax)     # imm = 0x51EB851F
    movl    %eax, 8(%esp)
    movl    $0, 12(%esp)
    fildll  8(%esp)
    movl    %ebp, %esp
    popl    %ebp
    ret

This is not the same code that gcc generates, but clang does perform the 3.14 initialization even at the -O3 optimization level (the hex dump of 3.14 is 0x40091eb851eb851f).


After reading the other comments and answers, I think the real cause of the problem is that gcc skips the intermediate cast and reads return (double)((uintptr_t) ptr); as return (double) ptr;. Not exactly, because that would be a compile error, but it still considers there to be UB since in the end a pointer value ends up in a double variable. If we decompose the line with the intermediate cast, however, it should be read (IMHO) as:

register intptr_t intermediate = (intptr_t) ptr; // valid conversion
return (double) intermediate;  // valid conversion
Serge Ballesta answered Sep 18 '22 22:09


I see nothing strange here. You never read the 7 you wrote; instead you write the result of malloc to a double:

Value *ptr = (Value*) malloc(sizeof(Value));
ptr[0].u = 7;
Value res; res.d = (uintptr_t) ptr; // ptr is a result of malloc
return res;  // ptr is lost here which probably makes 
             // GCC think that it is no longer accessible
             // so "7" is lost here too

And converting a pointer to a double will most likely lose precision, thus making the memory inaccessible (UB).

However, if you save your pointer to an integer (.u), GCC will treat that as aliased memory and keep the initialization:

Value res; res.u = (uintptr_t) ptr; // Saving to .u, not .d

compiles to

0x0000000000400570 <+0>:     sub    $0x8,%rsp
0x0000000000400574 <+4>:     mov    $0x8,%edi
0x0000000000400579 <+9>:     callq  0x400460 <malloc@plt>
0x000000000040057e <+14>:    movq   $0x7,(%rax)
0x0000000000400585 <+21>:    add    $0x8,%rsp
0x0000000000400589 <+25>:    retq   

So the problem is that you are saving a pointer to a double.


BTW, (double)ptr is a compile error, as the standard requires:

6.5.4 Cast operators

[...]

4 A pointer type shall not be converted to any floating type. A floating type shall not be converted to any pointer type.

As of N1548 draft

myaut answered Sep 19 '22 22:09


It seems to be a bug... even with the simplified code

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

double doit() {
    double *ptr = (double *)malloc(sizeof(double));
    ptr[0] = 3.14;
    uintptr_t ip = (uintptr_t)ptr;
    return (double)ip;
}

int main(int argc, const char *argv[]) {
    double v = doit();
    double *p = (double *)((intptr_t)v);
    printf("sizeof(uintptr_t) = %i\n", (int)sizeof(uintptr_t));
    printf("*p = %0.3f\n", *p);
    return 0;
}

when compiled with -O2 doesn't initialize the memory.

The code works correctly when returning an intptr_t (or an unsigned long long) directly, but returning it after converting to a double doesn't work, as gcc apparently assumes that in this case you won't be able to access the memory any more.

This is clearly false in 32-bit mode (where intptr_t is 4 bytes and double provides 53 bits of accuracy for integers), but also in 64-bit mode (where uintptr_t is indeed 8 bytes, but the values actually used fit in 48 bits).

EDIT

Not sure about this, but the problem could be related to "dead code elimination on tree" (-ftree-dce). When compiling in 32-bit mode with -O2 optimizations enabled but this specific one disabled with -fno-tree-dce, the program output changes and is correct, but the generated code is not.

More specifically, the non-inlined version of doit contains no initialization code, but the code generated in main inlines the call, and the optimizer "knows" that the value in memory is 3.14 and prints that directly.

EDIT 2

Confirmed as a bug, already corrected in trunk.

The workaround until the next release is -fno-tree-pta.

6502 answered Sep 22 '22 22:09