Super weird segfault with gcc 4.7 -- Bug?

Tags: c++, linux, gcc4.7

Here is a piece of code that I've been trying to compile:

#include <cstdio>

#define N 3

struct Data {
    int A[N][N];
    int B[N];
};

int foo(int uloc, const int A[N][N], const int B[N])
{
    for(unsigned int j = 0; j < N; j++) {
        for( int i = 0; i < N; i++) {
            for( int r = 0; r < N ; r++) {
                for( int q = 0; q < N ; q++) {
                   uloc += B[i]*A[r][j] + B[j];
                }
            }
        }
    }
    return uloc;
}

int apply(const Data *d)
{
    return foo(4,d->A,d->B);
}

int main(int, char **)
{
    Data d;
    for(int i = 0; i < N; ++i) {
        for(int j = 0; j < N; ++j) {
            d.A[i][j] = 0.0;
        }
        d.B[i] = 0.0;
    }

    int res = 11 + apply(&d);

    printf("%d\n",res);
    return 0;
}

Yes, it looks quite strange and does nothing useful at the moment, but it is the most concise version of a much larger program in which I originally hit the problem.

It compiles and runs just fine with GCC (g++) 4.4 and 4.6, but if I use GCC 4.7 and enable level-three optimization (-O3):

g++-4.7 -g -O3 prog.cpp -o prog

I get a segmentation fault when running it. GDB does not give much information about what went wrong:

(gdb) run
Starting program: /home/kalle/work/code/advect_diff/c++/strunt 

Program received signal SIGSEGV, Segmentation fault.
apply (d=d@entry=0x7fffffffe1a0) at src/strunt.cpp:25
25      int apply(const Data *d)
(gdb) bt
#0  apply (d=d@entry=0x7fffffffe1a0) at src/strunt.cpp:25
#1  0x00000000004004cc in main () at src/strunt.cpp:34

I've tried tweaking the code in different ways to see if the error goes away. All four loop levels in foo seem to be necessary, and I have not been able to reproduce the crash with only a single level of function calls. Also, the outermost loop must use an unsigned loop index.

I'm starting to suspect that this is a bug in the compiler or runtime, since it is specific to version 4.7 and I cannot see what memory accesses are invalid.

Any insight into what is going on would be very much appreciated.

It is possible to get the same situation with the C-version of GCC, with a slight modification of the code.

My system is:

Debian wheezy, Linux 3.2.0-4-amd64, GCC 4.7.2-5


Okay, so I looked at the disassembly offered by gdb, but I'm afraid it doesn't tell me much:

Dump of assembler code for function apply(Data const*):
   0x0000000000400760 <+0>:     push   %r13
   0x0000000000400762 <+2>:     movabs $0x400000000,%r8
   0x000000000040076c <+12>:    push   %r12
   0x000000000040076e <+14>:    push   %rbp
   0x000000000040076f <+15>:    push   %rbx
   0x0000000000400770 <+16>:    mov    0x24(%rdi),%ecx
=> 0x0000000000400773 <+19>:    mov    (%rdi,%r8,1),%ebp
   0x0000000000400777 <+23>:    mov    0x18(%rdi),%r10d
   0x000000000040077b <+27>:    mov    $0x4,%r8b
   0x000000000040077e <+30>:    mov    0x28(%rdi),%edx
   0x0000000000400781 <+33>:    mov    0x2c(%rdi),%eax
   0x0000000000400784 <+36>:    mov    %ecx,%ebx
   0x0000000000400786 <+38>:    mov    (%rdi,%r8,1),%r11d
   0x000000000040078a <+42>:    mov    0x1c(%rdi),%r9d
   0x000000000040078e <+46>:    imul   %ebp,%ebx
   0x0000000000400791 <+49>:    mov    $0x8,%r8b
   0x0000000000400794 <+52>:    mov    0x20(%rdi),%esi

What should I see when I look at this?


Edit 2015-08-13: This seems to be fixed in g++ 4.8 and later.

asked Jan 30 '14 15:01 by kalj


2 Answers

You never initialized d. Its value is indeterminate, and trying to do math with its contents is undefined behavior. (Even trying to read its values without doing anything with them is undefined behavior.) Initialize d and see what happens.


Now that you've initialized d and it still fails, that looks like a real compiler bug. Try updating to 4.7.3 or 4.8.2; if the problem persists, submit a bug report. (The list of known bugs currently appears to be empty, or at least the link is going somewhere that only lists non-bugs.)

answered Nov 10 '22 20:11 by user2357112 supports Monica


Unfortunately, this is indeed a bug in GCC. I have not the slightest idea what it is doing there, but this is the assembly generated for the apply function (I compiled it without main, by the way, and foo is inlined into it):

_Z5applyPK4Data:
        pushq   %r13
        movabsq $17179869184, %r8
        pushq   %r12
        pushq   %rbp
        pushq   %rbx
        movl    36(%rdi), %ecx
        movl    (%rdi,%r8), %ebp
        movl    24(%rdi), %r10d

and it crashes exactly at the movl (%rdi,%r8), %ebp instruction, since that adds a nonsensical 0x400000000 to %rdi (the first parameter, thus the pointer to Data) and dereferences the result.

answered Nov 10 '22 21:11 by PlasmaHH