
x86, C++, gcc and memory alignment

I have this simple C++ code:

int testFunction(int* input, long length) {
    int sum = 0;
    for (long i = 0; i < length; ++i) {
        sum += input[i];
    }
    return sum;
}


#include <stdlib.h>
#include <iostream>
using namespace std;
int main()
{
    union{
        int* input;
        char* cinput;
    };

    size_t length = 1024;
    input = new int[length];


    //cinput++;

    cout<<testFunction(input, length-1);

}

Compiled with g++ 4.9.2 at -O3, it runs fine. I expected that uncommenting the cinput++ line would only make it run slower; instead it crashes outright with SIGSEGV.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000400754 in main ()
(gdb) disassemble 
Dump of assembler code for function main:
   0x00000000004006e0 <+0>:     sub    $0x8,%rsp
   0x00000000004006e4 <+4>:     movabs $0x100000000,%rdi
   0x00000000004006ee <+14>:    callq  0x400690 <_Znam@plt>
   0x00000000004006f3 <+19>:    lea    0x1(%rax),%rdx
   0x00000000004006f7 <+23>:    and    $0xf,%edx
   0x00000000004006fa <+26>:    shr    $0x2,%rdx
   0x00000000004006fe <+30>:    neg    %rdx
   0x0000000000400701 <+33>:    and    $0x3,%edx
   0x0000000000400704 <+36>:    je     0x4007cc <main+236>
   0x000000000040070a <+42>:    cmp    $0x1,%rdx
   0x000000000040070e <+46>:    mov    0x1(%rax),%esi
   0x0000000000400711 <+49>:    je     0x4007f1 <main+273>
   0x0000000000400717 <+55>:    add    0x5(%rax),%esi
   0x000000000040071a <+58>:    cmp    $0x3,%rdx
   0x000000000040071e <+62>:    jne    0x4007e1 <main+257>
   0x0000000000400724 <+68>:    add    0x9(%rax),%esi
   0x0000000000400727 <+71>:    mov    $0x3ffffffc,%r9d
   0x000000000040072d <+77>:    mov    $0x3,%edi
   0x0000000000400732 <+82>:    mov    $0x3fffffff,%r8d
   0x0000000000400738 <+88>:    sub    %rdx,%r8
   0x000000000040073b <+91>:    pxor   %xmm0,%xmm0
   0x000000000040073f <+95>:    lea    0x1(%rax,%rdx,4),%rcx
   0x0000000000400744 <+100>:   xor    %edx,%edx
   0x0000000000400746 <+102>:   nopw   %cs:0x0(%rax,%rax,1)
   0x0000000000400750 <+112>:   add    $0x1,%rdx
=> 0x0000000000400754 <+116>:   paddd  (%rcx),%xmm0
   0x0000000000400758 <+120>:   add    $0x10,%rcx
   0x000000000040075c <+124>:   cmp    $0xffffffe,%rdx
   0x0000000000400763 <+131>:   jbe    0x400750 <main+112>
   0x0000000000400765 <+133>:   movdqa %xmm0,%xmm1
   0x0000000000400769 <+137>:   lea    -0x3ffffffc(%r9),%rcx
---Type <return> to continue, or q <return> to quit---

Why does it crash? Is it a compiler bug? Am I causing some undefined behavior? Does the compiler expect that ints are always 4-byte-aligned?

I also tested it on clang and there's no crash.

Here's g++'s assembly output: http://pastebin.com/CJdCDCs4

user697683 asked Dec 25 '22 01:12


1 Answer

The code input = new int[length]; cinput++; causes undefined behaviour because the second statement is reading from a union member that is not active.

Even ignoring that, testFunction(input, length-1) would again have undefined behaviour for the same reason.

Even ignoring that, the sum loop accesses an object through a glvalue of the wrong type, which has undefined behaviour.

Even ignoring that, reading from an uninitialized object, as your sum loop does, would again have undefined behaviour.

Kerrek SB answered Dec 27 '22 15:12