Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why/how does gcc compile the undefined behaviour in this signed-overflow test so it works on x86 but not ARM64?

I was self-studying CSAPP and got a strange result when I ran into a strange issue during the run of a assertion test.

I'm not sure what to start this question with, so let me get the code first (file name visible in comments):

// File: 2.30.c
// Author: iBug

int tadd_ok(int x, int y) {
    if ((x ^ y) >> 31)
        return 1;  // A positive number and a negative integer always add without problem
    if (x < 0)
        return (x + y) < y;
    if (x > 0)
        return (x + y) > y;
    // x == 0
    return 1;
}
// File: 2.30-test.c
// Author: iBug

#include <assert.h>

int tadd_ok(int x, int y);

int main() {
    assert(sizeof(int) == 4);

    assert(tadd_ok(0x7FFFFFFF, 0x80000000) == 1);
    assert(tadd_ok(0x7FFFFFFF, 0x7FFFFFFF) == 0);
    assert(tadd_ok(0x80000000, 0x80000000) == 0);
    return 0;
}

And commands:

gcc -o test -O0 -g3 -Wall -std=c11 2.30.c 2.30-test.c
./test

(Side note: There wasn't any -O option present in the command line, but as it defaults to level 0, explicitly adding -O0 shouldn't change much.)

The above two commands ran very well on my Ubuntu VM (amd64, GCC 7.3.0), but one of the assertions failed on my Android phone (AArch64 or armv8-a, GCC 8.2.0).

2.30-test.c:13: main: assertion "tadd_ok(0x7FFFFFFF, 0x7FFFFFFF) == 0" failed

Note that the first assertion passed, so int is guaranteed to be 4 bytes on the platforms.

So I fired up gdb on my phone trying to get some insights:

(gdb) l 2.30.c:1
1       // File: 2.30.c
2       // Author: iBug
3
4       int tadd_ok(int x, int y) {
5           if ((x ^ y) >> 31)
6               return 1;  // A positive number and a negative integer always add without problem
7           if (x < 0)
8               return (x + y) < y;
9           if (x > 0)
10              return (x + y) > y;
(gdb) b 2.30.c:10
Breakpoint 1 at 0x728: file 2.30.c, line 10.
(gdb) r
Starting program: /data/data/com.termux/files/home/CSAPP-2019/ch2/test
warning: Unable to determine the number of hardware watchpoints available.
warning: Unable to determine the number of hardware breakpoints available.

Breakpoint 1, tadd_ok (x=2147483647, y=2147483647)
    at 2.30.c:10
10              return (x + y) > y;
(gdb) p x
$1 = 2147483647
(gdb) p y
$2 = 2147483647
(gdb) p (x + y) > y
$3 = 0
(gdb) c
Continuing.
2.30-test.c:13: main: assertion "tadd_ok(0x7FFFFFFF, 0x7FFFFFFF) == 0" failed

Program received signal SIGABRT, Aborted.
0x0000007fb7ca5928 in abort ()
   from /system/lib64/libc.so
(gdb) d 1
(gdb) p tadd_ok(0x7FFFFFFF, 0x7FFFFFFF)
$4 = 1
(gdb)

As you see in the GDB output, the result is very inconsistent, as the return statement on 2.30.c:10 was reached, and the return value should have been 0, but the function still returns 1, making the assertion fail.

Kindly provide an idea what I'm getting wrong here.


Please respect what I have presented. Just saying it's UB without relating the platforms, especially GDB output, is not going to be any helpful.

like image 635
iBug Avatar asked Dec 03 '22 10:12

iBug


2 Answers

Signed overflow is Undefined Behaviour in ISO C. You can't reliably cause it and then check if it happened.

In the expression (x + y) > y;, the compiler is allowed to assume that x+y doesn't overflow (because that would be UB). Therefore, it optimizes down to checking x > 0. (Yes, really, gcc does this even at -O0).

This optimization is new in gcc8. It's the same on x86 and AArch64; you must have used different GCC versions on AArch64 and x86. (Even at -O3, gcc7.x and earlier (intentionally?) miss this optimization. clang7.0 doesn't do it either. They actually do a 32-bit add and compare. They also miss optimizing tadd_ok to return 1, or to add and checking the overflow flag (V on ARM, OF on x86). Clang's optimized asm is an interesting mix of >>31, OR and one XOR operation, but -fwrapv actually changes that asm so it's probably not doing a full overflow check.)

You could say that gcc8 "breaks" your code, but really it was already broken as far as being legal / portable ISO C. gcc8 just revealed that fact.


To see it more clearly, lets isolate just that expression into one function. gcc -O0 compiles each statement separately anyway, so the information that this only runs when x<0 doesn't affect the -O0 code-gen for this statement in your tadd_ok function.

// compiles to add and checking the carry flag, or equivalent
int unsigned_overflow_test(unsigned x, unsigned y) {
    return (x+y) >= y;    // unsigned overflow is well-defined as wrapping.
}

// doesn't work because of UB.
int signed_overflow_expression(int x, int y) {
    return (x+y) > y;
}

On the Godbolt compiler explorer with AArch64 GCC8.2 -O0 -fverbose-asm:

signed_overflow_expression:
    sub     sp, sp, #16       //,,      // make a stack fram
    str     w0, [sp, 12]      // x, x   // spill the args
    str     w1, [sp, 8]       // y, y
   // end of prologue

   // instructions that implement return (x+y) > y; as return  x > 0
    ldr     w0, [sp, 12]      // tmp94, x
    cmp     w0, 0     // tmp94,
    cset    w0, gt  // tmp95,                  // w0 = (x>0) ? 1 : 0
    and     w0, w0, 255       // _1, tmp93     // redundant

  // epilogue
    add     sp, sp, 16        //,,
    ret     

GCC -ftree-dump-original or -optimized will even turn its GIMPLE back into C-like code with this optimization done (from the Godbolt link):

;; Function signed_overflow_expression (null)
;; enabled by -tree-original

{
  return x > 0;
}

Unfortunately, even with -Wall -Wextra -Wpedantic, there's no warning about a the comparison. It's not trivially true; it still depends on x.

The optimized asm is unsurprisingly cmp w0, 0 / cset w0, gt / ret. The AND with 0xff is redundant. cset is an alias of csinc, using the zero-register as both sources. So it will produce 0 / 1. With other registers, the general case of csinc is a conditional select and increment of any 2 registers.

Anyway, cset is AArch64's equivalent of x86 setcc, for turning a flag condition into a bool in a register.


If you want your code to work as written, you'd need to compile with -fwrapv to make it well-defined behaviour in the variant of C that -fwrapv makes GCC implement. The default is -fstrict-overflow, like the ISO C standard.

If you want to check for signed overflow in modern C, you need to write checks that detect overflow without actually causing it. This is harder, annoying, and a point of contention between compiler writers and (some) developers. They argue that the language rules around undefined behaviour weren't meant to be used as an excuse to "gratuitously break" code when compiling for target machines where it would make sense in asm. But modern compilers mostly only implement ISO C (with some extensions and extra defined behaviour), even when compiling for target architectures like x86 and ARM where signed integers have no padding (and thus wrap just fine), and don't trap on overflow.

So you could say "shots fired" in that war, with the change in gcc8.x to actually "breaking" unsafe code like this. :P

See Detecting signed overflow in C/C++ and How to check for signed integer overflow in C without undefined behaviour?


Since signed and unsigned addition are the same binary operation in 2's complement, you could maybe just cast to unsigned for the add, and cast back for a signed compare. That would make a version of your function that's safe on "normal" implementations: 2's complement, and casting between unsigned and int is just a reinterpret of the same bits.

This can't have UB, it just won't give the right answer on one's complement or sign/magnitude C implementations.

return  (int)((unsigned)x + (unsigned)y) > y;

This compiles (with gcc8.2 -O3 for AArch64) to

    add     w0, w0, w1            // x+y
    cmp     w0, w1                // x+y  cmp  y
    cset    w0, gt
    ret

If you had written int sum = x+y as a separate C statement from return sum < y, this UB wouldn't be visible to gcc with optimization disabled. But as part of the same expression, even gcc with the default -O0 can see it.

Compile-time-visible UB is all kinds of bad. In this case, only certain ranges of inputs would produce UB, so the compiler assumes it doesn't happen. If unconditional UB is seen on a path of execution, an optimizing compiler can assume that path never happens. (In a function with no branching, it could assume the function is never called, and compile it to a single illegal instruction.) See Does the C++ standard allow for an uninitialized bool to crash a program? for more about compile-time-visible UB.

(-O0 doesn't mean "no optimization", it means no extra optimization besides what's already necessary to transform through gcc's internal representations on the way to asm for whatever target platform. @Basile Starynkevitch explains in Disable all optimization options in GCC)

Some other compilers may "turn their brains off" even more with optimization disabled, and do something closer to transliterating C into asm, but gcc is not like that. For example, gcc still uses a multiplicative inverse for integer division by a constant at -O0. (Why does GCC use multiplication by a strange number in implementing integer division?) All 3 other major x86 compilers (clang/ICC/MSVC) use div.

like image 132
Peter Cordes Avatar answered Dec 04 '22 23:12

Peter Cordes


Overflow of signed integers invokes undefined behavior. You can't check for an overflow condition by adding two numbers and checking if they wrap around in some way. While you might get away with this on an x86/x64 system, there's no guarantee others will behave the same.

What you can do however is some arithmetic along with INT_MAX or INT_MIN to do the check.

int tadd_ok(int x, int y) {
    if ((x ^ y) >> 31)
        return 1;  // A positive number and a negative integer always add without problem
    if (x < 0)
        return INT_MIN - x < y;
    if (x > 0)
        return INT_MAX - x > y;
    // x == 0
    return 1;
}

The expression INT_MAX - x > y is arithmetically equivalent to INT_MAX > x + y but prevents an overflow from occurring. Similarly, INT_MIN - x < y is arithmetically equivalent to INT_MIN < x + y but prevents overflow.

EDIT:

If you want to force signed integer overflow to be defined, you can use the -fwrapv option to gcc. However, you're better off avoiding overflow altogether.

like image 38
dbush Avatar answered Dec 04 '22 23:12

dbush