Multiple inconsistent behavior of signed bit-fields

I have come across a strange behavior on signed bit-fields:

#include <stdio.h>

struct S {
    long long a31 : 31;
    long long a32 : 32;
    long long a33 : 33;
    long long : 0;
    unsigned long long b31 : 31;
    unsigned long long b32 : 32;
    unsigned long long b33 : 33;
};

long long f31(struct S *p) { return p->a31 + p->b31; }
long long f32(struct S *p) { return p->a32 + p->b32; }
long long f33(struct S *p) { return p->a33 + p->b33; }

int main() {
    struct S s = { -2, -2, -2, 1, 1, 1 };
    long long a32 = -2;
    unsigned long long b32 = 1;
    printf("f31(&s)       => %lld\n", f31(&s));
    printf("f32(&s)       => %lld\n", f32(&s));
    printf("f33(&s)       => %lld\n", f33(&s));
    printf("s.a31 + s.b31 => %lld\n", s.a31 + s.b31);
    printf("s.a32 + s.b32 => %lld\n", s.a32 + s.b32);
    printf("s.a33 + s.b33 => %lld\n", s.a33 + s.b33);
    printf("  a32 +   b32 => %lld\n",   a32 +   b32);
    return 0;
}

Using Clang on OS X, I get this output:

f31(&s)       => -1
f32(&s)       => 4294967295
f33(&s)       => -1
s.a31 + s.b31 => 4294967295
s.a32 + s.b32 => 4294967295
s.a33 + s.b33 => -1
  a32 +   b32 => -1

Using GCC on Linux, I get this:

f31(&s)       => -1
f32(&s)       => 4294967295
f33(&s)       => 8589934591
s.a31 + s.b31 => 4294967295
s.a32 + s.b32 => 4294967295
s.a33 + s.b33 => 8589934591
  a32 +   b32 => -1

The above output shows 3 types of inconsistencies:

  • different behavior for different compilers;
  • different behavior for different bit-field widths;
  • different behavior for inline expressions and equivalent expressions wrapped in a function.

The C Standard has this language:

6.7.2 Type specifiers

...

Each of the comma-separated multisets designates the same type, except that for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int.

Bit-fields are notoriously broken in many older compilers...
Is the behavior of Clang and GCC conformant or are these inconsistencies the result of one or more bugs?

chqrlie asked Nov 13 '19 22:11


1 Answer

Is the behavior of Clang and GCC conformant or are these inconsistencies the result of one or more bugs?

I think it's most likely the fault is in your code, tbh. According to 6.7.2.1p5:

A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type.

There's no mention of long long here, so we can't necessarily treat this code as conformant to begin with. It seems that some compilers have documented support (for example, some ARM clang targets), whereas others are happy to let the behaviour be undefined (for example, gcc manuals don't appear to list long long in the category of "Allowable bit-field types other than _Bool, signed int, and unsigned int (C99 and C11 6.7.2.1)").

Furthermore, according to 6.3.1.1p2:

The following may be used in an expression wherever an int or unsigned int may be used:

  • An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
  • A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int.
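A minimal sketch of that promotion rule, using only the standard bit-field types. The struct and function names are hypothetical, and the comments assume a 32-bit int. A narrow unsigned int bit-field is promoted to int before the subtraction, so the result can go negative; a full-width unsigned int is not promoted, so the same subtraction wraps around.

```c
#include <stdio.h>

/* Hypothetical struct for illustration; not taken from the question. */
struct Flags {
    unsigned int small : 3;   /* values 0..7, all representable in int */
};

/* The bit-field is promoted to int, so 1 - 4 is computed as int. */
long long narrow_minus_four(void) {
    struct Flags f = { 1 };
    return f.small - 4;
}

/* A plain unsigned int stays unsigned, so 1 - 4 wraps around. */
long long plain_minus_four(void) {
    unsigned int u = 1;
    return u - 4;
}

/* Prints both results so the difference is visible side by side. */
void show_promotion(void) {
    printf("bit-field:    %lld\n", narrow_minus_four());
    printf("unsigned int: %lld\n", plain_minus_four());
}
```

Calling show_promotion() from main should print -3 for the bit-field and a large positive value for the plain unsigned int (4294967293 where unsigned int is 32 bits wide). This is the same split seen between f31 and f32 in the question.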

In other words, it isn't enough for the compiler to support these bit-field types; it also has to convert them appropriately when they appear in expressions. Specifically, this code looks utterly terrifying, because %lld tells printf to expect a long long int, whereas I think you may only be passing an int (or unsigned int, perhaps):

printf("s.a31 + s.b31 => %lld\n", s.a31 + s.b31);
printf("s.a32 + s.b32 => %lld\n", s.a32 + s.b32);
printf("s.a33 + s.b33 => %lld\n", s.a33 + s.b33);
printf("  a32 +   b32 => %lld\n",   a32 +   b32);

I figured I'd sign off quoting my expected result of this hairy looking code above:

If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

-- C11/7.21.6.1p9
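One way to avoid that mismatch, sketched here under the assumption that the fields use only standard bit-field types and that int is 32 bits, is to cast the argument so it really is a long long. The struct T and the helper format_sum are hypothetical names for illustration, not part of the question's code.

```c
#include <stdio.h>

/* Hypothetical struct for illustration; standard bit-field types only. */
struct T {
    signed int a : 31;
    unsigned int b : 31;   /* fits in int, so it is promoted to int */
};

/* Formats a + b with %lld. The sum is computed as int after promotion;
   the explicit cast makes the argument match the conversion specifier. */
int format_sum(char *buf, size_t size, const struct T *p) {
    return snprintf(buf, size, "%lld", (long long)(p->a + p->b));
}
```

With struct T t = { -2, 1 }, format_sum writes "-1" into the buffer. For the wider fields in the question, casting each operand before adding, as in (long long)s.a33 + (long long)s.b33, would make the addition itself happen in long long, assuming the implementation accepts long long bit-fields at all.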

autistic answered Jun 10 '23 12:06
