
GCC, -O2, and bitfields - is this a bug or a feature?

Today I discovered alarming behavior when experimenting with bit fields. For the sake of discussion and simplicity, here's an example program:

#include <stdio.h>

struct Node
{
  int a:16 __attribute__ ((packed));
  int b:16 __attribute__ ((packed));

  unsigned int c:27 __attribute__ ((packed));
  unsigned int d:3 __attribute__ ((packed));
  unsigned int e:2 __attribute__ ((packed));
};

int main (int argc, char *argv[])
{
  Node n;
  n.a = 12345;
  n.b = -23456;
  n.c = 0x7ffffff;
  n.d = 0x7;
  n.e = 0x3;

  printf("3-bit field cast to int: %d\n",(int)n.d);

  n.d++;  // 7 + 1 does not fit in 3 bits, so the field should wrap to 0

  printf("3-bit field cast to int: %d\n",(int)n.d);
}

The program deliberately overflows the 3-bit bit-field. Here's the (correct) output when compiled with "g++ -O0":

3-bit field cast to int: 7
3-bit field cast to int: 0

Here's the output when compiled using "g++ -O2" (and -O3):

3-bit field cast to int: 7
3-bit field cast to int: 8

Checking the assembly of the latter example, I found this:

movl    $7, %esi
movl    $.LC1, %edi
xorl    %eax, %eax
call    printf
movl    $8, %esi
movl    $.LC1, %edi
xorl    %eax, %eax
call    printf
xorl    %eax, %eax
addq    $8, %rsp

The optimizer has simply hard-coded the constant 8, assuming 7+1=8, when in fact the 3-bit field overflows and the result is zero.

Fortunately the code I care about doesn't overflow as far as I know, but this situation scares me - is this a known bug, a feature, or is this expected behavior? When can I expect gcc to be right about this?

Edit (re: signed/unsigned) :

It's being treated as unsigned because it's declared as unsigned. Declaring it as int instead gives this output (with -O0):

3-bit field cast to int: -1
3-bit field cast to int: 0

An even funnier thing happens with -O2 in this case:

3-bit field cast to int: 7
3-bit field cast to int: 8

I admit that __attribute__ ((packed)) is a fishy thing to use; what concerns me here is that the result changes with the optimization setting.

asked May 14 '10 at 19:05 by Rooke


1 Answer

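If you want to get technical, identifiers containing two consecutive underscores, such as __attribute__, are reserved for the implementation. The minute you used one, you were relying on a compiler extension, and the standard no longer says what your code does.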

If you get the same behavior with those removed, it looks to me like a compiler bug. The fact that the 3-bit field prints as 7 means that it's being treated as unsigned, so when it overflows it should behave like any other unsigned type and give you modulo arithmetic (here, modulo 2^3 = 8).
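A minimal sketch of that check, assuming the struct from the question with the attribute simply deleted. The names PlainNode and increment_d are invented for this illustration and do not appear in the question. On a conforming compiler the helper should wrap from 7 to 0 at every optimization level.

```cpp
#include <cassert>

// Same fields as the question's Node, minus __attribute__ ((packed)).
struct PlainNode
{
  int a:16;
  int b:16;

  unsigned int c:27;
  unsigned int d:3;
  unsigned int e:2;
};

// Hypothetical helper: store `start` in the 3-bit field, increment it once,
// and return what the field holds afterwards.
unsigned int increment_d(unsigned int start)
{
  assert(start <= 7u);  // only values that fit in 3 bits
  PlainNode n = {};
  n.d = start;
  n.d++;                // unsigned arithmetic: result is reduced modulo 8
  return n.d;
}
```

Comparing increment_d(7) between -O0 and -O2 builds, with and without the attribute, would show whether the attribute has anything to do with the discrepancy.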

It would also be legitimate for it to treat the bit-field as signed. In this case the first result would be -1, -3 or -0 (which might print as just 0), and the second undefined (since overflow of a signed integer gives undefined behavior). In theory, other values might be possible under C89 or the current C++ standard since they don't limit the representations of signed integers. In C99 or C++0x, it can only be those three (C99 limits signed integers to one's complement, two's complement or sign-magnitude and C++0x is based on C99 instead of C90).
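To make the three candidate values concrete, here is a small sketch that decodes a 3-bit pattern under each representation C99 allows. The function names are hypothetical and exist only for this illustration. The result is returned as an ordinary int, so a negative zero comes back as plain 0.

```cpp
#include <cassert>

// Interpret the low 3 bits of `bits` as a signed value. Bit 2 is the sign bit.

int as_twos_complement(unsigned int bits)
{
  assert(bits < 8u);
  // A set sign bit contributes -4, i.e. subtract 8 from the unsigned value.
  return (bits & 4u) ? static_cast<int>(bits) - 8 : static_cast<int>(bits);
}

int as_sign_magnitude(unsigned int bits)
{
  assert(bits < 8u);
  // The low two bits are the magnitude; the sign bit only flips the sign.
  int magnitude = static_cast<int>(bits & 3u);
  return (bits & 4u) ? -magnitude : magnitude;
}

int as_ones_complement(unsigned int bits)
{
  assert(bits < 8u);
  // A negative value is the bitwise complement of its magnitude.
  return (bits & 4u) ? -static_cast<int>(~bits & 7u) : static_cast<int>(bits);
}
```

For the all-ones pattern 0x7 these give -1, -3 and 0 (standing in for -0) respectively.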

Oops: I didn't pay close enough attention -- since it's defined as unsigned, it has to be treated as unsigned, leaving little wiggle room for getting out of its being a compiler bug.
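If you would rather not depend on how a given compiler build handles the increment, one option is to spell out the modulo arithmetic yourself. The following is only a sketch: Flags, next_wrapped and bump are invented names, the width check assumes a 32-bit unsigned int, and it has not been verified against the GCC version from the question.

```cpp
#include <cassert>

struct Flags
{
  unsigned int d:3;
};

// Hypothetical helper: next value of a `width`-bit unsigned counter,
// computed in a full-width unsigned int and masked explicitly.
unsigned int next_wrapped(unsigned int value, unsigned int width)
{
  assert(width > 0u && width < 32u);  // assumes a 32-bit unsigned int
  unsigned int mask = (1u << width) - 1u;
  return (value + 1u) & mask;
}

// Increment the 3-bit field with the wrap-around written out.
void bump(Flags &f)
{
  f.d = next_wrapped(f.d, 3u);
}
```

The masked value always fits in the field, so the assignment back into f.d never has to truncate anything.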

answered Sep 23 '22 at 15:09 by Jerry Coffin