Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is this a compiler optimisation bug, or an undefined behaviour?

We have an annoying bug I can't explain around this piece of code:

unsigned char bitmap[K_BITMAP_SIZE] = {0} ; SetBit(bitmap, K_18); // Sets the bit #18 to 1  for(size_t i = 0; i < K_END; ++i) {     if(TestBit(bitmap, i)) // true for 18     {         size_t i2 = getData(i); // for 18, will return 15         SetBit(bitmap, i2); // BUG: IS SUPPOSED TO set the bit #15 to 1     } } 
  1. It happens on Visual C++ 2010
  2. It happens both on 32-bit and 64-bit builds
  3. It happens only on Release builds (with "Maximize Speed (/O2)" set
  4. It does not happen only on Release builds with "Minimize Size (/O1)" set
  5. It happens on Visual C++ 2008 only if we __forceinline the getData function (by default, VC++2008 does not inline that function, while VC++2010 does)
  6. It happens on the piece of code given below, probably because massive inlining inside the loop
  7. It doesn't happen if we remove the loop, and directly set the interesting value (18)

Bonus info:

1- BenJ commented the issue does not appear on Visual C++ 2012, meaning this could well be a bug in the compiler

2- If we add a cast to unsigned char in the Test/Set/ResetBit functions, the bug disappears, too

size_t TestBit(const unsigned char * bits, size_t pos) { return (((bits)[(pos) >> 3]) &   (1 << (unsigned char)((pos) & 7))) ; } size_t SetBit(unsigned char * bits, size_t pos)        { return (((bits)[(pos) >> 3]) |=  (1 << (unsigned char)((pos) & 7))) ; } size_t ResetBit(unsigned char * bits, size_t pos)      { return (((bits)[(pos) >> 3]) &= ~(1 << (unsigned char)((pos) & 7))) ; } 

The question is:

Does this bug happens because our code relies on undefined behaviour, or is there some bug in the VC++2010 compiler?

The following source is self-sufficient, and can be compiled as such on your favorite compiler:

#include <iostream>   const size_t K_UNKNOWN              = (-1) ; const size_t K_START                = (0) ; const size_t K_12                   = (K_START + 12) ; const size_t K_13                   = (K_START + 13) ; const size_t K_15                   = (K_START + 15) ; const size_t K_18                   = (K_START + 18) ; const size_t K_26                   = (K_START + 26) ; const size_t K_27                   = (K_START + 27) ; const size_t K_107                  = (K_START + 107) ; const size_t K_128                  = (K_START + 128) ; const size_t K_END                  = (K_START + 208) ; const size_t K_BITMAP_SIZE          = ((K_END/8) + 1) ;   size_t TestBit(const unsigned char * bits, size_t pos) { return (((bits)[(pos) >> 3]) &   (1 << ((pos) & 7))) ; } size_t SetBit(unsigned char * bits, size_t pos)        { return (((bits)[(pos) >> 3]) |=  (1 << ((pos) & 7))) ; } size_t ResetBit(unsigned char * bits, size_t pos)      { return (((bits)[(pos) >> 3]) &= ~(1 << ((pos) & 7))) ; }   size_t getData(size_t p_value) {     size_t value = K_UNKNOWN;      switch(p_value)     {         case K_13:      value = K_12;        break;         case K_18:      value = K_15;        break;         case K_107:     value = K_15;        break;         case K_27:      value = K_26;        break;         case K_128:     value = K_12;        break;         default:        value = p_value;     break;     }      return value; }   void testBug(const unsigned char * p_bitmap) {     const size_t byte = p_bitmap[1] ;     const size_t bit  = 1 << 7 ;     const size_t value = byte & bit ;      if(value == 0)     {         std::cout << "ERROR : The bit 15 should NOT be 0" << std::endl ;     }     else     {         std::cout << "Ok : The bit 15 is 1" << std::endl ;     } }   int main(int argc, char * argv[]) {     unsigned char bitmap[K_BITMAP_SIZE] = {0} ;     SetBit(bitmap, K_18);      for(size_t i = 0; i < K_END; ++i)     {         if(TestBit(bitmap, i))         {             size_t i2 = getData(i);             SetBit(bitmap, i2);         }     }      testBug(bitmap) ;      return 0; } 

Some background info: Initially:

  1. the Test/Set/ResetBit functions were macros.
  2. the constants were defines
  3. the indices were either long or int (on Windows 32-bit, they have the same size)

If needed, I'll add a few more info (e.g. the generated assembler for both configurations, update on how g++ handle the problem), as soon as possible.

like image 484
paercebal Avatar asked Oct 08 '12 09:10

paercebal


People also ask

What is undefined behavior in programming?

In computer programming, undefined behaviour is defined as 'the result of compiling computer code which is not prescribed by the specs of the programming language in which it is written'.

What do you mean by compiler Optimisation?

In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power consumption (the last three being popular for portable computers).

How do I disable compiler optimization?

Use the command-line option -O0 (-[capital o][zero]) to disable optimization, and -S to get assembly file. Look here to see more gcc command-line options.


1 Answers

This is a code optimizer bug. It inlines both getData() and SetBit(). The combination appears to be fatal, it loses track of the value of 1 << ((pos) & 7) and always produces zero.

This bug does not occur on VS2012. A workaround is to force one of the functions to not get inlined. Given the code, you probably want to do that for getData():

__declspec(noinline) size_t getData(size_t p_value) {      // etc.. } 
like image 82
Hans Passant Avatar answered Sep 20 '22 17:09

Hans Passant