Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Crash in C++ code due to undefined behaviour or compiler bug?

I am experiencing strange crashes. And I wonder whether it is a bug in my code, or the compiler. When I compile the following C++ code with Microsoft Visual Studio 2010 as an optimized release build, it crashes in the marked line:

struct tup { int x; int y; };

class C 
{
public:
  struct tup* p;

  struct tup* operator--() { return --p; }
  struct tup* operator++(int) { return p++; }

  virtual void Reset() { p = 0;}
};

int main ()
{
  C c;
  volatile int x = 0;
  struct tup v1;
  struct tup v2 = {0, x};

  c.p = &v1;
  (*(c++)) = v2;

  struct tup i = (*(--c));   // crash! (dereferencing a NULL-pointer)
  return i.x;
}

Looking into the disassembly, it's obvious that it must crash:

int _tmain(int argc, _TCHAR* argv[])
{
00CE1000  push        ebp  
00CE1001  mov         ebp,esp  
00CE1003  sub         esp,0Ch  
  C c;
  volatile int x = 0;
00CE1006  xor         eax,eax  
00CE1008  mov         dword ptr [x],eax  
  struct tup v1;
  struct tup v2 = {0, x};
00CE100B  mov         ecx,dword ptr [x]  

  c.p = &v1;
  (*(c++)) = v2;
00CE100E  mov         dword ptr [ebp-8],ecx  

  struct tup i = (*(--c));
00CE1011  mov         ecx,dword ptr [x]  
00CE1014  mov         dword ptr [v1],eax  
00CE1017  mov         eax,dword ptr [ecx]  
00CE1019  mov         ecx,dword ptr [ecx+4]  
00CE101C  mov         dword ptr [ebp-8],ecx  
return i.x;
}
00CE101F  mov         esp,ebp  
00CE1021  pop         ebp  
00CE1022  ret  

At offset 00CE1008 it writes a 0 into x.

At offset 00CE100B it reads x (the 0) into ecx

At offset 00CE1017 it dereferences that 0-pointer.

I see two possible reasons:

  • Either there is some subtle (or not so subtle?) case of undefined behaviour in my code and the compiler "optimizes" this undefined behaviour into a crash.

  • or there is a compiler bug

Does anyone see what might cause the problem?

Thank you,

Jonas

EDIT: To address the comments regarding "pointer to invalid location"

If I change v1 to be struct tup v1[10]; and set c.p = &v1[0];, then there will be no pointer to an invalid location. But I can still observe the same behaviour. The disassembly looks marginally different, but there is still a crash and it is still caused by loading 0 into ecx and dereferencing it.

EDIT: Conclusion

So, probably it is a bug. I found out that the crash vanishes if I change

struct tup* operator--() { return --p; }

to

struct tup* operator--() { --p; return p; }

As bames53 tells us, the crash does not occur in VS2011 and concludes that it must have been fixed.

Nontheless, I decided to file that bug for two reasons:

  • The bug might still be present in VS2011. Maybe the optimizer just has changed in a way that my code doesn't trigger the bug anymore. (the bug seems to be very subtle, it doesn't occur when I remove the volative or the virtual void Reset())

  • I want to know if my workaround is a reliable way to rule out the crashes, or if code changes in other places can reintroduce the bug.

Here is the link:

https://connect.microsoft.com/VisualStudio/feedback/details/741628/error-in-code-generation-for-x86

like image 652
m3tikn0b Avatar asked May 10 '12 17:05

m3tikn0b


People also ask

What causes undefined Behaviour in C?

So, in C/C++ programming, undefined behavior means when the program fails to compile, or it may execute incorrectly, either crashes or generates incorrect results, or when it may fortuitously do exactly what the programmer intended.

What is undefined value in C?

C has no specific undefined value. A function that wants to return an undefined value might indicate failure. Sometimes -1 is failure, sometimes 0 is failure, sometimes 0 is success; one has to look up the documentation to know exactly which. For a pointer, the undefined value is often pointer 0, the NULL pointer.


2 Answers

The code is fine. It's a compiler bug.

The code *(c++) = v2 will post-increment c.p yielding the original value. That value was assigned in the previous line and is &v1. So, in effect, it does v1 = v2;, which is perfectly fine.

c.p now behaves as a one-past-the-end of a one element array that holds only v1, per §5.7p4 of the standard:

For the purposes of these operators [+ and -], a pointer to a nonarray object behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

Then *(--c) moves that pointer back to &v1 and dereferences it, which is also fine.

like image 182
R. Martinho Fernandes Avatar answered Oct 25 '22 19:10

R. Martinho Fernandes


It doesn't have to be UB or compiler bug. It can be neither due to the way that VS2010 was produced.

Strictly speaking, your program exhibits well-defined behaviour. However, that might only be according to the newest C++ Standard. VS2010 is only implemented against a draft Standard which may not have included this provision. If it did not, then your code is not UB, but VS is not incorrect to produce UB, as those were the requirements of the time it was made.

Of course, if it's been legal to treat stack objects as an array of one object in C++03, then it is a compiler bug.

Edit: If you still get the crash for an array as you state, then that is definitely a compiler bug.

like image 20
Puppy Avatar answered Oct 25 '22 17:10

Puppy