
How do I trace an intermittent crash that occurs only under the debugger, but is not caught by it?

I have an odd intermittent crash, occurring only under some circumstances, that I am having trouble solving, and I'm seeking SO's advice on how to tackle it.

The bug

At apparently random points, Windows shows the "[App] has stopped working" dialog. It is an APPCRASH in ntdll.dll, exception code 4000001f, exception offset 000a2562. Here's where it gets tricky: this only occurs when running the application under the debugger. However, the debugger does not catch this exception, and at the point where Windows shows this dialog, the IDE is not responding. This bug does not occur when running normally, i.e. not within the IDE debugger.

Screenshot of the Windows crash dialog

I can't reproduce it outside the debugger, so I can't run the program and attach once it has already crashed. I can't pause execution when Windows shows this dialog, since the IDE isn't responding. I can manually step through the code to see where it occurs, but there are several such places, and which one it hits is apparently random. For a while it occurred when showing a window (or new form), and for a while when creating a thread.

Edit: I have tracked it down to the IDE: if I pause on a breakpoint and click the Thread Status tab, the program will crash immediately with the above dialog even though it is, theoretically, paused. In this situation, the IDE remains responsive. This is really weird.

More information

I have just moved my development environment to VMware Fusion. The bug also occurs when running a build from my old (native Windows) computer on my new computer; it did not occur with the same EXE file on that old computer. This makes me wonder if it is related to Fusion or something in my new setup.

I am running:

  • Windows 7 Pro x64 on VMware Fusion 3.1.3 on OS X Lion 10.7.1, all fully updated. Fusion is running in "Full screen" mode on one of my screens.
  • A colleague running Windows 7 natively (not in a VM) does not encounter this issue. Nor did I on my old Vista computer.
  • Embarcadero RAD Studio 2010, fully updated (I hope; there are about five updates and getting them all in order is tricky). I have DDevExtensions 2.4.1 installed, and the latest IDE Fix Pack too; uninstalling both of these has no effect.
  • The application is written mostly in C++, with snippets of Delphi. It is 32-bit.
  • We use EurekaLog, but the exception is not caught by it either. (Normally, an exception would be caught first by the debugger, then by EurekaLog.)
  • Running a debug build (no EurekaLog, extra debug info etc., debug DCUs set to true) also reproduces it. However, the "Debug DCUs" option on the Delphi Linking page of the C++Builder project settings dialog seems to have no effect: I can't step into the VCL code and find the line that actually triggers the error.
  • Codeguard (which detects memory access errors, double frees, access in freed memory, buffer overruns, etc) reports nothing.
David asked Aug 22 '11 06:08


4 Answers

This has all the hallmarks of memory corruption. It only appears when you run under one particular environment, and it occurs at a different location each time. Both are classic symptoms.

The best way I know to debug this is to download the full FastMM and run with full debugging options enabled.
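To illustrate why a debugging memory manager helps with this kind of bug: such allocators typically surround each block with known guard values and check them later, so an overwrite is reported close to the block that was damaged rather than at some random later point. The following is not FastMM's implementation or API, only a minimal C++ sketch of the general guard-value technique. Every name in it (`guarded::alloc`, `guarded::check`, `guarded::release`, `kGuard`) is hypothetical, and it assumes a C++11 compiler.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Illustrative only: hypothetical names, not FastMM's API.
namespace guarded {

const std::uint64_t kGuard = 0xFDFDFDFDFDFDFDFDull;

// Bookkeeping stored immediately before the user data.
struct Header {
    std::uint64_t size;   // number of user-visible bytes
    std::uint64_t guard;  // guard value directly before the user data
};

// Allocates `size` bytes with a guard value on each side.
inline void* alloc(std::size_t size) {
    unsigned char* raw = static_cast<unsigned char*>(
        std::malloc(sizeof(Header) + size + sizeof(kGuard)));
    if (raw == nullptr) return nullptr;
    Header h;
    h.size = size;
    h.guard = kGuard;
    std::memcpy(raw, &h, sizeof h);
    std::memcpy(raw + sizeof(Header) + size, &kGuard, sizeof kGuard);
    return raw + sizeof(Header);
}

// Returns true if both guard values around the block are still intact.
inline bool check(const void* user) {
    const unsigned char* raw =
        static_cast<const unsigned char*>(user) - sizeof(Header);
    Header h;
    std::memcpy(&h, raw, sizeof h);
    if (h.guard != kGuard) return false;  // something wrote before the block
    std::uint64_t tail = 0;
    std::memcpy(&tail, raw + sizeof(Header) + static_cast<std::size_t>(h.size),
                sizeof tail);
    return tail == kGuard;                // false: something wrote past the end
}

// Checks the guards, frees the block, and reports whether it was intact.
inline bool release(void* user) {
    const bool intact = check(user);
    std::free(static_cast<unsigned char*>(user) - sizeof(Header));
    return intact;
}

}  // namespace guarded
```

Calling `guarded::check` at suspicious points, or relying on the check in `guarded::release`, narrows down which block was overwritten. It cannot say who did the overwriting, which is why isolating code by removal may still be needed.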

If that doesn't help then you are reduced to removing parts of code, one by one, until you can isolate the problem.

Another issue I have seen in D2010 arises when mixing local class definitions (i.e. a class inside a class) with generics. The generated code is fine, but the debug DCUs are wrong, and when stepping through the code the debugger jumps to the wrong file and dies shortly after. You don't seem to have quite the same problem, but there are similarities in the IDE deaths.

Finally I would advise you to suspect your own code rather than VMware. It's always tempting to blame something else but in my experience, whenever I have done so, it was always my code in the end!

David Heffernan answered Nov 10 '22 16:11


I hit a very similar problem. I was developing a DLL, and whenever I set a breakpoint anywhere in my code, Delphi stopped at the source line and the host application crashed immediately.

Closing the "Thread Status" window in the debug layout "fixed" the problem. I'm working on Windows 7 64-bit and Delphi XE3.

TmTron answered Nov 10 '22 17:11


4000001F is STATUS_WX86_BREAKPOINT

In other words, it is INT 3, which was not handled by the IDE.

Since it is raised in NTDLL, I would guess that this indicates memory corruption in the system heap. Remember, some Windows code switches to a debug version when running under a debugger. That is why you cannot reproduce this when the application runs standalone, outside the debugger: the breakpoint is not generated.

You may try FastMM in full debug mode, but I do not think it will help you. The corruption does not happen in your memory; it happens in system memory. Yes, perhaps the memory allocation scheme will change and the corruption will then reveal itself in your own code or memory... maybe. Try using top-down allocations, or try SafeMM...

Another possible approach would be using Application Verifier.
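If the corruption is indeed in a system heap, one more low-tech aid is to ask Windows to validate the process heaps at chosen checkpoints and narrow down the first point at which validation fails. This is only a sketch under assumptions: `count_corrupt_heaps` is a hypothetical helper name, the code assumes a C++11 compiler (an older compiler may need `NULL` instead of `nullptr`), and it uses the Win32 functions `GetProcessHeaps` and `HeapValidate`. On non-Windows builds it simply reports that the check is unavailable.

```cpp
#include <cstddef>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Returns the number of process heaps that fail validation,
// or -1 if the check could not be performed.
inline int count_corrupt_heaps() {
#ifdef _WIN32
    // First call with no buffer: returns the number of heaps.
    const DWORD needed = GetProcessHeaps(0, nullptr);
    if (needed == 0) return -1;

    std::vector<HANDLE> heaps(needed);
    const DWORD got = GetProcessHeaps(needed, heaps.data());
    if (got == 0 || got > needed) return -1;  // heap list changed in between

    int corrupt = 0;
    for (DWORD i = 0; i < got; ++i) {
        // A null block pointer asks for validation of the entire heap.
        if (!HeapValidate(heaps[i], 0, nullptr)) ++corrupt;
    }
    return corrupt;
#else
    return -1;  // not available outside Windows
#endif
}
```

Sprinkling calls to this helper between suspect operations (for example before and after showing a form or creating a thread) may help bracket the operation that damages the heap. Note that validating an already corrupted heap under a debugger can itself raise a breakpoint.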

See also:

  • Windows has generated a breakpoint
  • C++ error on Ms Visual Studio: "Windows has triggered a breakpoint in javaw.exe"
  • http://blogs.msdn.com/b/oldnewthing/archive/2012/01/25/10260334.aspx
  • http://blogs.msdn.com/b/oldnewthing/archive/2013/12/27/10484882.aspx
Alex answered Nov 10 '22 17:11


Check the project's .dsk file and make sure it does not contain a reference pointing to the wrong unit. The fix is to open the .dsk file in an editor and change the file location to the correct one.

House of Dexter answered Nov 10 '22 15:11