I've used minidumps on many game projects over the years and they seem to have about a 50% chance of having a valid call stack. What can I do to make them have better call stacks? I've tried putting the latest dbghelp.dll in the exe directory. That seems to help some. Is Visual Studio 2008 or 2010 any better? (I'm still on VS 2005). The code I use looks like this sample.


Why don't Minidumps give good call stacks?

2 Answers

One thing you can do to improve the accuracy of call stacks found in dumps is to use a debugger other than Visual Studio -- specifically, use WinDbg or another tool that uses the "Windows Debugger" debugging engine found in dbgeng.dll (as opposed to the "Visual Studio Debugger" debugging engine that Visual Studio uses).

In our experience, WinDbg is 100% reliable in producing good call stacks from the same dumps where Visual Studio produces unusable or wildly inaccurate call stacks. From what I can tell, in cases where an unhandled exception is the source of the crash, WinDbg automatically performs the tricky process of reconstructing/recovering the exception call stack, but Visual Studio does not (or cannot?). The two debuggers use different heuristics for interpreting stacks.

WinDbg can be daunting at first, so here's my quick guide on how to make it easier or even avoid having to use it directly.

A Mere Mortal's Guide To Extracting Good Callstacks

These are ordered from "fastest/easiest" to "slowest/most cryptic to interpret".

Easiest Option: use DebugDiag from Microsoft

This is a little-known tool that automates a lot of analysis of common problems, and it's simple enough to give to non-programmers or even customers. It's fast and nearly foolproof, and has become my "go to" tool for quickly analyzing an incoming crash dump.

Launch the "DebugDiag Analysis" application
Select the "CrashHangAnalysis" checkbox on the main page
Drag-and-drop your dump into the "Data files" pane on the main page
Click "Start Analysis"

After a few seconds to a few minutes it will spit out a nice .mhtml file containing an analysis of the problem, info about all the related threads, complete call stacks, etc., all hyperlinked and easy to use.

DebugDiag even automates some of the more complicated analysis that is possible but painful in WinDbg (like tracking down which of the 350 threads in your application is responsible for a deadlock).

Note: Chrome will not download or open .mhtml files for security reasons, so you must open the report in Internet Explorer or Microsoft Edge for it to be usable. This is annoying, and I've filed a request with the DebugDiag team (dbgdiag@microsoft.com) to change the format to plain HTML.

Middle option: Install WinDbg as an alternate debugging engine for Visual Studio

Install Visual Studio if it's not yet installed. This needs to be done before the next step.
Install the Windows Driver Kit (WDK)
Launch Visual Studio, and (this part is important!) use the new "File -> Open -> Crash Dump..." option to open the dump. This will debug the crash dump using the Windows Debugger (if you instead drag-and-drop the dump on Visual Studio or use the standard "File -> Open -> File..." option to open the dump, it will debug it using the old Visual Studio debugging engine... so be careful to use the right option).
You should now be able to see the correct call stack and navigate around using the Visual Studio GUI, although some things work differently (the watch windows require using the unfamiliar WinDbg syntax, thread IDs are different, etc). Note: the Visual Studio UI may be very sluggish, especially if many threads are involved and the 'threads' or 'parallel stacks' windows are open.

Hardcore option: Use WinDbg directly

Launch WinDbg.exe
Drag-and-drop your dump into the WinDbg window
Type !analyze -v and press Enter. After a little bit of time WinDbg will spit out a crash call stack, and also its estimation of what the source of the problem is. If you're analyzing a deadlock, you can try !analyze -v -hang and WinDbg will often show you the dependency chain involved.
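If you'd rather not open the GUI at all, the same analysis can be scripted. Here's a minimal sketch, assuming the console debugger cdb.exe (it ships in the same Debugging Tools for Windows package as WinDbg) is on your PATH. The helper names `build_cdb_command` and `analyze_dump` are made up for illustration, and actually running the analysis only works on Windows.

```python
import subprocess


def build_cdb_command(dump_path, hang=False, cdb_exe="cdb.exe"):
    """Build a cdb command line that analyzes a dump and then quits.

    Assumption: cdb.exe is reachable as `cdb_exe`; adjust to your install path.
    """
    analyze = "!analyze -v -hang" if hang else "!analyze -v"
    # -z opens a crash dump file; -c runs debugger commands at startup.
    # The trailing "q" quits the debugger so the process exits on its own.
    return [cdb_exe, "-z", dump_path, "-c", f"{analyze}; q"]


def analyze_dump(dump_path, hang=False):
    """Run the analysis and return cdb's text output (Windows only)."""
    result = subprocess.run(
        build_cdb_command(dump_path, hang),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `analyze_dump(r"C:\dumps\crash.dmp")` (a hypothetical path) should give you the same text that `!analyze -v` prints in the WinDbg command window, which is handy for batch-processing a folder of incoming dumps.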

At this point you may have all the info you need! However, if you then want to examine the process state in the Visual Studio debugger you can take the following additional steps:

Open the crash dump in Visual Studio
Right-click in the callstack window and choose "Go to Disassembly"
Paste the hex address from the top line of WinDbg's output callstack into the "Address" bar of the Disassembly window and press enter. You're now at the location of the crash, looking at the disassembled code.
Right-click in the disassembly window and choose "Go To Source Code" to go to the source code for the location. Now you're looking at the source code at the crash site.

Note: all of the above require having correct symbol server paths configured, otherwise you won't be able to resolve the symbols in the call stacks. I recommend setting the _NT_SYMBOL_PATH environment variable so that it's automatically available to Visual Studio, WinDbg, and DebugDiag.
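For reference, a symbol path entry for a symbol server has the form `srv*<local cache>*<server>`, and multiple entries are separated by semicolons. The sketch below just composes such a string; the cache folder `C:\SymbolCache` and the helper name `build_symbol_path` are assumptions for illustration, and the URL is Microsoft's public symbol server.

```python
import os

MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def build_symbol_path(cache_dir, servers=(MS_SYMBOL_SERVER,), local_dirs=()):
    """Compose a value for _NT_SYMBOL_PATH.

    Plain directories come first, then one "srv*<cache>*<server>" entry
    per symbol server; entries are joined with ";".
    """
    entries = list(local_dirs)
    entries += [f"srv*{cache_dir}*{server}" for server in servers]
    return ";".join(entries)


if __name__ == "__main__":
    # Hypothetical cache folder. Setting os.environ only affects this
    # process and anything it launches, not the whole machine.
    os.environ["_NT_SYMBOL_PATH"] = build_symbol_path(r"C:\SymbolCache")
    print(os.environ["_NT_SYMBOL_PATH"])
```

To make the setting permanent you would still set the variable in the system environment settings; the script is only meant to show what the value looks like.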

answered Nov 20 '22 21:11

Chris Kline

What's missing from your call stack? Do you have a bunch of addresses that don't resolve to valid function names (i.e., 0x8732ae00 instead of CFoo::Bar())? If so, then what you need is to put your .PDBs where your debugger can find them, or set up a symbol server and set the "Symbol Paths" in the right-click context menu of the Modules pane.

We store every .PDB from every binary every time someone checks in a new Perforce changelist, so that when a dump comes back from anyone inside the office or any customer at retail, we have the .PDB corresponding to the version of the game they were running. With the symbol server and paths set, all I have to do is just double-click the .mdmp and it works every time.
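One way to do this kind of per-changelist archiving is with symstore.exe from Debugging Tools for Windows. This is only a sketch and not necessarily how the setup described above works; the helper name, paths, and product name are hypothetical, and the function only builds the command line rather than running it.

```python
def build_symstore_command(build_dir, store_dir, product, changelist,
                           symstore_exe="symstore.exe"):
    """Build a `symstore add` command that archives one build's symbols.

    Assumption: symstore.exe is on PATH and `store_dir` is the root of a
    symbol store that your debuggers' symbol path points at.
    """
    return [
        symstore_exe, "add",
        "/r",                      # recurse into subdirectories
        "/f", build_dir,           # files or directory to add
        "/s", store_dir,           # root of the symbol store
        "/t", product,             # product name
        "/v", str(changelist),     # version string for this transaction
        "/c", f"changelist {changelist}",  # free-form comment
    ]


if __name__ == "__main__":
    # Hypothetical values for a build machine's post-build step.
    cmd = build_symstore_command(r"D:\build\bin", r"\\server\symbols",
                                 "MyGame", 123456)
    print(" ".join(cmd))
```

A build script could pass the resulting list to `subprocess.run` after each successful build, so that every shipped binary has a matching .PDB in the store.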

Or do you have a call stack that appears to only have one function in it? Like, 0x8538cf00 without anything else above it in the stack? If so, then your crash is actually the stack itself being corrupted. If the return addresses in the backchain have been overwritten, naturally the debugger will be unable to resolve them.
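To see why a trashed stack yields a one-entry call stack, here is a toy model of a frame-pointer walk. It is a deliberate simplification and not how any real debugger is implemented: `memory` is a made-up dictionary standing in for the stack bytes in the dump, and the layout (saved frame pointer at the frame address, return address 4 bytes above it) is an assumed 32-bit convention.

```python
def walk_stack(memory, frame_ptr, return_addr_offset=4, max_frames=64):
    """Follow saved frame pointers, collecting return addresses.

    `memory` maps address -> 32-bit value. The walk stops as soon as the
    next frame pointer does not refer to readable memory.
    """
    frames = []
    while frame_ptr in memory and len(frames) < max_frames:
        ret = memory.get(frame_ptr + return_addr_offset)
        if ret is None:
            break
        frames.append(ret)
        frame_ptr = memory[frame_ptr]  # saved frame pointer of the caller
    return frames


# Three intact frames: each slot pair is (saved frame pointer, return address).
healthy = {
    0x1000: 0x1010, 0x1004: 0x401000,
    0x1010: 0x1020, 0x1014: 0x402000,
    0x1020: 0x0000, 0x1024: 0x403000,
}

# Simulate a buffer overrun that overwrote the innermost frame.
corrupted = dict(healthy)
corrupted[0x1000] = 0x41414141
corrupted[0x1004] = 0x41414141

print([hex(a) for a in walk_stack(healthy, 0x1000)])
print([hex(a) for a in walk_stack(corrupted, 0x1000)])
```

With the intact stack the walk recovers all three return addresses; with the overwritten frame it yields a single bogus address and then has nowhere to go, which is the symptom described above.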

Sometimes you'll also find that the thread that actually emits the minidump is not the one that threw the exception that caused the crash. Look in the Threads window to see if one of the other threads has the offending code in it.

If you are debugging a "Release" build -- that is to say, one compiled with all optimization flags turned on -- you will have to live with the fact that the debugger will have trouble finding local variables and some other data. This is because turning on optimizations means allowing the compiler to keep data in registers, collapse calculations, and generally do a variety of things that prevent data from ever actually being written to the stack. If this is your problem, then you'll need to open up the disassembly window and chase the data by hand, or rebuild a debug binary and reproduce the problem where you can look at it.

answered Nov 20 '22 22:11

Crashworks

