
What possibilities are there for post mortem analysis in .NET (e.g. after a crash of a program)?

Let's suppose there is a C# program that runs as a Windows service. Suppose the service has gone wild and is consuming CPU and memory like mad. It needs to be restarted very soon because it's a production system, so I don't have much time to gather run-time information. Maybe a quick look at the Task Manager ... that's all.

After that, all I have for post mortem analysis are the log4net log files and the Windows event log.

Suppose that I have found the reason for the problem. Someone else fixes it, and maybe the programmer adds some additional logging so I can find a similar problem faster next time. Nevertheless, I still depend on the quality of the log files and hope that next time a problem will somehow reveal itself in the logs.

Are there other ways to do post mortem analysis? Maybe something like thread dumps (as in Java), memory dumps, or something else that may aid in post mortem analysis? Maybe some built-in .NET Framework tool can help?

I am very interested in real project experiences and how you would try to tackle this maintenance question, which I think is very real for most programmers.

Theo Lenndorff asked Jan 18 '09 13:01



1 Answer

As Marc says, WinDbg + SoS will let you debug a lot of problems that you can't really address in Visual Studio. There are some excellent tutorials on this blog.

For memory issues you can also look at the .NET performance counters in Perfmon. They show where objects are located (which generation) and how much time is spent in garbage collection. That should give you some useful information. If you want to know why objects are not being collected, WinDbg and SoS are the way to go. To walk you through a simple session, the steps are:

  1. Inspect the heap using !dumpheap -stat and look for types with a large number of instances. You probably have some idea of what you would expect to find on the heap at any given moment, so if anything looks out of the ordinary, look into that.

  2. Pick an instance of a suspicious type and do a !gcroot on the address of the instance. That shows the references keeping the object alive, which tells you why it is not being collected.

  3. Repeat for other suspicious instances.
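The steps above can be sketched as a WinDbg session along the following lines. This is only an illustration: it assumes you are attached to the process or have opened a full memory dump of it, and `<MethodTable>` and `<ObjectAddress>` are placeholders for values you copy from the previous command's output. Which SoS load command applies depends on the runtime version in the dump.

```
$$ Load the SoS extension that matches the runtime in the process or dump.
$$ .NET 2.0 to 3.5 (use ".loadby sos clr" instead for .NET 4 and later):
.loadby sos mscorwks

$$ Step 1: per-type statistics (MT, Count, TotalSize, Class Name).
!dumpheap -stat

$$ List the individual instances of one suspicious type.
$$ <MethodTable> is the MT value from the !dumpheap -stat output.
!dumpheap -mt <MethodTable>

$$ Step 2: show the root chains that keep one instance alive.
$$ <ObjectAddress> is an address from the previous listing.
!gcroot <ObjectAddress>
```

The intermediate `!dumpheap -mt` call is there only to obtain an instance address for `!gcroot`; the statistics view by itself does not list addresses.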

Likely candidates for keeping objects alive longer than they should be are events, statics, and the finalizer queue, to name a few.
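To make the event case concrete, here is a minimal sketch of how a subscription can keep an object alive. The `Publisher`, `Subscriber` and `Program` types are hypothetical and not taken from the question; the point is only that a long-lived object's event holds a delegate, and the delegate holds the subscriber.

```csharp
using System;

// Hypothetical long-lived object, e.g. a singleton held by the service.
class Publisher
{
    public event EventHandler SomethingHappened;

    public void Raise()
    {
        EventHandler handler = SomethingHappened;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

// Hypothetical short-lived object that should be collectable after use.
class Subscriber
{
    public void Attach(Publisher p) { p.SomethingHappened += OnSomething; }
    public void Detach(Publisher p) { p.SomethingHappened -= OnSomething; }

    private void OnSomething(object sender, EventArgs e) { }
}

static class Program
{
    // Static field: a GC root for the publisher and, through its event's
    // delegate list, for every subscriber that is still attached.
    static readonly Publisher publisher = new Publisher();

    // Creates a subscriber in its own method so that no local variable
    // in Main references it, and returns only a weak reference.
    public static WeakReference Subscribe(bool detach)
    {
        Subscriber s = new Subscriber();
        s.Attach(publisher);
        if (detach) s.Detach(publisher);
        return new WeakReference(s);
    }

    static void Main()
    {
        WeakReference leaked = Subscribe(false);
        WeakReference released = Subscribe(true);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.WriteLine("still subscribed, alive: " + leaked.IsAlive);
        Console.WriteLine("unsubscribed, alive: " + released.IsAlive);
    }
}
```

The subscriber that never detaches stays reachable through the static publisher, so a !gcroot on it would be expected to lead back to the publisher's event delegate. The detached one is normally collected, although the exact timing depends on the runtime and build settings. For the finalizer queue, SoS also provides !finalizequeue.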

You may also want to take a look at my answer to this question to see more WinDbg stuff.

Brian Rasmussen answered Sep 29 '22 18:09
