
How to analyse a crash dump file using GDB

I have a server application running under CentOS. The server answers many requests per second, but it crashes every hour or so and creates a crash dump file each time. The situation is really bad and I need to find the cause of the crash as soon as possible.

I suspect that it is a concurrency problem, but I'm not sure. I have access to the source code and the crash dump files, but I don't know how to use the crash dumps to pinpoint the problem.

Any suggestions are much appreciated.

red.clover asked Sep 26 '09 20:09




1 Answer

The first thing to look for is the error message that you get when the program crashes. This will often tell you what kind of error occurred. For example, "segmentation fault" or "SIGSEGV" almost certainly means that your program has dereferenced a NULL or otherwise invalid pointer. If the program is written in C++, the error message will often tell you the type of any uncaught exception.

If you aren't seeing the error message, run the program from the command line, or redirect its output (including standard error) into a file.
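As a rough sketch of capturing that output, assuming a POSIX shell: run_logged and the log file name below are hypothetical, made up for illustration. It appends both output streams to a log and notes when the process appears to have been killed by a signal, relying on the shell convention of reporting such a death as 128 plus the signal number.

```shell
# Hypothetical helper: run a command, append stdout and stderr to a log,
# and note in the log when the exit status suggests death by signal.
run_logged() {
    log=$1
    shift
    "$@" >>"$log" 2>&1
    rc=$?
    if [ "$rc" -gt 128 ]; then
        # By shell convention 128 + N means "killed by signal N"
        # (139 = 128 + 11 = SIGSEGV). A program could also exit with
        # such a value on purpose, so treat this as a hint only.
        echo "possibly killed by SIG$(kill -l "$rc") (exit status $rc)" >>"$log"
    fi
    return "$rc"
}

# Usage (YOUR-APP is a placeholder, as in the answer):
#   run_logged server.log ./YOUR-APP
```

An exit status of 139 in the log would point at a segmentation fault, and 134 (SIGABRT) is what an uncaught C++ exception typically produces.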

For a core file to be really useful, you need to compile your program with debugging information and without optimisation. With GCC, use the options -g -O0. (Make sure your build doesn't pass any other -O options.)
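One way to make sure no stray -O option survives is to filter the existing flags before handing them to the compiler. This is only a sketch: debug_cflags is a hypothetical helper, and main.c in the usage line is an assumed source file name.

```shell
# Hypothetical helper: drop every -O* option from the given flags,
# then append -g -O0 so the build has debug info and no optimisation.
debug_cflags() {
    out=""
    for flag in "$@"; do
        case $flag in
            -O*) ;;                     # discard -O1, -O2, -O3, -Os, ...
            *)   out="$out$flag " ;;    # keep everything else, in order
        esac
    done
    printf '%s\n' "${out}-g -O0"
}

# Usage (unquoted on purpose so the flags are split into words):
#   gcc $(debug_cflags -Wall -O2) -o YOUR-APP main.c
```

GCC honours the last -O option on the command line, so appending -O0 at the end would also work by itself; stripping the others just makes the intent obvious in the build log.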

Once you have the core file, open it in gdb with:

gdb YOUR-APP COREFILE

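Type where (an alias for backtrace) to see the call stack at the point where the crash occurred. From there it is much like a normal debugging session: you can examine variables, move up and down the stack, and switch between threads. The one thing you cannot do is resume execution, because the process is no longer running.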
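Since the suspicion here is a concurrency bug, it may help to dump the backtrace of every thread in one go rather than stepping through them by hand. The sketch below uses gdb's -batch and -ex options to run the commands non-interactively; dump_backtraces is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: list the threads in a core file and print each
# thread's backtrace without entering the interactive gdb prompt.
dump_backtraces() {
    app=$1
    core=$2
    gdb -batch \
        -ex "info threads" \
        -ex "thread apply all bt" \
        "$app" "$core"
}

# Usage:
#   dump_backtraces ./YOUR-APP COREFILE > backtraces.txt
```

Saving one such file per crash makes it easy to compare several dumps and see whether the same threads and functions keep showing up.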

If your program has crashed, then it's probably because of an invalid memory access, so you need to look for a pointer that is zero, or that points to bad-looking data. You might not find the problem in the innermost frame (frame 0, where the crash happened); you may have to move up the stack a few levels before you find it.
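To look at several stack levels at once, gdb's bt full prints the local variables of each frame alongside the backtrace, which makes zero or garbage pointers easier to spot. A minimal sketch, where inspect_frames and its default depth of 5 are assumptions for illustration:

```shell
# Hypothetical helper: show the innermost frames of the crashing thread
# together with their local variables.
inspect_frames() {
    app=$1
    core=$2
    depth=${3:-5}   # how many innermost frames to print (assumed default)
    gdb -batch -ex "bt full $depth" "$app" "$core"
}

# Usage:
#   inspect_frames ./YOUR-APP COREFILE 8
```

In an interactive session the equivalent would be to use up (or frame N) to move to a caller, then info locals and print to examine individual variables.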

Good luck!

alex tingle answered Nov 04 '22 23:11
