How to debug a segmentation fault while the gdb stack trace is full of '??'?

My executable contains a symbol table, but the stack trace seems to have been overwritten.

How can I get more information out of that core? For instance, is there a way to inspect the heap and see the object instances populating it, to get some clues? Any idea is appreciated.

597

asked Mar 10 '10 17:03

yves Baumes

5 Answers

I am a C++ programmer for a living and I have encountered this issue more times than I like to admit. Your application is smashing a HUGE part of the stack. Chances are the function that is corrupting the stack is also crashing on return, because the return address has been overwritten. That is also why GDB's stack trace is messed up.

This is how I debug this issue:

1) Step through the application until it crashes. (Look for a function that is crashing on return.)

2) Once you have identified the function, declare a variable at the VERY FIRST LINE of the function:

int canary=0;

(It must be the first line so that this value sits at the very top of the function's stack frame. This "canary" will be overwritten before the function's return address.)

3) Put a variable watch on canary and step through the function; when canary != 0, you have found your buffer overflow! Another possibility is to set a variable breakpoint for when canary != 0 and just run the program normally. This is a little easier, but not all IDEs support variable breakpoints.
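The three steps above can be sketched roughly as follows. This is only an illustration: copy_name, its parameters and the 16-byte buffer are hypothetical names, not taken from the question. It assumes a build without optimization (for example g++ -g -O0), since the compiler is otherwise free to reorder or remove locals. The copy is bounded here so that the sketch itself stays safe to run; in the buggy function you are hunting, the unchecked write is what would trip the watchpoint.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical function under suspicion; all names are illustrative.
// Assumed build: g++ -g -O0 canary.cpp
bool copy_name(const char* src, std::size_t len) {
    volatile int canary = 0;  // declared first; volatile so it is not optimized away
    char buffer[16];

    // In gdb, stop on the first line of this function, then:
    //   (gdb) watch canary
    //   (gdb) continue
    // gdb stops at the statement that modifies canary.

    // Bounded copy; an unchecked copy of len bytes is the kind of bug
    // the canary is meant to expose.
    std::size_t n = len < sizeof(buffer) - 1 ? len : sizeof(buffer) - 1;
    std::memcpy(buffer, src, n);
    buffer[n] = '\0';

    return canary == 0;  // false would mean something overwrote the canary
}
```

If your debugger supports conditional watchpoints, "watch canary if canary != 0" in gdb is one way to get the "variable breakpoint" behaviour described in step 3.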

EDIT: I have talked to a senior programmer at my office: in order to understand the core dump, you need to resolve the memory addresses it contains. One way to figure out these addresses is to look at the MAP file for the binary, which is human readable. Here is an example of generating a MAP file using gcc:

gcc -o foo -Wl,-Map,foo.map foo.c

This is a piece of the puzzle, but it will still be very difficult to obtain the address of the function that is crashing. If you are running this application on a modern platform, ASLR will probably make the addresses in the core dump much harder to use. Some implementations of ASLR randomize the function addresses of your binary, so the addresses in the core dump no longer match the ones in the MAP file.

190

answered Sep 28 '22 22:09

rook

1. You have to use some debugger to detect this; valgrind is OK.

2. While compiling your code, make sure you add the -Wall option, so the compiler tells you about possible mistakes (make sure you don't have any warnings in your code).

ex: gcc -Wall -g -c -o oke.o oke.c

3. Make sure you also have the -g option to produce debugging information. You can also print debugging information yourself using some predefined macros. The following macros are very useful for me:

__LINE__ : tells you the line

__FILE__ : tells you the source file

__func__ : tells you the function
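As a small sketch of how these three macros might be combined, the helper below builds a "file:line (function): message" string. The names format_trace, TRACE and load_config are made up for illustration; they are not part of any library.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats "file:line (function): message".
std::string format_trace(const char* file, int line, const char* func,
                         const std::string& msg) {
    std::ostringstream out;
    out << file << ':' << line << " (" << func << "): " << msg;
    return out.str();
}

// A macro is used so that __FILE__, __LINE__ and __func__ expand at the
// call site rather than inside format_trace.
#define TRACE(msg) format_trace(__FILE__, __LINE__, __func__, (msg))

// Illustrative caller: the returned string names this function.
std::string load_config() {
    return TRACE("entering");
}
```

Printing such a string to stderr at the entry and exit of suspect functions gives a rough trail of where the program was before the crash, even when the stack trace itself is unusable.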

Using the debugger is not enough, I think; you should also get used to making the most of the compiler's abilities.

Hope this helps.

answered Sep 30 '22 22:09

deddihp

TL;DR: extremely large local variable declarations in functions are allocated on the stack, which, on certain platform and compiler combinations, can overrun and corrupt the stack.

Just to add another potential cause to this issue. I was recently debugging a very similar issue. Running gdb with the application and core file would produce results such as:

Core was generated by `myExecutable myArguments'.
Program terminated with signal 6, Aborted.
#0  0x00002b075174ba45 in ?? ()
(gdb)

That was extremely unhelpful and disappointing. After hours of scouring the internet, I found a forum that talked about how the particular compiler we were using (Intel compiler) had a smaller default stack size than other compilers, and that large local variables could overrun and corrupt the stack. Looking at our code, I found the culprit:

void MyClass::MyMethod() {
   ...
   char charBuffer[MAX_BUFFER_SIZE];
   ...

}

Bingo! I found MAX_BUFFER_SIZE was set to 10000000, thus a 10MB local variable was being allocated on the stack! After changing the implementation to use a shared_ptr and create the buffer dynamically, suddenly the program started working perfectly.
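For illustration, here is a minimal sketch of moving such a buffer off the stack. It uses std::vector rather than the shared_ptr mentioned above, simply because that is the shortest way to show the idea; fill_buffer is a hypothetical stand-in for the real method, and the constant just mirrors the value quoted above.

```cpp
#include <cstddef>
#include <vector>

// Illustrative value matching the size mentioned in the answer.
constexpr std::size_t MAX_BUFFER_SIZE = 10000000;

// Hypothetical stand-in for the method body: the 10MB of storage lives on
// the heap, and only the small vector object itself sits on the stack.
std::size_t fill_buffer(char value) {
    std::vector<char> charBuffer(MAX_BUFFER_SIZE, value);
    // charBuffer.data() yields a char* for APIs that expect a raw buffer.
    return charBuffer.size();
}
```

The storage is released automatically when charBuffer goes out of scope, so no explicit cleanup is needed.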

answered Sep 27 '22 22:09

Ogre Psalm33

Try running with Valgrind memory debugger.

answered Sep 29 '22 22:09

Tronic

To confirm: was your executable compiled in release mode, i.e. with no debug symbols? That could explain why there are '??' entries. Try recompiling with the -g switch, which 'includes debugging information and embedding it into the executable'. Other than that, I am out of ideas as to why you have '??'.

answered Sep 28 '22 22:09

t0mm13b
