
When using a coredump in gdb how do I know exactly which thread caused SIGSEGV? [duplicate]

My application uses more than 8 threads. When I run info threads in gdb I see the threads and the last function they were executing. It does not seem obvious to me exactly which thread caused the SIGSEGV. Is it possible to tell it? Is it thread 1? How are the threads numbered?

russoue asked Jan 05 '15 20:01




1 Answer

When you load a core dump in gdb, gdb selects the thread that received the fatal signal and stops at the function in which the crash occurred. The current thread is therefore the culprit. Take the following program as an example:

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>   /* sleep() */
void *thread_func(void *p_arg)
{
        while (1) {
                printf("%s\n", (char*)p_arg);   /* crashes when p_arg is NULL */
                sleep(10);
        }
}
int main(void)
{
        pthread_t t1, t2;

        /* The first thread gets a valid string, the second gets NULL. */
        pthread_create(&t1, NULL, thread_func, "Thread 1");
        pthread_create(&t2, NULL, thread_func, NULL);

        sleep(1000);
        return 0;
}

The t2 thread crashes the program because it dereferences a NULL pointer. After the crash, load the core dump file in gdb:

[root@localhost nan]# gdb -q a core.32794
Reading symbols from a...done.
[New LWP 32796]
[New LWP 32795]
[New LWP 32794]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./a'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
(gdb)

gdb stops in the __strlen_sse2 function, which means the crash happened there. Use the bt command to see the call chain that led to it, and the i threads command to see which thread it belongs to:

(gdb) bt
#0  0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
#1  0x00000034e4268cdb in puts () from /lib64/libc.so.6
#2  0x00000000004005cc in thread_func (p_arg=0x0) at a.c:7
#3  0x00000034e4a079d1 in start_thread () from /lib64/libpthread.so.0
#4  0x00000034e42e8b6d in clone () from /lib64/libc.so.6
(gdb) i threads
  Id   Target Id         Frame
  3    Thread 0x7ff6104c1700 (LWP 32794) 0x00000034e42accdd in nanosleep () from /lib64/libc.so.6
  2    Thread 0x7ff6104bf700 (LWP 32795) 0x00000034e42accdd in nanosleep () from /lib64/libc.so.6
* 1    Thread 0x7ff60fabe700 (LWP 32796) 0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6

The bt command shows the stack frames of the current thread (the culprit). The "i threads" (info threads) command lists all the threads; the thread whose number is marked with * is the current thread.
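To compare the crashing thread with the others, the same inspection can be scripted. The following is a minimal sketch of a gdb command file. The file name threads.gdb is hypothetical, and the thread numbers 1 and 2 are taken from the session above, so they will likely differ for another core file.

```
# threads.gdb (hypothetical file name)
# List every thread; the one marked with * is the thread gdb selected.
info threads
# Print a backtrace for every thread in the core dump.
thread apply all bt
# Switch to another thread (number taken from the listing above) and inspect it.
thread 2
bt
# Switch back to the thread that received the signal in this example.
thread 1
```

It could be run non-interactively with something like gdb -q -batch -x threads.gdb a core.32794, or the commands can be typed one by one at the (gdb) prompt.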

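As for "How are the threads numbered?": gdb assigns its own thread numbers in the order in which it discovers the threads, so the numbering depends on the OS and on how the core file was written. The Target Id column shows the OS-level identifiers (here, the LWP). Rely on the * marker rather than on the number to find the crashing thread, and refer to the gdb manual for more information.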

Nan Xiao answered Sep 22 '22 11:09