I have a program in which the main thread creates lots of threads. It crashed, and I'm debugging the core file. The crash happened in one of the child threads. To find the cause, I need to know whether the main thread is still alive. Is there any way to find out which thread was the initial one?
When there are 100s of threads, I use the following technique to look through them:
(gdb) shell rm gdb.txt
(gdb) set logging on # GDB output will go to gdb.txt
(gdb) thread apply all where
Now load gdb.txt into your editor or pager of choice, look for main, etc.
As a general approach for UNIX-based systems, the accepted answer works as expected.
On Linux (and OSes that chose a similar POSIX threads implementation strategy), identifying the main thread can be much more straightforward, because the main thread's LWP ID is equal to the PID of the process. Typically, the file name of a core dump contains the PID of the faulting process (e.g. core.<pid>) unless the core pattern (/proc/sys/kernel/core_pattern) was changed. With that, you can reliably determine the main thread using thread find <pid>:
$ gdb executable core.24533
[...]
(gdb) thread find 24533
Thread 7 has target id 'Thread 0x7f8ae2169740 (LWP 24533)'
(gdb) thread 7
[Switching to thread 7 (Thread 0x7f8ae2169740 (LWP 24533))]
#0 0x00007f8ae1d40017 in pthread_join (threadid=140234458433280, thread_return=0x0) at pthread_join.c:90
90 lll_wait_tid (pd->tid);
(gdb) bt
#0 0x00007f8ae1d40017 in pthread_join (threadid=140234458433280, thread_return=0x0) at pthread_join.c:90
#1 0x00007f8ae1ae40f7 in __gthread_join (__value_ptr=0x0, __threadid=<optimized out>)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/x86_64-redhat-linux/bits/gthr-default.h:668
#2 std::thread::join (this=this@entry=0x5595aac42990) at ../../../../../libstdc++-v3/src/c++11/thread.cc:107
#3 0x00005595a9681468 in operator() (t=..., __closure=<optimized out>) at segv.cxx:31
#4 for_each<__gnu_cxx::__normal_iterator<std::thread*, std::vector<std::thread> >, ThreadPool::wait()::__lambda1> (__last=..., __first=..., __f=...)
at /usr/include/c++/4.8.2/bits/stl_algo.h:4417
#5 wait (this=0x7ffcac67d860) at segv.cxx:32
#6 main (argc=<optimized out>, argv=<optimized out>) at segv.cxx:75
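The file-name step can be scripted when many cores need to be checked. This is a small Python sketch under the assumption that the core pattern produces names of the form core.<pid>; the helper names `pid_from_core_name` and `gdb_thread_find_args` are hypothetical. It only builds the gdb command line (using gdb's `-batch` and `-ex` options) and does not run it.

```python
import re


def pid_from_core_name(path):
    """Return the PID embedded in a 'core.<pid>' file name, or None."""
    match = re.search(r"(?:^|/)core\.(\d+)$", path)
    return int(match.group(1)) if match else None


def gdb_thread_find_args(executable, core_path):
    """Build a gdb invocation that runs 'thread find <pid>' non-interactively."""
    pid = pid_from_core_name(core_path)
    if pid is None:
        return None  # name does not follow the assumed core.<pid> pattern
    return ["gdb", "-batch", "-ex", f"thread find {pid}", executable, core_path]
```

The returned list can be passed to `subprocess.run` if desired. A `None` result means the PID has to be recovered from the core dump itself.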
If the file name is missing the PID, it can be recovered from the core dump itself. The PID is stored in a note segment (PT_NOTE). Both NT_PRSTATUS and NT_PRPSINFO contain the PID. With multiple threads, an NT_PRSTATUS note exists for each individual thread, including the main thread, and the order is unspecified; NT_PRPSINFO, on the other hand, exists only once.
The definition for Linux x86_64 (pr_pid is our field of interest):
struct elf_prpsinfo
{
    char pr_state; /* numeric process state */
    char pr_sname; /* char for pr_state */
    char pr_zomb; /* zombie */
    char pr_nice; /* nice val */
    unsigned long pr_flag; /* flags */
    __kernel_uid_t pr_uid;
    __kernel_gid_t pr_gid;
    pid_t pr_pid, pr_ppid, pr_pgrp, pr_sid;
    /* Lots missing */
    char pr_fname[16]; /* filename of executable */
    char pr_psargs[ELF_PRARGSZ]; /* initial part of arg list */
};
eu-readelf -n (provided by elfutils) can be used to extract the PID from NT_PRPSINFO:
$ eu-readelf -n core
[...]
CORE 136 PRPSINFO
state: 2, sname: D, zomb: 0, nice: 0, flag: 0x0000000040402504
uid: 0, gid: 0, pid: 24533, ppid: 17322, pgrp: 24533, sid: 17299
^^^^^
fname: segv, psargs: ./segv 2
[...]
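If elfutils is not available, the same field can be read directly. Below is a hedged Python sketch, not a replacement for eu-readelf: it assumes a little-endian ELF64 core from Linux x86_64, where the struct layout shown above places pr_pid at byte offset 24 of the 136-byte NT_PRPSINFO descriptor (4 chars, 4 bytes of padding, an 8-byte pr_flag, then 4-byte pr_uid and pr_gid). The function name `prpsinfo_pid` is made up, and cores with more than 65534 program headers (extended numbering) are not handled.

```python
import struct
import sys

PT_NOTE = 4          # program header type of note segments
NT_PRPSINFO = 3      # note type of the process-wide info record
PR_PID_OFFSET = 24   # assumed offset of pr_pid in the x86_64 elf_prpsinfo


def _read_at(f, offset, size):
    """Read exactly `size` bytes at `offset` from a binary file object."""
    f.seek(offset)
    buf = f.read(size)
    if len(buf) != size:
        raise ValueError("unexpected end of file")
    return buf


def _align4(n):
    """Round up to the 4-byte alignment used for note names and descriptors."""
    return (n + 3) & ~3


def prpsinfo_pid(f):
    """Return pr_pid from the first CORE/NT_PRPSINFO note, or None if absent."""
    ehdr = _read_at(f, 0, 64)
    if ehdr[:4] != b"\x7fELF" or ehdr[4] != 2 or ehdr[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", ehdr, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", ehdr, 54)
    for i in range(e_phnum):
        phdr = _read_at(f, e_phoff + i * e_phentsize, 56)
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", phdr
        )
        if p_type != PT_NOTE:
            continue
        notes = _read_at(f, p_offset, p_filesz)
        pos = 0
        while pos + 12 <= len(notes):
            namesz, descsz, ntype = struct.unpack_from("<III", notes, pos)
            name_off = pos + 12
            desc_off = name_off + _align4(namesz)
            name = notes[name_off:name_off + namesz].rstrip(b"\0")
            if name == b"CORE" and ntype == NT_PRPSINFO:
                (pid,) = struct.unpack_from("<i", notes, desc_off + PR_PID_OFFSET)
                return pid
            pos = desc_off + _align4(descsz)
    return None


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as core_file:
        print(prpsinfo_pid(core_file))
```

Saved under a name of your choice and run with the core file as its only argument, the script prints the PID, which can then be passed to thread find in gdb. Only the headers and the note segment are read, so large cores are not loaded into memory.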