I have searched many times for a way to make backtrace() work in a signal handler and tried almost everything, but I cannot get a usable backtrace from my handler (it is not a SIGUSR1 handler).
I never get the full backtrace from the signal handler: only the addresses of the functions called inside the handler itself are printed.
If I attach to the process with the target gdb binary, using gdb --pid, I get the full backtrace properly.
I also tried pstack (pstack-1.2, with the ARM patch, which was horrible: nothing was printed), but it was not very helpful.
Any advice?
1) Compiler options in Makefile
CFLAGS += -g -fexceptions -funwind-tables -Werror $(WARN) ...
2) Code
The code is extremely simple.
#define CALLSTACK_SIZE 10
static void print_stack(void) {
int i, nptrs;
void *buf[CALLSTACK_SIZE + 1];
char **strings;
nptrs = backtrace(buf, CALLSTACK_SIZE);
printf("%s: backtrace() returned %d addresses\n", __func__, nptrs);
strings = backtrace_symbols(buf, nptrs);
if(strings == NULL) {
printf("%s: no backtrace captured\n", __func__);
return;
}
for(i = 0; i < nptrs; i++) {
printf("%s\n", strings[i]);
}
free(strings);
}
...
static void sigHandler(int signum)
{
printf("%s: signal %d\n", __FUNCTION__, signum);
switch(signum ) {
case SIGUSR2:
// told to quit
print_stack();
break;
default:
break;
}
}
Read signal(7) and signal-safety(7) carefully.
A signal handler is restricted to calling (directly or indirectly) only async-signal-safe functions (practically speaking, most syscalls(2) only), and backtrace(3), printf(3), malloc(3) and free are not async-signal-safe. So your code is incorrect: the signal handler sigHandler calls printf directly and, through print_stack, backtrace and free, and none of these is async-signal-safe.
So your only option is to use the gdb debugger.
Read more about POSIX signal.h and signal concepts. Practically speaking, nearly the only sensible thing a signal handler can do is set some global, thread-local, or static volatile sig_atomic_t flag, which has to be tested elsewhere. It could also directly write(2) a few bytes into a pipe(7) that your application reads elsewhere (e.g. in its event loop, if it is a GUI application).
You could also use Ian Taylor's libbacktrace from inside GCC (assuming your program is compiled with debug info, e.g. with -g). It is not guaranteed to work in signal handlers (since it does not use only async-signal-safe functions), but in practice it is quite useful.
Notice that the kernel sets up a call frame (on the call stack) for sigreturn(2) when it delivers a signal.
You might also use (especially if your application is single-threaded) sigaltstack(2) to have an alternate signal stack. I'm not sure it would be helpful.
If you have an event loop, you might consider using the Linux-specific signalfd(2) and asking your event loop to poll it. For SIGTERM, SIGQUIT or SIGALRM it is quite a useful trick.
I would like to add something to @Basile Starynkevitch's answer, which is overly pedantic. While it's true that your signal handler isn't async-signal-safe, there's a good chance it will often work on Linux, so if you are seeing results being printed out, that isn't what's causing your issue of not seeing relevant stack information.
Some more likely problems include:
Incorrect compiler flags for your platform. Backtraces often work fine on x86 without special flags, but ARM can be more finicky. There are a few that I've tried that I can't remember, but the most important ones to try are -fno-omit-frame-pointer and -fasynchronous-unwind-tables.
The code that's crashing was called through code that wasn't compiled with the correct flags for getting stack traces. For example, stack traces that originate in code that calls back from a .so that wasn't compiled with the correct compiler flags will often result in duplicate or truncated backtraces.
The signal that you are getting the backtrace for is not a thread-directed signal, but a process-directed one. Practically speaking, a thread-directed signal is one like SIGSEGV when the thread crashes, or one that another thread sends to a specific thread with something like pthread_kill. See man 7 signal for more information.
With that out of the way, I would like to address what you can do in your signal handler to get backtraces. It is true that you shouldn't be calling any stdio functions, malloc(), free(), etc., but it is not true that you can't call backtrace with a sane version of glibc/libgcc. From here, you can see that backtrace_symbols_fd is currently async-signal-safe. You can also see that backtrace is not. It looks very unsafe. However, man 3 backtrace tells us why these restrictions apply:
backtrace_symbols_fd() does not call malloc(3), and so can be employed in situations where the latter function might fail, but see NOTES.
Later:
backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand.
A quick look at the source for backtrace confirms that the unsafe parts involve dynamically loading libgcc. You could get around this by statically linking both glibc and libgcc, but the most robust way of doing it is to make sure that libgcc is loaded before any signals are generated.
The way I do this is by calling backtrace once during program startup. Note that you must ask for at least one symbol, or the function early-outs without loading libgcc. Something like this would work:
// On linux, especially on ARM, you want to use the sigaction version of this call.
// See my comments below.
static void
handle_signal(int sig)
{
// Check signal type or whatever you want to do.
// ...
void* symbols[100];
int n = backtrace(symbols, 100);
// You could also either call a string formatting routine that you know
// is async-signal-safe or save your backtrace and let another thread know
// that this thread has crashed and the backtrace needs to be printed.
//
write(STDERR_FILENO, "Crash:\n", 7);
backtrace_symbols_fd(symbols, n, STDERR_FILENO);
// In the case of notifying another thread, which is what I do, you would
// do something like this:
//
// threadLocalSymbolCount = backtrace(threadLocalSymbols, 100);
// sem_post() or write() to an eventfd or whatever.
}
int main(int argc, char** argv)
{
void* dummy = NULL;
backtrace(&dummy, 1);
// Setup custom signal handling
// ...
function_that_crashes();
return 0;
}
EDIT: The OP mentions that they are using uclibc instead of glibc, but the same arguments apply, since it also loads libgcc dynamically to get backtraces. An interesting point is that the source for uclibc's backtrace mentions that -fasynchronous-unwind-tables is necessary.
NOTE: I was planning on writing a full working code example, but I remembered that you have to use the sigaction version of signal handling and do something special to get stack traces on ARM. I have code that does it at work, and I will edit this post once I have it.