
Is it possible to attach gdb to a crashed process (a.k.a. "just-in-time" debugging)?

When a process crashes, I want to be able to invoke gdb (or a similar debugger) against it in that crashed-but-not-cleaned-up state. Often post-morteming a core dump gives enough information, but sometimes I want to explore the running state further, possibly suppressing the immediate fault and running a little further. It isn't always appropriate to run the process under gdb from the outset (e.g. where the invocation is complex or the bug is absurdly timing-sensitive).

What I'm describing is basically the just-in-time debugging facility that is exposed on MS Windows through the "AEDebug" registry key: leaving the faulting thread suspended while doing something diagnostic. On non-developer Windows PCs this is commonly set to a crash diagnostic mechanism (formerly "Dr Watson"), for which the Ubuntu equivalent seems to be "apport".

I did find an old mail thread (2007) which refers to this question "popping up every now and then", so possibly the facility exists but is described in a way that eludes my searches?

Tom Goodfellow asked Mar 18 '14 14:03


3 Answers

I don't know if such a feature exists, but as a hack, you could LD_PRELOAD something that adds a handler on SIGSEGV that calls gdb:

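cat > handler.c << 'EOF'
#include <stdlib.h>
#include <signal.h>

/* SIGSEGV handler: open gdb in an xterm, attached to this process.
 * system() runs the command through sh, so $PPID there is our own PID. */
void gdb(int sig) {
  system("exec xterm -e gdb -p \"$PPID\"");
  abort();
}

/* Runs when the library is loaded (hence -nostartfiles below). */
void _init() {
  signal(SIGSEGV, gdb);
}
EOF
gcc -g -fpic -shared -o handler.so -nostartfiles handler.c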

And then run your applications with:

LD_PRELOAD=/path/to/handler.so your-application

Then, upon a SEGV, it will run gdb in an xterm. If you run bt there, you'll see something like:

(gdb) bt
#0  0x00007f8c58152cac in __libc_waitpid (pid=8294,
    stat_loc=stat_loc@entry=0x7fffd6170e40, options=options@entry=0)
    at ../sysdeps/unix/sysv/linux/waitpid.c:31
#1  0x00007f8c580df01b in do_system (line=<optimized out>)
    at ../sysdeps/posix/system.c:148
#2  0x00007f8c58445427 in gdb (sig=11) at ld.c:4
#3  <signal handler called>
#4  strlen () at ../sysdeps/x86_64/strlen.S:106
#5  0x00007f8c5810761c in _IO_puts (str=0x0) at ioputs.c:36
#6  0x000000000040051f in main (argc=1, argv=0x7fffd6171598) at a.c:2

Instead of running gdb from the handler, you could also have the process suspend itself (kill(getpid(), SIGSTOP)) or call pause(), and then attach gdb yourself at your leisure.
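A minimal sketch of that self-suspending variant, assuming Linux with GCC or Clang. The names format_pid, say, stop_for_debugger and install_stop_handler are made up for illustration, and the prctl() call is only an assumption about what is needed on systems that restrict ptrace:

```c
#include <signal.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Hypothetical helper: write a non-negative pid in decimal into buf without
 * touching printf or the heap. Returns the digit count, or 0 if buf is too small. */
static size_t format_pid(long pid, char *buf, size_t len) {
  char tmp[24];
  size_t n = 0;
  do {
    tmp[n++] = (char)('0' + pid % 10);
    pid /= 10;
  } while (pid > 0 && n < sizeof tmp);
  if (n + 1 > len)
    return 0;
  for (size_t i = 0; i < n; i++)
    buf[i] = tmp[n - 1 - i];   /* digits were produced in reverse order */
  buf[n] = '\0';
  return n;
}

/* write() wrapper that ignores errors: nothing useful can be done about them here. */
static void say(const char *s, size_t n) {
  if (write(STDERR_FILENO, s, n) < 0) { /* ignore */ }
}

/* Print the PID, then stop so that gdb can be attached by hand. */
static void stop_for_debugger(int sig) {
  static const char msg[] = "crashed, attach with: gdb -p ";
  char pid[24];
  size_t n = format_pid((long)getpid(), pid, sizeof pid);
  (void)sig;
  say(msg, sizeof msg - 1);
  say(pid, n);
  say("\n", 1);
  /* Assumption: only needed where ptrace attach is restricted (Yama). */
  prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0, 0, 0);
  raise(SIGSTOP);   /* sleep here until a debugger attaches or SIGCONT arrives */
}

/* GCC/Clang extension, used instead of _init so -nostartfiles is not required. */
__attribute__((constructor))
static void install_stop_handler(void) {
  signal(SIGSEGV, stop_for_debugger);
}
```

Built as a shared object and preloaded the same way as handler.so, this should leave the crashed process stopped inside the handler; running gdb -p with the printed PID from another terminal and then bt should show the faulting frames below the signal handler frame.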

That approach won't work if the application installs its own SIGSEGV handler, or if it is setuid/setgid, since the dynamic linker restricts LD_PRELOAD for such executables.

That's the approach used by @yugr for his libdebugme tool, which you could use here as:

DEBUGME_OPTIONS='xterm:handle_signals=1' \
  LD_PRELOAD=/path/to/libdebugme.so your-application
Stephane Chazelas answered Oct 11 '22 14:10


Answering my own question to include the fleshed-out code I derived from the true answer (@Stephane Chazelas above). Only real changes to the original answer are:

  1. setting PR_SET_PTRACER_ANY to allow gdb to attach
  2. a little more (futile?) trying to avoid libc code in the hopes of still working for (some) heap corruptions
  3. included SIGABRT because some of the crashes are assert()s

I've been using it with Linux Mint 16 (kernel 3.11.0-12-generic).

/* LD_PRELOAD library which launches gdb "just-in-time" in response to a process SIGSEGV-ing
 * Compile with:
 *
 * gcc -g -fpic -shared -nostartfiles -o jitdbg.so jitdbg.c
 * 
 * then put in LD_PRELOAD before running process, e.g.:
 * 
 * LD_PRELOAD=~/scripts/jitdbg.so defective_executable
 */

#include <unistd.h>
#include <signal.h>
#include <sys/prctl.h>


void gdb(int sig) {
  if(sig == SIGSEGV || sig == SIGABRT)
    {
      pid_t cpid = fork();
      if(cpid == -1)
        return;   // fork failed, we can't help, hope core dumps are enabled...
      else if(cpid != 0)
        {
          // Parent
          prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0, 0, 0);  // allow any process to ptrace us
          raise(SIGSTOP);  // wait for child's gdb invocation to pick us up
        }
      else
        {
          // Child - now try to exec gdb in our place attached to the parent

          // Avoiding using libc since that may already have been stomped, so building the
          // gdb args the hard way ("gdb dummy PID"), first copy
          char cmd[100];
          const char* stem = "gdb _dummy_process_name_                   ";  // 18 trailing spaces to allow for a 64 bit proc id
          const char*s = stem;
          char* d = cmd; 
          while(*s)
            {
            *d++ = *s++;
            }
          *d-- = '\0';
          char* hexppid = d;

          // now backfill the trailing space with the hex parent PID - not
          // using decimal for fear of libc maths helper functions being dragged in
          pid_t ppid = getppid();
          while(ppid)
            {
              *hexppid = ((ppid & 0xF) + '0');
              if(*hexppid > '9')
                *hexppid += 'a' - '0' - 10;
              --hexppid;
              ppid >>= 4;
            }
          *hexppid-- = 'x';   // prefix with 0x
          *hexppid = '0';
          // system() isn't listed as safe under async signals, nor is execlp, 
          // or getenv. So ideally we'd already have cached the gdb location, or we
          // hardcode the gdb path, or we accept the risk of re-entrancy/library woes
          // around the environment fetch...
          execlp("mate-terminal", "mate-terminal", "-e", cmd, (char*) NULL);
          _exit(127);  // exec failed: exit here, otherwise the child returns, re-faults and forks again
        }
    }
}

void _init() {
  signal(SIGSEGV, gdb);
  signal(SIGABRT, gdb);
}
Tom Goodfellow answered Oct 11 '22 13:10


If you are able to anticipate that a particular program will crash, you could start it under gdb.

gdb /usr/local/bin/foo
(gdb) run

If the program crashes, gdb will catch it and let you continue to investigate.

If you are not able to predict when and which program will crash, then you could enable core dumps. Note that ulimit is not system-wide: it applies to the current shell and to the processes started from it.

ulimit -c unlimited

Force a core dump of the foo process:

/usr/local/sbin/foo &
kill -11 `pidof foo` # kill -3 (SIGQUIT) will likely also work

A core file should be generated, which you can then load into gdb together with the executable:

gdb `which foo` -c some.core

RedHat systems sometimes require additional configuration besides the ulimit to enable core dumps.

http://www.akadia.com/services/ora_enable_core.html
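If changing the limit in the launching shell is inconvenient, a program can also raise its own core-size limit at startup. This is only a sketch, assuming a POSIX system; enable_core_dumps is a made-up name, and the soft limit can only be raised as far as the hard limit already allows:

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit, which is what
 * "ulimit -c unlimited" does when the hard limit is itself unlimited.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int enable_core_dumps(void) {
  struct rlimit rl;
  if (getrlimit(RLIMIT_CORE, &rl) != 0)
    return -1;
  rl.rlim_cur = rl.rlim_max;   /* soft limit := hard limit */
  return setrlimit(RLIMIT_CORE, &rl);
}
```

Calling enable_core_dumps() early in main() would be the typical use. Where the core file ends up, and whether something like apport intercepts it, still depends on the system's configuration.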

spuder answered Oct 11 '22 12:10