
Find source of zombie threads

I have an application which I'm pretty sure 'leaks' threads by forgetting to call pthread_join on them. As a result, their stacks are never freed and the process consumes a huge amount of virtual address space over time.

Is there a way to find the place in the code that creates those threads, or at least to find out what those threads were doing before they exited?

My application is big and creates a lot of threads that are correctly joined, so catching all pthread operations is impractical. I need something more precise.

I was able to come up with an isolated reproducer of what I think is happening.

#include <pthread.h>
#include <unistd.h>

void* worker(void* unused)
{
    // Do nothing
    return NULL;
}

int main()
{
    pthread_t thread_id;

    for(int i=0; i < 2000; i++)
    {
        pthread_create(&thread_id, NULL, &worker, NULL);
    }
    sleep(1000);
    return 0;
}

After running it, 'top' shows that the process consumes 16GB of virtual address space.


But 'ps' and 'gdb' show only one thread.
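
For scale, 16GB is about what 2000 unjoined threads would hold if each kept a default-sized stack mapped. Below is a rough sketch of that arithmetic, assuming an 8 MiB stack per thread (a common Linux default, reported by ulimit -s; the actual stack size on this machine is not stated). The names kAssumedStackBytes, leaked_stack_bytes and to_gib are illustrative only.

```cpp
#include <cstdint>

// Assumption: each leaked thread keeps an 8 MiB stack mapped. The real value
// depends on the system's stack limit and on any pthread attributes in use.
constexpr std::uint64_t kAssumedStackBytes = 8ULL * 1024 * 1024;

// Address space retained by threads that exited without being joined.
constexpr std::uint64_t leaked_stack_bytes(
    std::uint64_t unjoined_threads,
    std::uint64_t stack_bytes = kAssumedStackBytes)
{
  return unjoined_threads * stack_bytes;
}

// Convert a byte count to GiB for comparison with the VIRT column of top.
constexpr double to_gib(std::uint64_t bytes)
{
  return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}

// 2000 threads * 8 MiB = 16000 MiB, i.e. 15.625 GiB.
static_assert(leaked_stack_bytes(2000) == 16777216000ULL, "2000 * 8 MiB");
```

Under that assumption the estimate lands close to the figure reported by top, which fits the theory that the exited threads' stacks are still mapped.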


I have the source for everything in my application, so I can add any code or other instrumentation needed.

In other words, given a running instance of the above application, how can I find out that it has 2000 'lost' threads, and that they executed the worker() function?

Dennis asked Mar 15 '23 05:03


1 Answer

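Good question. One possible answer is to use a libpthread interposer: a shared library, loaded with LD_PRELOAD, that defines its own pthread_create and pthread_join, records what it sees, and forwards each call to the real function.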

Let's make your test program a bit more interesting, so it "leaks" only a few threads, and joins most of them:

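#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

void* worker(void* unused)
{
  // Do nothing
  return NULL;
}

int main()
{
  pthread_t thread_id;

  for (int i = 0; i < 10; i++) {
    // Pass the loop index as the thread argument so leaked threads are identifiable.
    pthread_create(&thread_id, NULL, &worker, (void*)(intptr_t)i);
    // Deliberately skip the join for threads 4 and 7.
    if (i != 4 && i != 7) pthread_join(thread_id, NULL);
  }
  sleep(1000);
  return 0;
}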

Now let's build an interposer for pthread_create and pthread_join:

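#include <assert.h>
#include <dlfcn.h>
#include <pthread.h>
#include <map>
#include <utility>

// Guards thr_map.
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
// (start routine, argument) for every thread created but not yet joined.
typedef std::pair<void *, void *> elem_t;
typedef std::map<pthread_t, elem_t> map_t;
static map_t thr_map;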

extern "C"
int pthread_create(pthread_t *tid, const pthread_attr_t *attr,
                   void *(*start_routine)(void*), void *arg)
{
  static __decltype(pthread_create) *real
    = reinterpret_cast<__decltype(pthread_create) *>(dlsym(RTLD_NEXT,
                                                           "pthread_create"));
  int rc = (*real)(tid, attr, start_routine, arg);
  if (rc == 0) {
    pthread_mutex_lock(&mtx);
    thr_map[*tid] = std::make_pair((void*)start_routine, arg);
    pthread_mutex_unlock(&mtx);
  }
  return rc;
}

extern "C"
int pthread_join(pthread_t tid, void **arg)
{
  static __decltype(pthread_join) *real
    = reinterpret_cast<__decltype(pthread_join) *>(dlsym(RTLD_NEXT,
                                                         "pthread_join"));
  int rc = (*real)(tid, arg);
  if (rc == 0) {
    pthread_mutex_lock(&mtx);
    const auto it = thr_map.find(tid);
    assert(it != thr_map.end());
    thr_map.erase(it);
    pthread_mutex_unlock(&mtx);
  }
  return rc;
}

Build it:

g++ -g -fPIC -shared -std=c++11 -o thr.so thr.cc -ldl

and use it:

LD_PRELOAD=./thr.so ./a.out &
[1] 37057

gdb -q -p 37057

Attaching to process 37057
Reading symbols from /tmp/a.out...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".
0x00007f95831a2f3d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
81  ../sysdeps/unix/syscall-template.S: No such file or directory.

(gdb) set print pretty
(gdb) p thr_map
$1 = std::map with 2 elements = {
  [140280106567424] = {
    first = 0x40069d <worker(void*)>,
    second = 0x7
  },
  [140280114960128] = {
    first = 0x40069d <worker(void*)>,
    second = 0x4
  }
}

Voilà: you now know which threads have not been joined, which routine(s) they were invoked with, and what argument was given to them.
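
If attaching gdb is inconvenient, the same information could be printed by the interposer itself. Here is a minimal sketch, assuming the elem_t and map_t typedefs from thr.cc above; describe_unjoined is a hypothetical helper that is not part of the original interposer, and it assumes pthread_t converts to unsigned long, which holds on Linux but is not guaranteed by POSIX.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>
#include <utility>

// Same shapes as in thr.cc: (start routine, argument) keyed by thread id.
typedef std::pair<void *, void *> elem_t;
typedef std::map<pthread_t, elem_t> map_t;

// Hypothetical helper: formats one line per thread still present in the map.
// The caller is expected to hold the mutex that guards the map.
std::string describe_unjoined(const map_t &m)
{
  std::string out;
  for (map_t::const_iterator it = m.begin(); it != m.end(); ++it) {
    char line[128];
    // Assumption: pthread_t is an integer type (true on Linux).
    const int n = std::snprintf(
        line, sizeof line, "tid=%lu start=0x%lx arg=0x%lx\n",
        static_cast<unsigned long>(it->first),
        static_cast<unsigned long>(
            reinterpret_cast<std::uintptr_t>(it->second.first)),
        static_cast<unsigned long>(
            reinterpret_cast<std::uintptr_t>(it->second.second)));
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof line);
    out += line;
  }
  return out;
}
```

In thr.cc this could be called with thr_map while mtx is held, and the result written to stderr; the start addresses can then be resolved to function names with a symbolizer such as addr2line. One caveat: threads that are detached are never joined, so they would stay in the map as false positives unless pthread_detach is interposed in the same way to erase their entries.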

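EDIT

"My application is linked statically"

In that case, the linker options --wrap=pthread_create and --wrap=pthread_join are your friends. With --wrap=pthread_create, calls to pthread_create resolve to __wrap_pthread_create, and __real_pthread_create refers to the original function, so the same bookkeeping can live in the __wrap_ functions. When linking through the compiler driver, pass them as -Wl,--wrap=pthread_create -Wl,--wrap=pthread_join. See the --wrap option in the GNU ld documentation.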

Employed Russian answered Mar 23 '23 21:03