Memory leak when using shared library with thread local storage via ctypes in a python program

I am using the ctypes module in Python to load a shared C library that contains thread-local storage. It is a fairly large C library with a long history, which we are trying to make thread safe. The library contains lots of global variables and statics, so our initial strategy for thread safety has been to use thread-local storage. We want our library to be platform independent, and have been compiling and testing thread safety on win32, win64 and 64-bit Ubuntu. From a pure C process there don't seem to be any problems.

However, in Python (2.6 and 2.7) on win32 and on Ubuntu we are seeing memory leaks. It seems that the thread-local storage is not released properly when a Python thread terminates, or at least that the Python process is somehow not "aware" that the memory has been freed. The same problem is actually also seen in a C# program on win32, but it is not present on our win64 server test machine (also running Python 2.7).

The problem can be reproduced with a simple toy example like this:

Create a C file containing the following (on Linux/Unix, remove __declspec(dllexport)):

#include <stdio.h>
#include <stdlib.h>
/* Each thread gets its own copy of the static __thread variables below. */
void __declspec(dllexport) Leaker(int tid){
    static __thread double leaky[1024];
    static __thread int init=0;
    int i;
    if (!init){
        /* First call from this thread: fill its private array. */
        printf("Thread %d initializing.\n", tid);
        for (i=0;i<1024;i++) leaky[i]=i;
        init=1;
    }
    else
        printf("This is thread: %d\n",tid);
    return;
}

Compile with MinGW on Windows or gcc on Linux, like this:

gcc -o leaky.dll (or leaky.so) -shared the_file.c

On Windows we could have compiled with Visual Studio, replacing __thread with __declspec(thread). However, on win32 (up to Windows XP, I believe) this does not work if the library is to be loaded at runtime with LoadLibrary.

Now create a Python program like:

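import threading, ctypes, sys, time
NRUNS=1000      # total number of threads to start
KEEP_ALIVE=5    # upper bound on threading.activeCount(), main thread included
REPEAT=2        # calls to Leaker() per thread
lib=ctypes.cdll.LoadLibrary("leaky.dll")  # on Linux: "./leaky.so"
lib.Leaker.argtypes=[ctypes.c_int]
lib.Leaker.restype=None
def UseLibrary(tid,repetitions):
    # Thread body: call into the library a few times, then exit.
    for i in range(repetitions):
        lib.Leaker(tid)
        time.sleep(0.5)
def main():
    finished_threads=0
    # Start a new thread whenever fewer than KEEP_ALIVE threads are alive.
    while finished_threads<NRUNS:
        if threading.activeCount()<KEEP_ALIVE:
            finished_threads+=1
            thread=threading.Thread(target=UseLibrary,args=(finished_threads,REPEAT))
            thread.start()
    # Wait until only the main thread is left.
    while threading.activeCount()>1:
        print("Active threads: %i" %threading.activeCount())
        time.sleep(2)
    return
if __name__=="__main__":
    sys.exit(main())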

That is enough to reproduce the error. Explicitly importing the garbage collector and calling gc.collect() when starting each new thread does not help.
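
One way to put numbers on the growth is to sample the process's peak resident set size between batches of short-lived threads. The following is only a minimal sketch, not part of the original test program: the helper names peak_rss and sample_growth are made up for illustration, and it assumes a Unix-like system, since the resource module is not available on Windows.

```python
import resource
import threading

def peak_rss():
    # Peak resident set size of this process so far.
    # The unit is platform dependent (kilobytes on Linux, bytes on macOS).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def sample_growth(worker, batches=5, threads_per_batch=10):
    """Run batches of short-lived threads; record peak RSS after each batch."""
    samples = [peak_rss()]
    for batch in range(batches):
        threads = [
            threading.Thread(target=worker,
                             args=(batch * threads_per_batch + i,))
            for i in range(threads_per_batch)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        samples.append(peak_rss())
    return samples
```

Calling sample_growth(lib.Leaker) with the lib object from the program above would give one sample per batch. Because ru_maxrss is a high-water mark it never decreases, so a plateau suggests memory is being reused, while a steady climb across many batches is consistent with a leak.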

For a while I thought the problem had to do with incompatible runtimes (Python compiled with Visual Studio, my library with MinGW). But the problem also occurs on Ubuntu, and it does not occur on a win64 server, even when the library is cross-compiled with MinGW.

Hope that anyone can help!

Cheers, Simon Kokkendorff, National Survey and Cadastre of Denmark.

user1037171 asked Nov 10 '11 at 08:11



2 Answers

This seems not to be ctypes' or Python's fault at all. I can reproduce the same leak, leaking at the same rate, by writing only C code.

Strangely, at least on Ubuntu Linux 64, the leak occurs if the Leaker() function with the __thread variables is compiled as an .so and called from a program with dlopen(). It does not occur when running exactly the same code but with both parts compiled together as a regular C program.

I suspect that the fault is some interaction between dynamically linked libraries and thread-local storage. Still, it looks like a rather bad bug (is it really undocumented?).

Armin Rigo answered Oct 17 '22 at 18:10



My guess is that not joining with the threads is the problem. From the man page for pthread_join:

Failure to join with a thread that is joinable (i.e., one that is not detached), produces a "zombie thread". Avoid doing this, since each zombie thread consumes some system resources, and when enough zombie threads have accumulated, it will no longer be possible to create new threads (or processes).

If you modify your loop to collect the thread objects and use .isAlive() and .join() on them in that last while loop, I think it should take care of your memory leak.
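
The suggestion could be sketched roughly as follows. This is an illustration under stated assumptions, not a confirmed fix: run_joined and fake_worker are hypothetical names, fake_worker stands in for UseLibrary() so the sketch runs without the shared library, and is_alive() is used as the newer spelling of .isAlive(). Note also that keep_alive here counts worker threads only, whereas KEEP_ALIVE in the question includes the main thread.

```python
import threading
import time

def run_joined(worker, nruns=20, keep_alive=5, repeat=2):
    """Start nruns threads, at most keep_alive at once, and join every one."""
    live = []      # threads started but not yet joined
    joined = 0
    for tid in range(1, nruns + 1):
        # Wait for a free slot, joining finished threads as we go.
        while len(live) >= keep_alive:
            still_running = []
            for t in live:
                if t.is_alive():
                    still_running.append(t)
                else:
                    t.join()
                    joined += 1
            live = still_running
            if len(live) >= keep_alive:
                time.sleep(0.01)
        t = threading.Thread(target=worker, args=(tid, repeat))
        t.start()
        live.append(t)
    # Join whatever is still running at the end.
    for t in live:
        t.join()
        joined += 1
    return joined

def fake_worker(tid, repetitions):
    # Stand-in for UseLibrary(); the real version would call lib.Leaker(tid).
    for _ in range(repetitions):
        time.sleep(0.001)
```

Passing UseLibrary from the question as the worker would exercise the real library. Whether joining actually removes the leak is not established here; the other answer reports the same leak from C code alone.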

David K. Hess answered Oct 17 '22 at 18:10
