
gcc 4.7 on linux pthreads - nontrivial thread_local workaround using __thread (no boost)

In C++11 you can have a non-trivial object with thread_local storage:

class X { ... };

void f()
{
    thread_local X x = ...;
    ...
}

Unfortunately this feature hasn't been implemented in gcc yet (as of 4.7).

gcc does allow you to have thread local variables but only with trivial types.
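
For illustration, here is a minimal sketch of what __thread does accept, assuming gcc on Linux with pthreads. The names call_count, worker and run_trivial_tls_demo are hypothetical and exist only for this example.

```cpp
#include <pthread.h>

// Trivial type: accepted with __thread. Each thread gets its own
// statically initialised copy.
static __thread int call_count = 0;

// A type with a non-trivial constructor or destructor is expected to be
// rejected by __thread, so a line like the following should not compile:
//     static __thread std::string name;

static void* worker(void* out)
{
    ++call_count;                            // touches only this thread's copy
    ++call_count;
    *static_cast<int*>(out) = call_count;    // report what this thread saw
    return 0;
}

// Returns true if the worker thread and the calling thread each kept
// an independent counter.
bool run_trivial_tls_demo()
{
    const int before = call_count;
    ++call_count;                            // calling thread's copy

    int seen_by_worker = 0;
    pthread_t t;
    if (pthread_create(&t, 0, worker, &seen_by_worker) != 0)
        return false;
    if (pthread_join(t, 0) != 0)
        return false;

    return seen_by_worker == 2 && call_count == before + 1;
}
```

The worker counts to 2 in its own copy while the calling thread's copy advances by only 1, which is the per-thread behaviour the workaround below tries to extend to non-trivial types.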

I am looking for a workaround. Here is what I have so far:

#include <iostream>
#include <new>          // placement new
#include <type_traits>

using namespace std;

class X
{
public:
    X() { cout << "X::X()" << endl; }
    ~X() { cout << "X::~X()" << endl; }
};

typedef aligned_storage<sizeof(X), alignment_of<X>::value>::type XStorage;

inline void placement_delete_x(X* p) { p->~X(); }

void f()
{
        static __thread bool x_allocated = false;
        static __thread XStorage x_storage;

        if (!x_allocated)
        {
                new (&x_storage) X;
                x_allocated = true;

                // TODO: add thread cleanup that
                //     calls placement_delete_x(&x_storage)
        }

        X& x = *((X*) &x_storage);
}

int main()
{
        f();
}

What I need help with is calling placement_delete_x(&x_storage) when the current thread exits. Is there a mechanism in pthreads and/or Linux I can use to do this? Would I need to add a function pointer and a parameter to some sort of pthread cleanup stack?

Update:

I think pthread_cleanup_push might be what I want:

http://www.kernel.org/doc/man-pages/online/pages/man3/pthread_cleanup_push.3.html

Will this call the cleanup handler in the correct circumstances for this usage?

Update 2:

It looks like boost::thread_specific_ptr eventually calls pthread_key_create with the destructor parameter, rather than pthread_cleanup_push, to run its TLS cleanup function:

http://pubs.opengroup.org/onlinepubs/009696799/functions/pthread_key_create.html

It is unclear what the difference between these two methods is, if any.

Andrew Tomazos asked Aug 21 '12 06:08


2 Answers

pthread_key_create and friends are what you'd want to implement thread-specific variables of types with destructors. However, these generally require you to manage the whole process of creating and destroying the variables, and I'm not sure whether you could use them in conjunction with __thread.
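
As a rough sketch of what managing the whole process looks like, the following keeps a heap-allocated object per thread behind a pthread key. The names Counter, thread_counter, destroy_counter, destroyed and run_key_demo are hypothetical, and error checking is left out for brevity.

```cpp
#include <pthread.h>

struct Counter
{
    int value;
    Counter() : value(0) {}
};

static pthread_key_t counter_key;
static pthread_once_t counter_once = PTHREAD_ONCE_INIT;
static int destroyed = 0;   // hypothetical; only the demo's single worker touches it

// Called by pthreads at thread exit with that thread's non-null key value.
static void destroy_counter(void* p)
{
    delete static_cast<Counter*>(p);
    ++destroyed;
}

static void make_key()
{
    pthread_key_create(&counter_key, destroy_counter);   // error check omitted
}

// Returns the calling thread's Counter, creating it on first use.
Counter& thread_counter()
{
    pthread_once(&counter_once, make_key);
    Counter* c = static_cast<Counter*>(pthread_getspecific(counter_key));
    if (c == 0)
    {
        c = new Counter;
        pthread_setspecific(counter_key, c);
    }
    return *c;
}

static void* worker(void* out)
{
    ++thread_counter().value;
    ++thread_counter().value;                 // same object as the first call
    *static_cast<int*>(out) = thread_counter().value;
    return 0;
}

// Returns true if the worker saw one persistent object and it was
// destroyed exactly once when the worker exited.
bool run_key_demo()
{
    int seen = 0;
    pthread_t t;
    if (pthread_create(&t, 0, worker, &seen) != 0)
        return false;
    pthread_join(t, 0);
    return seen == 2 && destroyed == 1;
}
```

Unlike a __thread variable, the object here lives on the heap, so creation, lookup and destruction are all explicit.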

pthread_cleanup_push is not suitable. It's intended to allow a resource to be released if the thread exits during a (short) block of code that uses that resource; as described in the documentation you link to, it must be matched by a pthread_cleanup_pop at the same level of that function, and the handler won't be called if the thread returns from its main function. That means that you can't use it if you want the thread-local variable to persist between calls to the function.
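
To make the scoping concrete, here is a minimal sketch of the intended use of pthread_cleanup_push, assuming glibc on Linux. The names release_buffer, handler_runs, worker and run_cleanup_demo are hypothetical.

```cpp
#include <pthread.h>
#include <cstdlib>

// Hypothetical counter so the effect of the handler is observable.
static int handler_runs = 0;

static void release_buffer(void* p)
{
    std::free(p);
    ++handler_runs;
}

static void* worker(void*)
{
    void* buf = std::malloc(64);
    if (buf == 0)
        return 0;

    pthread_cleanup_push(release_buffer, buf);
    // ... use buf; if the thread is cancelled or calls pthread_exit
    // in here, release_buffer(buf) runs ...
    pthread_cleanup_pop(1);   // nonzero: pop the handler and run it now

    // The handler is no longer registered at this point. Nothing pushed
    // inside the block can outlive the block.
    return 0;
}

// Returns how many times the handler ran, or -1 if the thread
// could not be created.
int run_cleanup_demo()
{
    pthread_t t;
    if (pthread_create(&t, 0, worker, 0) != 0)
        return -1;
    pthread_join(t, 0);
    return handler_runs;
}
```

Because the push and the pop must sit in the same lexical block, the handler's lifetime is tied to that block rather than to the thread.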

For the benefit of those who don't have a prohibition against third-party libraries, Boost provides a convenient, portable way to manage thread-local storage.

Mike Seymour answered Nov 16 '22 19:11


As Mike says, pthread_cleanup_push is not appropriate. The correct way is to use pthread_key_create.

I've implemented a small demo program to show how to do it. It implements a macro named thread_local, used as shown below.

With the real C++11 feature it would be:

void f()
{
    thread_local X x(1,2,3);
    ...
}

With the macro it is:

void f()
{
    thread_local (X, x, 1, 2, 3);
    ...
}

The difference between this and boost::thread_specific_ptr is that there is no dynamic memory allocation: everything is stored with __thread duration. It is also significantly lighter-weight, but it is specific to gcc on Linux.

Overview:

  1. We use std::aligned_storage to reserve __thread-duration space for the variable.
  2. On first entry from a given thread, we use placement new to construct the variable in that storage.
  3. We also allocate, with __thread duration, a linked-list entry that records the placement-delete call.
  4. We use pthread_setspecific to keep track of each thread's list head.
  5. The function passed to pthread_key_create walks the list, calling the placement deletes when the thread exits.

...

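#include <iostream>
#include <new>          // placement new
#include <thread>
#include <type_traits>  // aligned_storage, alignment_of

#include <pthread.h>    // pthread_key_create, pthread_once, ...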

using namespace std;

static pthread_key_t key;
static pthread_once_t once_control = PTHREAD_ONCE_INIT;

struct destructor_list
{
    void (*destructor)(void*);
    void* param;
    destructor_list* next;
};

static void execute_destructor_list(void* v)
{
    for (destructor_list* p = (destructor_list*) v; p != 0; p = p->next)
        p->destructor(p->param);
}

static void create_key()
{
    pthread_key_create(&key, execute_destructor_list);
}

void add_destructor(destructor_list* p)
{
    pthread_once(&once_control, create_key);

    p->next = (destructor_list*) pthread_getspecific(key);
    pthread_setspecific(key, p);
}

template<class T> static void placement_delete(void* t) { ((T*)t)->~T(); }

#define thread_local(T, t, ...)                         \
T& t = *((T*)                                           \
({                                                      \
    typedef typename aligned_storage<sizeof(T),         \
        alignment_of<T>::value>::type Storage;          \
    static __thread bool allocated = false;             \
    static __thread Storage storage;                    \
    static __thread destructor_list dlist;              \
                                                        \
    if (!allocated)                                     \
    {                                                   \
        new (&storage) T(__VA_ARGS__);                  \
        allocated = true;                               \
        dlist.destructor = placement_delete<T>;         \
        dlist.param = &storage;                         \
        add_destructor(&dlist);                         \
    }                                                   \
                                                        \
    &storage;                                           \
}))

class X
{
public:
    int i;

    X(int i_in) : i(i_in) { cout << "X::X()" << endl; }

    void f() { cout << "X::f()" << endl; }

    ~X() { cout << "X::~X() i = " << i << endl; }
};

void g()
{
    thread_local(X, x, 1234);
    x.f();
}

int main()
{
    thread t(g);
    t.join();
}

Notes:

  1. You need to add error checking to each pthread_* call. I removed it for exposition.
  2. It uses __thread, which is a GNU extension.
  3. It uses a statement expression to keep the auxiliary __thread variable names out of the parent scope. This is also a GNU extension.
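
One possible shape for the error checking mentioned in note 1 is sketched below. It relies on the fact that pthread_* functions return an error number instead of setting errno. The helpers check_pthread, error_code_of and key_round_trip are hypothetical names, not part of the demo above.

```cpp
#include <pthread.h>
#include <cerrno>
#include <system_error>

// Throws std::system_error if a pthread_* call reported failure.
inline void check_pthread(int rc, const char* what)
{
    if (rc != 0)
        throw std::system_error(rc, std::generic_category(), what);
}

// Returns the error code carried by the exception, or 0 if nothing
// was thrown. Only here to show how a failure surfaces.
int error_code_of(int rc)
{
    try
    {
        check_pthread(rc, "example");
    }
    catch (const std::system_error& e)
    {
        return e.code().value();
    }
    return 0;
}

// Creates a key, stores a value, reads it back and deletes the key,
// checking every call along the way.
bool key_round_trip()
{
    pthread_key_t k;
    check_pthread(pthread_key_create(&k, 0), "pthread_key_create");
    check_pthread(pthread_setspecific(k, &k), "pthread_setspecific");
    const bool ok = pthread_getspecific(k) == &k;
    check_pthread(pthread_key_delete(k), "pthread_key_delete");
    return ok;
}
```

In the demo, calls such as pthread_key_create and pthread_setspecific could be wrapped the same way. Whether to throw, abort or log on failure is a design choice left open here.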
Andrew Tomazos answered Nov 16 '22 20:11