
pthreads: how to assert code is run in a single threaded context

I am writing a C library which needs to fork() during initialization. Therefore, I want to assert() that the application code (which is outside of my control) calls my library initialization code from a single-threaded context (to avoid the well-known "threads and fork don't mix" problem). Once my library has been initialized, it is thread-safe (and the application-level code is expected to create threads). I am only concerned with supporting pthreads.

It seems impossible to count the number of threads in the current process space using pthreads. Indeed, even googletest only implements GetThreadCount() on Mac OS and QNX.

Given that I can't count the threads, is it somehow possible that I can instead assert a single threaded context?

Clarification: If possible, I would like to avoid using "/proc" (non-portable), an additional library dependency (like libproc) and LD_PRELOAD-style pthread_create wrappers.

Clarification #2: In my case, using multiple processes is necessary because the workers in my library are relatively heavyweight (they use WebKit) and might crash. However, I want the original process to survive worker crashes.

ahochhaus asked Dec 20 '12 19:12





1 Answer

You could mark your library initialization function to be run prior to the application main(). For example, using GCC,

static void my_lib_init(void) __attribute__((constructor));

static void my_lib_init(void)
{
    /* ... */
}

Another option is to use posix_spawn() to fork and execute the worker processes as separate, slave binaries.
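As a rough illustration of the posix_spawn() route, here is a minimal sketch. The helper names spawn_worker() and reap_worker() are hypothetical, as is the idea that the worker is a standalone binary identified by a path; only posix_spawn() and waitpid() themselves are standard.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>

extern char **environ;

/* Hypothetical helper: start the worker binary at `path` with the
 * argument vector `argv` (NULL-terminated), inheriting the environment.
 * Returns the child's pid, or -1 with errno set. Depending on the
 * implementation, a failed exec may instead show up later as the
 * child exiting with status 127. */
pid_t spawn_worker(const char *path, char *const argv[])
{
    pid_t pid;
    int   err;

    /* posix_spawn() returns an error number directly; it does not set errno. */
    err = posix_spawn(&pid, path, NULL, NULL, argv, environ);
    if (err) {
        errno = err;
        return -1;
    }
    return pid;
}

/* Hypothetical helper: wait for a worker to finish. Returns its exit
 * status (0..255), or -1 if waiting failed or the worker did not exit
 * normally (for example, it was killed by a signal). */
int reap_worker(pid_t pid)
{
    int   status;
    pid_t p;

    do {
        p = waitpid(pid, &status, 0);
    } while (p == -1 && errno == EINTR);

    if (p == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

Because the worker is a fresh program image, it does not matter how many threads the calling process has at the time of the call; a crashed worker simply makes reap_worker() return -1 in the parent.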

EDITED TO ADD:

It seems to me that if you wish to determine if the process has already created (actual, kernel-based) threads, you will have to rely on OS-specific code.

On Linux, the determination is simple, and the code below is safe to compile and run on other POSIX systems too: if it cannot determine the number of threads used by the current process (for example, because /proc/self/task/ does not exist), the function returns -1:

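#include <unistd.h>
#include <sys/types.h>
#include <dirent.h>
#include <errno.h>

/* Count the threads (tasks) of the current process on Linux.
 * Returns the count, or -1 with errno set if it cannot be determined. */
int count_threads_linux(void)
{
    DIR           *dir;
    struct dirent *ent;
    int            count = 0;

    /* Each thread of the process has one subdirectory here. */
    dir = opendir("/proc/self/task/");
    if (!dir)
        return -1;

    while (1) {

        /* readdir() returns NULL both at end of directory and on error;
         * clearing errno first lets us tell the two apart. */
        errno = 0;
        ent = readdir(dir);
        if (!ent)
            break;

        /* Skip "." and ".."; every other entry is a thread ID. */
        if (ent->d_name[0] != '.')
            count++;
    }

    if (errno) {
        /* Preserve the readdir() error across closedir(). */
        const int saved_errno = errno;
        closedir(dir);
        errno = saved_errno;
        return -1;
    }

    if (closedir(dir))
        return -1;

    return count;
}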

There are certain cases (like a chroot without /proc/) where that check will fail even on Linux, so a return value of -1 should always be treated as "unknown" rather than as an error (although errno will indicate the actual reason for the failure).
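One way to act on that "unknown" case is to map the return value to a three-way result and only assert when the process is known to be multi-threaded. The sketch below is an assumption about how the check might be used; single_threaded_status() and assert_not_multithreaded() are hypothetical names, and the count is passed in as a parameter so the snippet stands on its own.

```c
#include <assert.h>

/* Hypothetical helper: interpret a thread count as returned by
 * count_threads_linux().
 * Returns  1 if the process is known to be single-threaded,
 *          0 if it is known to be multi-threaded,
 *         -1 if the count is unknown (negative or otherwise implausible). */
int single_threaded_status(int thread_count)
{
    if (thread_count < 1)
        return -1;
    return thread_count == 1;
}

/* Hypothetical helper: abort only when extra threads are known to exist.
 * An unknown count is allowed through, since it is not an error. */
void assert_not_multithreaded(int thread_count)
{
    assert(single_threaded_status(thread_count) != 0);
}
```

In the library initialization this would be called as assert_not_multithreaded(count_threads_linux()) just before the fork(). Whether an unknown count should pass or fail is a policy choice; this sketch lets it pass.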

Looking at the FreeBSD man pages, I wonder if the corresponding information is available at all.

Finally:

Rather than trying to detect the problematic case, I seriously recommend you fork() and exec() (or posix_spawn()) the slave processes, using only async-signal-safe functions (see man 7 signal, or man 7 signal-safety on newer systems) in the child process prior to exec(), thus avoiding the fork()-thread complications. You can still create any shared memory segments, socket pairs, et cetera before calling fork(). The only drawback I can see is that you have to use separate binaries for the slave workers, which, given your description of them, does not sound like a drawback to me.
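A minimal sketch of that fork()-then-exec() pattern follows. The helper name start_worker() is hypothetical, and so is the convention that the worker receives its end of the socket pair as standard input. The socket pair is created before fork(), and the child calls only close(), dup2(), execv() and _exit(), all of which are async-signal-safe.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>   /* for waitpid() in the caller */
#include <unistd.h>
#include <errno.h>

/* Hypothetical helper: start the worker binary at `path`, connected to
 * the parent through a Unix domain socket pair. The worker sees its end
 * as standard input; the parent's end is stored in *parent_fd.
 * Returns the child's pid, or -1 with errno set. */
pid_t start_worker(const char *path, char *const argv[], int *parent_fd)
{
    int   sv[2];
    pid_t pid;

    /* Create the communication channel before forking. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        const int saved_errno = errno;
        close(sv[0]);
        close(sv[1]);
        errno = saved_errno;
        return -1;
    }

    if (pid == 0) {
        /* Child: only async-signal-safe calls from here until execv(). */
        close(sv[0]);
        if (sv[1] != STDIN_FILENO) {
            if (dup2(sv[1], STDIN_FILENO) == -1)
                _exit(127);
            close(sv[1]);
        }
        execv(path, argv);
        _exit(127);   /* Reached only if execv() failed. */
    }

    /* Parent: keep one end, close the worker's end. */
    close(sv[1]);
    *parent_fd = sv[0];
    return pid;
}
```

The parent can then write requests to *parent_fd and call waitpid() on the returned pid to notice a crashed worker without being taken down itself.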

Nominal Animal answered Nov 15 '22 05:11
