
What is the maximum number of threads that pthread_create can create? [duplicate]

Tags: c, linux, pthreads

Possible Duplicate:
The thread create by pthread_create the same with the kernel thread?

I use the code below to test the maximum number of threads that the pthread_create function can create.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>   /* sleep() */

static unsigned long long thread_nr = 0;

pthread_mutex_t mutex_;

/* Thread body: count this thread, then stay alive for a long time */
void* inc_thread_nr(void* arg) {
    (void)arg;   /* unused */
    pthread_mutex_lock(&mutex_);
    thread_nr ++;
    pthread_mutex_unlock(&mutex_);

    /* printf("thread_nr = %llu\n", thread_nr); */

    sleep(300000);
    return NULL;
}

int main(int argc, char *argv[])
{
    int err;
    int cnt = 0;

    pthread_t pid[1000000];

    pthread_mutex_init(&mutex_, NULL);

    /* Keep creating threads until pthread_create() fails */
    while (cnt < 1000000) {

        err = pthread_create(&pid[cnt], NULL, inc_thread_nr, NULL);
        if (err != 0) {
            break;
        }
        cnt++;
    }

    pthread_join(pid[cnt], NULL);

    pthread_mutex_destroy(&mutex_);
    printf("Maximum number of threads per process is = %llu\n", thread_nr);
    return 0;
}

And the output is:

Maximum number of threads per process is = 825

Is that the maximum number of threads that the pthread_create function can create?

I also used the command below to view the maximum number of threads my system allows:

# cat /proc/sys/kernel/threads-max

And the number is 772432.

Why is the output of my program not equal to the value of threads-max?

My OS is Fedora 16, with 12 cores and 48 GB of RAM.

injoy asked Sep 12 '12 12:09


1 Answer

The default per-thread stack size is what artificially imposes the limit in your test. While the stack given to the process (the initial thread) grows dynamically as needed, the stacks of the other threads are fixed in size. The default is usually very large (2 MB on some systems; with glibc on Linux it normally follows the soft stack limit reported by ulimit -s, often 8 MB), so that the per-thread stack is big enough even for pathological cases such as deep recursion.

In most cases, thread workers need very little stack. I've found that on all architectures I've used, 64k (65536 bytes) per-thread stack is sufficient, as long as I don't use deeply recursive algorithms or large local variables (structures or arrays).

To explicitly specify a per-thread stack size, modify your main() to something like the following:

#define MAXTHREADS 1000000
#define THREADSTACK  65536

int main(int argc, char *argv[])
{
    /* static: about 8 MB of thread IDs is too much to place safely on main()'s stack */
    static pthread_t pid[MAXTHREADS];
    pthread_attr_t  attrs;
    int  err, i;
    int  cnt = 0;

    /* Every thread created with attrs gets a THREADSTACK-byte stack */
    pthread_attr_init(&attrs);
    pthread_attr_setstacksize(&attrs, THREADSTACK);

    pthread_mutex_init(&mutex_, NULL);

    for (cnt = 0; cnt < MAXTHREADS; cnt++) {

        err = pthread_create(&pid[cnt], &attrs, inc_thread_nr, NULL);
        if (err != 0)
            break;
    }

    pthread_attr_destroy(&attrs);

    /* Join every thread that was actually created: pid[0] .. pid[cnt-1] */
    for (i = 0; i < cnt; i++)
        pthread_join(pid[i], NULL);

    pthread_mutex_destroy(&mutex_);

    printf("Maximum number of threads per process is %d (%llu)\n", cnt, thread_nr);
    return 0;
}

Note that attrs is not consumed by the pthread_create() call. Think of the thread attributes more like a template on how pthread_create() should create the threads; they are not attributes given to the thread. This trips up many aspiring pthreads programmers, so it's one of those things you'd better get right from the get go.

As to the stack size itself, it must be at least PTHREAD_STACK_MIN (16384 on x86 Linux, I believe; some architectures use a larger value) and should be divisible by sysconf(_SC_PAGESIZE). Since the page size is a power of two on all architectures, using a large enough power of two should always work.

I also added a fix in there. Your code joins only one thread, and a nonexistent one at that (the one the loop tried to create when it failed). You need to join all of the threads that were created, to make sure every one of them has finished its work.

Further recommended fixes:

Instead of using sleep(), use a condition variable. Have each thread wait on it with pthread_cond_wait() while holding the mutex, then release the mutex and exit. The main function then only needs to call pthread_cond_broadcast() on the condition variable to tell all threads they may exit, after which it can join each one, and you can be sure that that many threads really were running concurrently. As your code stands now, some threads may have enough time to wake up from the sleep and exit.

Nominal Animal answered Oct 16 '22 22:10