
Why does sleep() after pthread_mutex_lock() block the whole program?

In my test program, I start two threads, each of which repeatedly does the following:

    1) pthread_mutex_lock()
    2) sleep(1)
    3) pthread_mutex_unlock()

However, I find that after some time, one of the two threads blocks in pthread_mutex_lock() forever, while the other thread keeps working normally. This is very strange behavior, and I think it may be a serious issue. According to the Linux manual pages, calling sleep() while holding a pthread_mutex_t is not prohibited. So my question is: is this a real problem, or is there a bug in my code?

The test program follows. The first thread writes to stdout and the second writes to stderr, so the two outputs can be compared to see which thread is blocked.

I have tested it on Linux kernels 2.6.31 and 2.6.9. The results are the same on both.

//=======================  Test Program  ===========================
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   // strcmp, strerror
#include <unistd.h>   // sleep
#include <pthread.h>

#define THREAD_NUM 2
static int data[THREAD_NUM];
static int sleepFlag = 1;

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

// Thread 0 logs to stdout, thread 1 logs to stderr.
static void *threadFunc(void *arg)
{
    int *idx = (int *) arg;
    FILE *fd = (*idx == 0) ? stdout : stderr;

    while (1) {
        fprintf(fd, "\n[%d]Before pthread_mutex_lock is called\n", *idx);
        if (pthread_mutex_lock(&mutex) != 0) {
            exit(1);
        }
        fprintf(fd, "[%d]pthread_mutex_lock is finisheded. Sleep some time\n", *idx);
        if (sleepFlag == 1)
            sleep(1);   // sleeps while still holding the mutex
        fprintf(fd, "[%d]sleep done\n\n", *idx);

        fprintf(fd, "[%d]Before pthread_mutex_unlock is called\n", *idx);
        if (pthread_mutex_unlock(&mutex) != 0) {
            exit(1);
        }
        fprintf(fd, "[%d]pthread_mutex_unlock is finisheded.\n", *idx);
    }
    return NULL;   // not reached
}

// 1. compile
//    gcc -o pthread pthread.c -lpthread
// 2. run
//    1) ./pthread sleep 2> /tmp/error.log     # Each thread sleeps 1 second after it acquires the mutex
//       ==> /tmp/error.log stops growing.
//    or
//    2) ./pthread nosleep 2> /tmp/error.log   # No thread sleeps while holding the mutex
//       ==> Both stdout and /tmp/error.log keep growing.

int main(int argc, char *argv[])
{
    if ((argc == 2) && (strcmp(argv[1], "nosleep") == 0)) {
        sleepFlag = 0;
    }
    pthread_t t[THREAD_NUM];

    int i;
    for (i = 0; i < THREAD_NUM; i++) {
        data[i] = i;
        // pthread functions return the error code instead of setting errno
        int ret = pthread_create(&t[i], NULL, threadFunc, &data[i]);
        if (ret != 0) {
            fprintf(stderr, "pthread_create error: %s\n", strerror(ret));
            exit(EXIT_FAILURE);
        }
    }

    for (i = 0; i < THREAD_NUM; i++) {
        int ret = pthread_join(t[i], NULL);
        if (ret != 0) {
            fprintf(stderr, "pthread_join error: %s\n", strerror(ret));
            exit(EXIT_FAILURE);
        }
    }

    exit(0);
}

This is the output:

On the terminal where the program is started:

    root@skyscribe:~# ./pthread sleep 2> /tmp/error.log

    [0]Before pthread_mutex_lock is called
    [0]pthread_mutex_lock is finisheded. Sleep some time
    [0]sleep done

    [0]Before pthread_mutex_unlock is called
    [0]pthread_mutex_unlock is finisheded.
    ...

On another terminal, watching the file /tmp/error.log:

    root@skyscribe:~# tail -f /tmp/error.log 

    [1]Before pthread_mutex_lock is called

No further lines are written to /tmp/error.log.

user1040933 asked Feb 23 '23 10:02


1 Answer

This is the wrong way to use mutexes. A thread should not spend more time holding a mutex than not holding it, and it should especially not sleep while holding one. Locking a mutex carries no FIFO guarantee (for efficiency reasons).

More specifically, if thread 1 unlocks the mutex while thread 2 is waiting for it, thread 2 becomes runnable, but this does not force the scheduler to preempt thread 1 or to run thread 2 immediately. Most likely it will not, because thread 1 has recently slept. When thread 1 subsequently reaches the pthread_mutex_lock() call, it will generally be allowed to lock the mutex immediately, even though another thread is waiting (and the implementation may know this). When thread 2 finally runs, it finds the mutex already locked and goes back to sleep.
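The window described above only opens for thread 2 if thread 1 does not re-lock straight away. As a minimal sketch (not the asker's program), the loop below keeps the critical section short and pauses only after the mutex is released, so a waiting thread has time to be scheduled and take the lock. The names run_short_hold_demo, pause_for_ms and shared_total are hypothetical, and the loop is bounded so that the demo terminates.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define NUM_THREADS 2

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int shared_total = 0;        /* protected by mutex */
static int rounds_per_thread = 0;   /* set before the threads start */
static long pause_ms = 0;           /* set before the threads start */

/* Hypothetical helper: sleep for the given number of milliseconds. */
static void pause_for_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < rounds_per_thread; i++) {
        pthread_mutex_lock(&mutex);
        shared_total++;              /* short critical section only */
        pthread_mutex_unlock(&mutex);
        pause_for_ms(pause_ms);      /* slow part runs with the mutex released */
    }
    return NULL;
}

/* Runs NUM_THREADS workers and returns the final value of shared_total. */
int run_short_hold_demo(int rounds, long ms)
{
    pthread_t t[NUM_THREADS];

    shared_total = 0;
    rounds_per_thread = rounds;
    pause_ms = ms;

    for (int i = 0; i < NUM_THREADS; i++) {
        int ret = pthread_create(&t[i], NULL, worker, NULL);
        assert(ret == 0);
        (void)ret;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(t[i], NULL);

    return shared_total;
}
```

Calling run_short_hold_demo(5, 1000) from main() mirrors the one-second pause in the question. This still gives no ordering guarantee; it only removes the pattern in which one thread holds the mutex almost all of the time.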

The best solution is not to hold a mutex for that long. If that is not possible, consider moving the lock-needing operations to a single thread (removing the need for the lock) or waking up the correct thread using condition variables.
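If strict first-come, first-served ordering really is needed, one possible reading of "waking up the correct thread using condition variables" is a ticket lock layered on top of a mutex and a condition variable. The sketch below is an assumption about how that could look, not something pthreads provides: ticket_lock_t, ticket_lock(), ticket_unlock() and run_ticket_demo() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical FIFO lock: each caller takes a ticket and waits its turn. */
typedef struct {
    pthread_mutex_t mtx;          /* protects the two counters below */
    pthread_cond_t  cond;         /* signalled whenever now_serving changes */
    unsigned long   next_ticket;  /* next ticket number to hand out */
    unsigned long   now_serving;  /* ticket currently allowed to proceed */
} ticket_lock_t;

#define TICKET_LOCK_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

static void ticket_lock(ticket_lock_t *l)
{
    pthread_mutex_lock(&l->mtx);
    unsigned long my_ticket = l->next_ticket++;
    while (my_ticket != l->now_serving)       /* loop guards against spurious wakeups */
        pthread_cond_wait(&l->cond, &l->mtx);
    pthread_mutex_unlock(&l->mtx);            /* inner mutex is held only briefly */
}

static void ticket_unlock(ticket_lock_t *l)
{
    pthread_mutex_lock(&l->mtx);
    l->now_serving++;                         /* hand over to the next ticket holder */
    pthread_cond_broadcast(&l->cond);         /* wake waiters; only the matching one proceeds */
    pthread_mutex_unlock(&l->mtx);
}

static ticket_lock_t demo_lock = TICKET_LOCK_INITIALIZER;
static int shared_counter = 0;                /* protected by demo_lock */

static void *worker(void *arg)
{
    int rounds = *(int *)arg;
    for (int i = 0; i < rounds; i++) {
        ticket_lock(&demo_lock);
        shared_counter++;
        ticket_unlock(&demo_lock);
    }
    return NULL;
}

/* Runs two workers for `rounds` iterations each; returns the final counter. */
int run_ticket_demo(int rounds)
{
    pthread_t t[2];

    shared_counter = 0;
    for (int i = 0; i < 2; i++) {
        int ret = pthread_create(&t[i], NULL, worker, &rounds);
        assert(ret == 0);
        (void)ret;
    }
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    return shared_counter;
}
```

A thread that unlocks and immediately calls ticket_lock() again receives a later ticket than any thread already waiting, so it cannot jump the queue. The cost is the efficiency that plain mutexes gain by not promising FIFO order, so shortening the critical section is usually the better first step.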

jilles answered Feb 26 '23 19:02