I have a threadpool of workers. Each worker executes this routine:
void* worker(void* args){
...
pthread_mutex_lock(&mtx);
while (queue == NULL && stop == 0){
pthread_cond_wait(&cond, &mtx);
}
el = pop(queue);
pthread_mutex_unlock(&mtx);
...
}
main thread:
int main(){
...
while (stop == 0){
...
pthread_mutex_lock(&mtx);
insert(queue, el);
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mtx);
...
}
...
}
Then I have a signal handler that executes this code when it receives a signal:
void exit_handler(){
stop = 1;
pthread_mutex_lock(&mtx);
pthread_cond_broadcast(&cond);
pthread_mutex_unlock(&mtx);
}
I have omitted declarations and initialization, but the original code has them.
After a signal is received most of the time it's all ok, but sometimes it seems that some worker threads stay in the wait loop because they don't see that the variable stop is changed and/or they are not waken up by the broadcast.
So the threads never end. What I am missing?
EDIT: stop=1 moved inside the critical section in exit_handler. The issue remains.
EDIT2: I was executing the program on a VM with Ubuntu. Since the code appears to be totally right I tried to change VM and OS (XUbuntu) and now it seems to work correctly. Still don't know why, anyone has an idea?
Some guessing here, but it's too long for a comment, so if this is wrong, I will delete. I think you may have a misconception about how pthread_cond_broadcast works (at least something I've been burned with in the past). From the man page:
The
pthread_cond_broadcast()function shall unblock all threads currently blocked on the specified condition variable cond.
Ok, that make sense, _broadcast awakens all threads currently blocked on cond. However, only one of the awakened threads will then be able to lock the mutex after they're all awoken. Also from the man page:
The thread(s) that are unblocked shall contend for the mutex according to the scheduling policy (if applicable), and as if each had called
pthread_mutex_lock().
So this means that if 3 threads are blocked on cond and _broadcast is called, all 3 threads will wake up, but only 1 can grab the mutex. The other 2 will still be stuck in pthread_cond_wait, waiting on a signal. Because of this, they don't see stop set to 1, and exit_handler (I'm assuming a Ctrl+c software signal?) is done signaling, so the remaining threads that lost the _broadcast competition are stuck in limbo, waiting on a signal that will never come, and unable to read that the stop flag has been set.
I think there are 2 options to work-around/fix this:
pthread_cond_timedwait. Even without being signaled, this will return from waiting at the specified time interval, see that stop == 1, and then exit.pthread_cond_signal or pthread_cond_broadcast at the end of your worker function. This way, right before a thread exits, it will signal the cond variable allowing any other waiting threads to grab the mutex and finish processing. There is no harm in signaling a conditional variable if no threads are waiting on it, so this should be fine even for the last thread.EDIT: Here is an MCVE that proves (as far as I can tell) that my answer above is wrong, heh. As soon as I press Ctrl+c, the program exits "immediately", which says to me all the threads are quickly acquiring the mutex after the broadcast, seeing that stop is false, and exiting. Then main joins on the threads and it's process over.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <stdbool.h>
#include <signal.h>
#include <unistd.h>
#define NUM_THREADS 3
#define STACK_SIZE 10
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t c = PTHREAD_COND_INITIALIZER;
volatile bool stop = false;
int stack[STACK_SIZE] = { 0 };
int sp = 0; // stack pointer,, also doubles as the current stack size
void SigHandler(int sig)
{
if (sig == SIGINT)
{
stop = true;
}
else
{
printf("Received unexcepted signal %d\n", sig);
}
}
void* worker(void* param)
{
long tid = (long)(param);
while (stop == false)
{
// acquire the lock
pthread_mutex_lock(&m);
while (sp <= 0) // sp should never be < 0
{
// there is no data in the stack to consume, wait to get signaled
// this unlocks the mutex when it is called, and locks the
// mutex before it returns
pthread_cond_wait(&c, &m);
}
// when we get here we should be guaranteed sp >= 1
printf("thread %ld consuming stack[%d] = %d\n", tid, sp-1, stack[sp-1]);
sp--;
pthread_mutex_unlock(&m);
int sleepVal = rand() % 10;
printf("thread %ld sleeping for %d seconds...\n", tid, sleepVal);
sleep(sleepVal);
}
pthread_exit(NULL);
}
int main(void)
{
pthread_t threads[NUM_THREADS];
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
srand(time(NULL));
for (long i=0; i<NUM_THREADS; i++)
{
int rc = pthread_create(&threads[i], &attr, worker, (void*)i);
if (rc != 0)
{
fprintf(stderr, "Failed to create thread %ld\n", i);
}
}
while (stop == false)
{
// produce data in bursts
int numValsToInsert = rand() % (STACK_SIZE - sp);
printf("main producing %d values\n", numValsToInsert);
// acquire the lock
pthread_mutex_lock(&m);
for (int i=0; i<numValsToInsert; i++)
{
// produce values for the stack
int val = rand() % 10000;
// I think this should already be guaranteed..?
if (sp+1 < STACK_SIZE)
{
printf("main pushing stack[%d] = %d\n", sp, val);
stack[sp++] = val;
// signal the workers that data is ready
//printf("main signaling threads...\n");
//pthread_cond_signal(&c);
}
else
{
printf("stack full!\n");
}
}
pthread_mutex_unlock(&m);
// signal the workers that data is ready
printf("main signaling threads...\n");
pthread_cond_broadcast(&c);
int sleepVal = 1;//rand() % 5;
printf("main sleeping for %d seconds...\n", sleepVal);
sleep(sleepVal);
}
for (long i=0; i<NUM_THREADS; i++)
{
pthread_join(threads[i], NULL);
}
return 0;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With