I have a programme using sem_wait. The POSIX specification says:
The sem_wait() function is interruptible by the delivery of a signal.
Additionally, in the section about errors it says:
[EINTR] - A signal interrupted this function.
However, in my programme, sending a signal does not unblock the call and make it return -1, as the spec indicates it should.
A minimal example can be found below. This programme hangs, and sem_wait never unblocks after the signal is sent.
#include <semaphore.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

sem_t sem;

void sighandler(int sig) {
    printf("Inside sighandler\n");
}

void *thread_listen(void *arg) {
    signal(SIGUSR1, &sighandler);
    printf("sem_wait = %d\n", sem_wait(&sem));
    return NULL;
}

int main(void) {
    pthread_t thread;
    sem_init(&sem, 0, 0);
    pthread_create(&thread, NULL, &thread_listen, NULL);
    sleep(1);
    raise(SIGUSR1);
    pthread_join(thread, NULL);
    return 0;
}
The programme outputs "Inside sighandler" then hangs.
There is another question here about this, but it doesn't really provide any clarity.
Am I misunderstanding what the spec says? FYI my computer uses Ubuntu GLIBC 2.31-0ubuntu9.
sem_wait() returns zero after completing successfully. Any other return value indicates that an error occurred.
Use sem_wait(3RT) to block the calling thread until the semaphore count pointed to by sem becomes greater than zero, then atomically decrement the count.
The sem_wait() function decrements by one the value of the semaphore. The semaphore will be decremented when its value is greater than zero. If the value of the semaphore is zero, then the current thread will block until the semaphore's value becomes greater than zero.
You can write thread safe code using primitives to protect global data with critical sections. Signal handlers can't rely on this. For example, you could be inside a critical section within sem_wait, and simultaneously do something that causes a segfault. This would break the thread-safe protections of sem_wait.
There are three reasons why this program doesn't behave as you expect, only two of which are fixable.
As pointed out in David Schwartz’s answer, in a multi-threaded program, raise sends a signal to the thread that calls raise. In this program that is the main thread, not the thread blocked in sem_wait.
To get the signal sent to the thread you wanted, in this test program, change the raise(SIGUSR1) to pthread_kill(thread, SIGUSR1). However, if you want that specific thread to handle SIGUSR1 when it’s sent to the entire process, what you need to do is use pthread_sigmask to block SIGUSR1 in all of the threads except the one that's supposed to handle it. (See below for more detail on this.)
On systems that use glibc, signal installs a signal handler that does not interrupt blocking system calls. To get a signal handler that does, you need to use sigaction and set sa_flags to a value that doesn’t include SA_RESTART. For instance,
struct sigaction sa;
sigemptyset(&sa.sa_mask);
sa.sa_handler = sighandler;
sa.sa_flags = 0;
sigaction(SIGUSR1, &sa, 0);
Note: memset(&sa, 0, sizeof sa) is not guaranteed to have the same effect as sigemptyset(&sa.sa_mask).
Note: Signal handlers are process-global, so it doesn’t matter which thread you call sigaction on. In almost all cases, multithreaded programs should do all their sigaction calls in main before creating any threads, just to make sure the signal handlers are active before any signals can happen.
The signal could be delivered to the thread before the thread has a chance to call sem_wait. If that happens, the signal handler will be called and return, and then sem_wait will be called and it will block forever. In this test program, you can make this arbitrarily unlikely by increasing the length of the sleep in main, but there is no way to make it impossible. This is the unfixable reason.
There are a small number of system calls that atomically unblock signals while sleeping, and then block them again before returning to user space, such as sigsuspend, sigwaitinfo, and pselect. These are the only system calls for which this race condition can be avoided.
Best practice for a multi-threaded program that has to deal with signals is to have one thread devoted to signal handling. To make that work reliably, you should block all signals except for synchronous CPU exceptions (SIGABRT, SIGBUS, SIGFPE, SIGILL, SIGSEGV, SIGSYS, and SIGTRAP) at the very beginning of main, before creating any threads. Then you set a do-nothing signal handler (with SA_RESTART) for the signals you want to handle. These handlers will never actually be called; their purpose is to prevent the kernel from killing the process due to the default action of SIGUSR1 or whatever. The set of signals you care about must include all of the signals for user interrupts: SIGHUP, SIGINT, SIGPWR, SIGQUIT, SIGTERM, SIGTSTP, SIGXCPU, SIGXFSZ. Finally, you create the signal-handling thread, which loops calling sigwaitinfo for the appropriate set of signals, and dispatches messages to the rest of the threads using pipes or condition variables or anything but signals really. This thread must never block in any system call other than sigwaitinfo.
In the case of this test program, the signal-handling thread would respond to SIGUSR1 by calling sem_post(&sem). This would either wake up the listener thread, or it would cause the listener thread not to become blocked on sem_wait in the first place.
In a multi-threaded program, raise sends a signal to the thread that calls raise. You need to use kill(getpid(), ...) or pthread_kill(thread, ...).