
Catching signals such as SIGSEGV and SIGFPE in multithreaded program

I am trying to write a multithreaded logging system for a program running on Linux.

Calls to the logging system in the main program threads push a data structure containing the data to be logged into a FIFO queue. A dedicated thread picks the data off the queue and outputs it, while the program's main thread continues with its task.

If the main program causes SIGSEGV or other signals to be raised I need to make sure that the queue is empty before terminating.

My plan is to block the signals using pthread_sigmask (http://man7.org/linux/man-pages/man3/pthread_sigmask.3.html) for all but one thread, but reading the list of signals at http://man7.org/linux/man-pages/man7/signal.7.html I noticed:

A signal may be generated (and thus pending) for a process as a whole (e.g., when sent using kill(2)) or for a specific thread (e.g., certain signals, such as SIGSEGV and SIGFPE, generated as a consequence of executing a specific machine-language instruction are thread directed, as are signals targeted at a specific thread using pthread_kill(3)).

If I block SIGSEGV on all threads but a thread dedicated to catching signals, will it then catch a SIGSEGV raised by a different thread?

I found the question "Signal handling with multiple threads in Linux", but I am still unsure which signals are thread-specific and how to catch them.

Bo M. Petersen asked Nov 30 '13 19:11



1 Answer

I agree with the comments: in practice catching and handling SIGSEGV is often a bad thing.

And SIGSEGV is delivered to a specific thread (see this): the one running the machine instruction that accessed an illegal address.
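A minimal C sketch of this point, assuming Linux with glibc. All function and variable names are hypothetical. A worker thread writes to a page mapped with PROT_NONE, the handler records which thread it ran on, and siglongjmp(3) leaves the handler without returning into the faulting store.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names, chosen only for this illustration. */
static pthread_t handler_thread;          /* thread the handler ran on */
static volatile sig_atomic_t handler_ran; /* set to 1 by the handler */
static sigjmp_buf recover_point;          /* where the faulting thread resumes */

static void on_segv(int signo)
{
    (void)signo;
    handler_thread = pthread_self();
    handler_ran = 1;
    siglongjmp(recover_point, 1); /* do not return into the faulting store */
}

static void *faulting_thread(void *arg)
{
    volatile char *page = arg;

    if (sigsetjmp(recover_point, 1) == 0) {
        page[0] = 1;       /* page is PROT_NONE: SIGSEGV is raised here */
        return (void *)0;  /* not reached, the store always faults */
    }
    /* Back here via siglongjmp: did the handler run on this very thread? */
    int same = handler_ran && pthread_equal(handler_thread, pthread_self());
    return (void *)(intptr_t)same;
}

/* Returns 1 if the handler ran on the faulting thread, 0 if it ran
 * elsewhere, -1 if the setup failed. */
int segv_runs_on_faulting_thread(void)
{
    struct sigaction sa, old;
    pthread_t tid;
    void *result = NULL;
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *page = mmap(NULL, len, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    handler_ran = 0;
    if (pthread_create(&tid, NULL, faulting_thread, page) != 0)
        return -1;
    pthread_join(tid, &result);

    sigaction(SIGSEGV, &old, NULL); /* restore the previous disposition */
    munmap(page, len);
    return (int)(intptr_t)result;
}
```

Calling segv_runs_on_faulting_thread() from the main thread, which also has SIGSEGV unblocked, should still report that the handler ran on the worker.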

So you cannot run a thread dedicated to catching SIGSEGV in other threads. And you probably could not easily use signalfd(2) for SIGSEGV...
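For contrast, the dedicated-thread plan from the question does work for process-directed signals, such as SIGTERM sent with kill(2). The following is a sketch under that assumption, with hypothetical function names: the signal is blocked before any thread is created, so every thread inherits the mask, and one thread collects it with sigwait(3).

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical dedicated signal thread: waits synchronously for one of
 * the signals in the set passed as argument. */
static void *signal_thread(void *arg)
{
    const sigset_t *set = arg;
    int signo = 0;

    if (sigwait(set, &signo) != 0)
        return (void *)(intptr_t)-1;
    /* This is ordinary thread context, not a signal handler, so code
     * that is not async-signal-safe (e.g. draining a queue) may run here. */
    return (void *)(intptr_t)signo;
}

/* Blocks signo, starts the dedicated thread, sends signo to the whole
 * process, and returns the signal number the dedicated thread received
 * (-1 on failure). */
int wait_for_process_signal(int signo)
{
    sigset_t set, old;
    pthread_t tid;
    void *result = NULL;

    sigemptyset(&set);
    sigaddset(&set, signo);

    /* Block first: threads created afterwards inherit this mask. */
    if (pthread_sigmask(SIG_BLOCK, &set, &old) != 0)
        return -1;
    if (pthread_create(&tid, NULL, signal_thread, &set) != 0)
        return -1;

    kill(getpid(), signo); /* process-directed, like an external kill(1) */
    pthread_join(tid, &result);

    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return (int)(intptr_t)result;
}
```

This only helps for signals that are generated for the process as a whole. A SIGSEGV caused by a faulting instruction never reaches such a thread, because it is directed at the thread that faulted.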

Catching SIGSEGV and returning normally from its signal handler is complex and processor-specific (it cannot be "portable C code"). You need to inspect and alter the machine state in the handler: either modify the address space (by calling mmap(2) etc...) or modify the register state of the current thread. So use sigaction(2) with SA_SIGINFO and change the machine-specific state pointed to by the third argument of the signal handler (a void* that points to a ucontext_t), then dive into its processor-specific uc_mcontext field. Have fun changing individual registers, etc...

If you don't alter the machine state of the faulting thread, execution resumes (after returning from your SIGSEGV handler) in the same situation as before, and another SIGSEGV is raised immediately.

Or simply don't return normally from a SIGSEGV handler (e.g. use siglongjmp(3) or abort(3) or _exit(2) ...).
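The "modify the address space" variant can be sketched as follows, assuming Linux. The names are hypothetical, and mprotect(2) stands in for the mmap(2) call mentioned above. mprotect is not on the POSIX async-signal-safe list, so its use inside the handler is an assumption that holds in practice on Linux, where it is a thin system-call wrapper.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count; /* hypothetical fault counter */
static size_t page_size;

/* SA_SIGINFO handler: make the faulting page writable, then return
 * normally so that the kernel restarts the faulting instruction. */
static void fix_page(int signo, siginfo_t *info, void *uctx)
{
    (void)signo;
    (void)uctx; /* points to a ucontext_t; registers are in uc_mcontext */

    uintptr_t addr = (uintptr_t)info->si_addr & ~((uintptr_t)page_size - 1);
    fault_count++;
    /* Assumption: mprotect is usable here (not guaranteed by POSIX). */
    if (mprotect((void *)addr, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1); /* cannot repair: returning would fault again forever */
}

/* Returns 1 if a single fault was repaired and the store went through,
 * 0 otherwise, -1 if the setup failed. */
int retry_after_fixing_page(void)
{
    struct sigaction sa, old;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    volatile char *page = mmap(NULL, page_size, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fix_page;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    fault_count = 0;
    page[0] = 42; /* faults once; the handler unprotects; the store is retried */
    int ok = (fault_count == 1 && page[0] == 42);

    sigaction(SIGSEGV, &old, NULL); /* restore the previous disposition */
    munmap((void *)page, page_size);
    return ok;
}
```

If the handler returned without changing the page protection, the same store would fault again immediately, which is the loop described above.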

Even if you manage to do all this, it is rumored that Linux kernels are not very efficient at handling such faults, so trying to mimic Hurd/Mach external pagers this way on Linux is reportedly slow. See this answer...

Of course, signal handlers should call only async-signal-safe functions (see signal(7) for more). In particular, you cannot in principle call fprintf from them, and you might not be able to use your logging system reliably from a handler (it could work in most, but not all, cases).
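A last-resort handler that stays within async-signal-safe calls might look like the sketch below. The function names, the message text and the 128 + signo exit status are assumptions made for illustration. The message is formatted by hand because snprintf is not on the async-signal-safe list, and only write(2) and _exit(2) are called.

```c
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Formats "caught signal N\n" into buf without using stdio.
 * Returns the number of bytes stored (at most cap, no terminating NUL). */
size_t format_signal_line(int signo, char *buf, size_t cap)
{
    static const char prefix[] = "caught signal ";
    char digits[12];
    size_t n = 0, d = 0;
    unsigned v = signo < 0 ? 0u : (unsigned)signo;

    /* Collect decimal digits in reverse order. */
    do {
        digits[d++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    for (size_t i = 0; i + 1 < sizeof prefix && n < cap; i++)
        buf[n++] = prefix[i];
    while (d > 0 && n < cap)
        buf[n++] = digits[--d];
    if (n < cap)
        buf[n++] = '\n';
    return n;
}

/* Hypothetical handler: report on stderr with write(2), then leave with
 * _exit(2) instead of returning into the faulting instruction. */
void crash_handler(int signo)
{
    char line[64];
    size_t n = format_signal_line(signo, line, sizeof line);
    ssize_t ignored = write(STDERR_FILENO, line, n);

    (void)ignored; /* nothing useful can be done if the write fails */
    _exit(128 + signo);
}
```

Such a handler would be installed with sigaction(2) for SIGSEGV, SIGBUS and SIGFPE. It deliberately does not touch the logging queue, since locks or allocations used there may be in an inconsistent state when the fault happens.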

What I said about SIGSEGV also holds for SIGBUS and SIGFPE (and other thread-specific synchronous signals, if they exist).

Basile Starynkevitch answered Oct 14 '22 04:10