What is gcc doing here to run this code once per thread?

I just ran across this technique for running code once per thread, but I don't know how it works at the lowest level. In particular, what is fs pointing to? What does .zero 8 mean? And is there a reason the identifier ends in @tpoff?

int foo();

void bar()
{
    thread_local static auto _ = foo();
}

Output (with -O2):

bar():
        cmp     BYTE PTR fs:guard variable for bar()::_@tpoff, 0
        je      .L8
        ret
.L8:
        sub     rsp, 8
        call    foo()
        mov     BYTE PTR fs:guard variable for bar()::_@tpoff, 1
        add     rsp, 8
        ret
guard variable for bar()::_:
        .zero   8
asked Jan 19 '19 18:01 by Artikash-Reinstate Monica




1 Answer

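On x86-64 Linux, the fs segment base is the thread pointer: it holds the address of the current thread's control block, and the thread-local storage of the main executable sits just below it. So an fs-relative address refers to a location in the current thread's own TLS block.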

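.zero 8 reserves 8 bytes of zeros. Because the guard is thread-local, they presumably go in .tbss (the thread-local counterpart of the BSS), which acts as the template for each thread's copy. See the GAS manual: https://sourceware.org/binutils/docs/as/Zero.html, and the links in https://stackoverflow.com/tags/x86/info.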

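@tpoff asks for the symbol's offset relative to the thread pointer (the fs base), so the name most likely stands for "thread pointer offset". This is the local-exec TLS access model: the offset is a link-time constant, which is why the guard can be tested with a single cmp and no function call.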
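
To make the "each thread has its own copy" point concrete, here is a minimal sketch. The names tls_counter, address_in_this_thread, address_in_other_thread, and value_after_other_thread_increments are hypothetical and do not come from the question. It only shows the observable effect of fs-relative addressing (same variable, different address and value per thread), not how the compiler implements it. On older toolchains it may need -pthread.

```cpp
#include <cstdint>
#include <thread>

// Hypothetical variable for illustration; every thread gets its own copy.
thread_local int tls_counter = 0;

// Address of the calling thread's copy of tls_counter.
std::uintptr_t address_in_this_thread() {
    return reinterpret_cast<std::uintptr_t>(&tls_counter);
}

// Address of tls_counter as seen from a freshly started thread.
std::uintptr_t address_in_other_thread() {
    std::uintptr_t result = 0;
    std::thread t([&result] { result = address_in_this_thread(); });
    t.join();
    return result;
}

// A write made by another thread lands in that thread's copy only,
// so the caller's copy keeps the value it stored.
int value_after_other_thread_increments() {
    tls_counter = 5;
    std::thread t([] { tls_counter += 100; });
    t.join();
    return tls_counter;
}
```

Called from one thread, address_in_this_thread() and address_in_other_thread() return different values, because both copies exist at the same time and must occupy distinct storage.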


The rest of it looks similar to what gcc normally does for static local variables that need a runtime initializer: a guard variable that it checks every time it enters the function, falling through in the already-initialized case.

The guard variable is in thread-local storage. Although 8 bytes are reserved for it, only its first byte is tested and set. The actual _ itself is optimized away because it's never read: notice there's no store of eax after foo returns.
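
As a rough illustration of the guard pattern described above, the asm can be written back out by hand in C++. This is only an approximation of what gcc emits, not its actual internal representation; foo_calls, the body of foo, and bar_by_hand are hypothetical stand-ins added for the sketch.

```cpp
// Hypothetical stand-in for the question's foo(); it counts its calls.
int foo_calls = 0;
int foo() { return ++foo_calls; }

// Hand-written approximation of
//     thread_local static auto _ = foo();
// The bool is constant-initialized, so it needs no guard of its own.
void bar_by_hand() {
    thread_local static bool guard = false;
    if (!guard) {      // cmp BYTE PTR fs:guard@tpoff, 0
        foo();         // call foo(); the result is never read, so it is not stored
        guard = true;  // mov BYTE PTR fs:guard@tpoff, 1 (set only after foo returns)
    }
}
```

Calling bar_by_hand() twice on the same thread runs foo() once. Because the guard is set after the call, an exception thrown by foo() would leave it clear and the initialization would be retried on the next call, which matches the ordering in the compiler's output.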

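BTW, _ is a weird (bad) choice for a variable name. It's easy to miss, and although a lone underscore is reserved to the implementation only in the global namespace (so it's legal as a local name here), it still invites confusion.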


It has a nice optimization here: normally (for a non-thread-local static int var = foo();), if the code finds the guard variable isn't already set, it needs a thread-safe way to make sure only one thread actually runs the initializer (essentially taking a lock).

But here each thread has its own guard variable (and should run foo() the first time regardless of what other threads are doing), so it doesn't need to call the __cxa_guard_acquire / __cxa_guard_release runtime helpers to get mutual exclusion.
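
The difference between the two kinds of guard can be observed from C++ without reading any asm. The sketch below is an assumption-laden demo, not part of the original question: init_shared, init_per_thread, once_per_process, once_per_thread, and run_on_threads are hypothetical names, and the counters are atomic only so the test harness itself is race-free. On older toolchains it may need -pthread.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical counters recording how often each initializer runs.
std::atomic<int> shared_init_calls{0};
std::atomic<int> per_thread_init_calls{0};

int init_shared()     { return ++shared_init_calls; }
int init_per_thread() { return ++per_thread_init_calls; }

// Ordinary static local: one guard for the whole process, so the
// compiler has to synchronize the first initialization across threads.
void once_per_process() {
    static int value = init_shared();
    (void)value;
}

// thread_local static local: one guard per thread, no locking needed.
void once_per_thread() {
    thread_local static int value = init_per_thread();
    (void)value;
}

// Start n threads that each call both functions several times.
void run_on_threads(int n) {
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back([] {
            for (int k = 0; k < 3; ++k) {
                once_per_process();
                once_per_thread();
            }
        });
    }
    for (auto& t : threads) t.join();
}
```

After run_on_threads(4), the shared initializer has run once in total, while the per-thread initializer has run once in each of the four threads.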

(sorry for the short answer, I may expand this later with an example on https://godbolt.org/ of a non-thread-local static local variable. Or find an SO Q&A about it.)

answered Oct 14 '22 06:10 by Peter Cordes
