
Why does initialization of local static objects use hidden guard flags?

Local static objects in C++ are initialized once, the first time they are needed (which is relevant if the initialization has a side effect):

#include <iostream>

void once() {
    static bool b = [] {
        std::cout << "hello" << std::endl;
        return true;
    }();
}

once will print "hello" the first time it is called, but not if it is called again.

I've put a few variations of this pattern into Compiler Explorer and noticed that all of the big-name implementations (GCC, Clang, ICC, VS) essentially do the same thing: a hidden variable, guard variable for once()::b, is created and checked to see whether the primary variable needs to be initialized "this time". If it does, the variable gets initialized and then the guard is set, so the next call won't jump out to the initialization code. For example (minimized by replacing the lambda with a call to extern bool init_b();):

once():
        movzx   eax, BYTE PTR guard variable for once()::b[rip]
        test    al, al
        je      .L16
        ret
.L16:
        push    rbx
        mov     edi, OFFSET FLAT:guard variable for once()::b
        call    __cxa_guard_acquire
        test    eax, eax
        jne     .L17
        pop     rbx
        ret
.L17:
        call    init_b()
        pop     rbx
        mov     edi, OFFSET FLAT:guard variable for once()::b
        jmp     __cxa_guard_release
        mov     rbx, rax
        mov     edi, OFFSET FLAT:guard variable for once()::b
        call    __cxa_guard_abort
        mov     rdi, rbx
        call    _Unwind_Resume

...from GCC 6.3 with -O3.
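
For readers who prefer C++ to assembly, the listing above can be approximated by hand. This is only a rough sketch: the names guard_for_b, guard_mutex, b_storage and once_manual are made up for illustration, init_b is a counting stand-in for the extern function, and a real ABI packs the guard state differently from a separate atomic flag plus mutex.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the question's extern bool init_b().
// It counts its calls so the "runs once" behaviour can be observed.
static int init_calls = 0;
bool init_b() {
    ++init_calls;
    return true;
}

// Hand-written approximation of what the compiler emits for
//     static bool b = init_b();
// All names below are illustrative, not part of any ABI.
static std::atomic<bool> guard_for_b{false};  // role of "guard variable for once()::b"
static std::mutex guard_mutex;                // role of __cxa_guard_acquire/release
static bool b_storage;                        // storage for the static itself

bool once_manual() {
    // Fast path: one load and one branch, like the movzx/test/je above.
    if (!guard_for_b.load(std::memory_order_acquire)) {
        // Slow path: serialize threads racing to initialize.
        std::lock_guard<std::mutex> lock(guard_mutex);
        if (!guard_for_b.load(std::memory_order_relaxed)) {
            // If init_b() throws, the guard stays clear and a later call
            // retries, which is the job of __cxa_guard_abort in the listing.
            b_storage = init_b();
            guard_for_b.store(true, std::memory_order_release);
        }
    }
    return b_storage;
}

// Usage check: two calls, one initialization.
void check_once_manual() {
    assert(once_manual());
    assert(once_manual());
    assert(init_calls == 1);
}
```

After the first call, every later call takes only the fast path: a single load of the flag and a branch that always goes the same way.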

This isn't unreasonable, and I know that in practice conditional jumps are close to free anyway when the condition is consistent. However, my gut feeling would still have been to implement this by unconditionally jumping to the initialization code, which as its last action overwrites the originating jump with nop instructions. Not necessarily an option on every platform, but the x86 family seems quite liberal about what you can read or write, and where.
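
To make the idea concrete without touching instruction bytes, here is a portable analog: the "jump" is a function pointer in ordinary writable data, and the initializer's last action is to repoint it at a do-nothing function. This is only an illustration under stated assumptions. All names (once_target, init_then_patch, do_nothing, once_patched) are hypothetical, it patches data rather than code, it replaces the flag test with an indirect call, and it is not thread-safe as written.

```cpp
#include <cassert>

static int init_runs = 0;        // hypothetical counter, for observation only
static bool b_value = false;     // plays the role of once()::b

static void do_nothing() {}      // the "nop" the call site ends up with
static void init_then_patch();   // forward declaration

// The call target starts at the initializer and is overwritten after first use.
static void (*once_target)() = init_then_patch;

static void init_then_patch() {
    ++init_runs;
    b_value = true;              // stand-in for running the real initializer
    once_target = do_nothing;    // "patch out" the jump: later calls skip init
}

bool once_patched() {
    once_target();               // unconditional indirect call, no flag test
    return b_value;
}

// Usage check: the initializer runs on the first call only.
void check_once_patched() {
    assert(once_patched());
    assert(once_patched());
    assert(init_runs == 1);
}
```

Two threads making the first call at the same time could both read once_target before either overwrites it, so the initializer could run twice.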

What's so wrong with this apparently-simple idea that no mainstream compiler uses it? (Or do I just need to try harder with my examples?)

Leushenko asked Apr 14 '17 14:04



2 Answers

This sort of "optimization" is not safe in a multithreaded environment, and may not be safe even in a single-threaded one.

Overwriting the jump with "nops" could take more than one store instruction, so another thread could reach the jump while it is only partially patched.

The size of the jmp instruction may not be known until the final code is optimized (does it need an 8-, 16-, or 32-bit offset?).

Instruction caches do not necessarily pick up a change in code bytes: depending on the architecture, a specific cache-flushing or serializing instruction must be executed before the new bytes are guaranteed to be seen.

And all of that assumes the code can be written to via the data segment in the first place.
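
The multithreading point can be checked from the other direction. Since C++11 the language requires that concurrent first calls wait for a local static's initialization to finish, and the guard is what implementations use to provide that. Here is a small sketch that races several threads on the first call and counts how often the initializer runs. The names init_count, once_counted and race_on_static are made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

static std::atomic<int> init_count{0};  // hypothetical counter, for observation only

bool once_counted() {
    // Since C++11, threads that arrive while initialization is in
    // progress wait here until it has completed.
    static bool b = [] {
        init_count.fetch_add(1);
        return true;
    }();
    return b;
}

// Starts thread_count threads that all call once_counted(), then reports
// how many times the initializer actually ran.
int race_on_static(int thread_count) {
    assert(thread_count > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i)
        threads.emplace_back([] { once_counted(); });
    for (auto& t : threads)
        t.join();
    return init_count.load();
}
```

A call such as race_on_static(8) should report a single initialization however the threads interleave. A scheme that patches out a jump would have to give the same guarantee.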

1201ProgramAlarm answered Nov 05 '22 23:11



On most modern operating systems, modifying the code loaded with the program causes problems. It can hurt performance (on some systems, unmodified code pages can be shared between many instances of a DLL), and it can weaken security (by preventing the use of executable space protection technologies).

user1937198 answered Nov 05 '22 22:11
