Is access to a static function variable slower than access to a global variable?

Static local variables are initialised on the first function call:

Variables declared at block scope with the specifier static have static storage duration but are initialized the first time control passes through their declaration (unless their initialization is zero- or constant-initialization, which can be performed before the block is first entered). On all further calls, the declaration is skipped.
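The quoted rule can be observed directly. The following is only a minimal sketch: the counter `constructions`, the type `Tracked` and the helper functions are hypothetical names introduced for illustration, and the constructor is deliberately non-constexpr so that the initialization is dynamic.

```cpp
#include <cassert>

int constructions = 0;  // hypothetical counter, used only for this demonstration

struct Tracked {
    Tracked() { ++constructions; }  // non-constexpr constructor, so initialization is dynamic
};

const Tracked& instance()
{
    static Tracked local{};  // the constructor runs the first time control passes here
    return local;
}

// Small self-check that could be called from main().
void check_first_call_initialization()
{
    assert(constructions == 0);  // nothing is constructed before the first call
    const Tracked& a = instance();
    const Tracked& b = instance();
    assert(&a == &b);            // every call returns the same object
    assert(constructions == 1);  // the declaration was skipped on the second call
}
```

Calling `check_first_call_initialization()` once from `main()` exercises both the first-call path and the skipped-declaration path.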

Also, in C++11 there are even more checks:

If multiple threads attempt to initialize the same static local variable concurrently, the initialization occurs exactly once (similar behavior can be obtained for arbitrary functions with std::call_once). Note: usual implementations of this feature use variants of the double-checked locking pattern, which reduces runtime overhead for already-initialized local statics to a single non-atomic boolean comparison. (since C++11)
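To make the double-checked locking idea concrete, here is a hand-written sketch of the pattern. It is not what any particular compiler emits (real implementations use their own guard variables and runtime helpers); the names `storage`, `instance_ptr`, `init_mutex`, `init_runs` and `f_by_hand` are assumptions made for this illustration, and `A` is a stand-in payload type.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <new>

struct A { int value = 42; };  // hypothetical payload type

alignas(A) unsigned char storage[sizeof(A)];  // raw storage for the lazily built object
std::atomic<A*> instance_ptr{nullptr};        // doubles as the "already initialized" flag
std::mutex init_mutex;                        // only taken on the slow path
int init_runs = 0;                            // counts how often the constructor ran

const A& f_by_hand()
{
    // Fast path: one load and one branch once the object exists.
    A* p = instance_ptr.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(init_mutex);
        // Second check: another thread may have finished while we waited for the lock.
        p = instance_ptr.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = ::new (static_cast<void*>(storage)) A{};
            ++init_runs;
            instance_ptr.store(p, std::memory_order_release);
        }
    }
    return *p;
}

// Small self-check that could be called from main().
void check_single_initialization()
{
    const A& a = f_by_hand();
    const A& b = f_by_hand();
    assert(&a == &b);
    assert(a.value == 42);
    assert(init_runs == 1);
}
```

The point of the pattern is that calls after the first one only pay for the load and the branch at the top of `f_by_hand()`; the mutex is never touched again.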

At the same time, global variables seem to be initialised on program start (though technically only allocation/deallocation is mentioned on cppreference):

static storage duration. The storage for the object is allocated when the program begins and deallocated when the program ends. Only one instance of the object exists. All objects declared at namespace scope (including global namespace) have this storage duration, plus those declared with static or extern.

So given the following example:

struct A {
    // complex type...
};
const A& f()
{
    static A local{};
    return local;
}

A global{};
const A& g()
{
    return global;
}

Am I correct to assume that f() has to check whether its variable has been initialised every time it is called, and that f() will therefore be slower than g()?

335

asked Sep 06 '18 07:09

Dev Null

3 Answers

You are conceptually correct of course, but contemporary architectures can deal with this.

A modern compiler and architecture would arrange the pipeline such that the already-initialised branch was assumed. The overhead of initialisation would therefore incur an extra pipeline dump, that's all.

If you're in any doubt, check the assembly.
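One way to follow this advice is to compile a self-contained variant of the question's example with optimisation and read the output, for instance with `g++ -O2 -S example.cpp` (the file name is hypothetical) or an online compiler explorer. The `created` member and the constructor below are assumptions added only to force dynamic initialization, and `sanity_check()` is a hypothetical helper that is not part of the question.

```cpp
#include <cassert>  // only used by the sanity check at the end
#include <ctime>

struct A {
    std::time_t created;
    A() : created(std::time(nullptr)) {}  // std::time is not constexpr, so initialization is dynamic
};

const A& f()
{
    static A local{};  // function-local static: look for a guard check in the generated code
    return local;
}

A global{};  // namespace-scope object: usually initialized before main() runs

const A& g()
{
    return global;  // typically just returns the object's address
}

// Small self-check that could be called from main().
void sanity_check()
{
    assert(&f() == &f());    // f() always returns the same object
    assert(&g() == &global); // g() returns the global
    assert(&f() != &g());    // the two objects are distinct
}
```

With GCC or Clang on platforms that use the Itanium C++ ABI, one would expect the code for `f()` to contain a load of a guard variable and a branch, with a call to `__cxa_guard_acquire` on the rarely taken path, while `g()` has no such check.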

169

answered Oct 19 '22 21:10

Bathsheba

Yes, it is almost certainly slightly slower. Most of the time, however, it will not matter, and the cost will be outweighed by the "logic and style" benefit.

Technically, a function-local static variable is the same as a global variable. The only differences are that its name is not globally known (which is a good thing), and that its initialization is guaranteed to happen not only at an exactly specified time, but also only once and in a thread-safe manner.

This means that a function-local static variable must know whether initialization has happened, and thus needs at least one extra memory access and one conditional jump that the global (in principle) doesn't need. An implementation may do something similar for globals, but it need not (and usually doesn't).

Chances are good that the jump is predicted correctly in all cases but two. The first two calls are highly likely to be predicted wrong (usually jumps are by default assumed to be taken rather than not, wrong assumption on first call, and subsequent jumps are assumed to take the same path as the last one, again wrong). After that, you should be good to go, near 100% correct prediction.
But even a correctly predicted jump isn't free (the CPU can still only start a given number of instructions every cycle, even assuming they take zero time to complete), but it's not much. If the memory latency, which may be a couple of hundred cycles in the worst case, can be successfully hidden, the cost almost disappears in pipelining. Also, every access fetches an extra cache line that wouldn't otherwise be needed (the has-been-initialized flag likely isn't stored in the same cache line as the data). Thus, you have slightly worse L1 performance (L2 should be big enough so you can say "yeah, so what").

It also needs to actually perform something once, and in a thread-safe manner, that the global (in principle) doesn't have to do, at least not in a way that you see. An implementation can do something different, but most just initialize globals before main is entered, and quite often most of it is done with a memset or implicitly because the variable is stored in a segment that is zeroed anyway.
Your static variable must be initialized when the initialization code is executed, and it must happen in a thread-safe manner. Depending on how much your implementation sucks, this can be quite expensive. I decided to forfeit the thread-safety feature and always compile with -fno-threadsafe-statics (even if this isn't standard-compliant) after discovering that GCC (which is otherwise an OK all-round compiler) would actually lock a mutex for every static initialization.

answered Oct 19 '22 23:10

Damon

From https://en.cppreference.com/w/cpp/language/initialization

Deferred dynamic initialization
It is implementation-defined whether dynamic initialization happens-before the first statement of the main function (for statics) or the initial function of the thread (for thread-locals), or deferred to happen after.

If the initialization of a non-inline variable (since C++17) is deferred to happen after the first statement of main/thread function, it happens before the first odr-use of any variable with static/thread storage duration defined in the same translation unit as the variable to be initialized.

So a similar check may have to be done for global variables as well.

So f() is not necessarily "slower" than g().
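The comparison can also tip the other way when no run-time check is needed at all: as the cppreference text quoted in the question notes, zero- or constant-initialization of a block-scope static can be performed before the block is first entered. A minimal sketch, where `Point`, `origin_offset` and the check function are hypothetical names chosen for illustration:

```cpp
#include <cassert>

// Hypothetical literal type: the constexpr constructor makes constant initialization possible.
struct Point {
    int x;
    int y;
    constexpr Point(int x_, int y_) : x(x_), y(y_) {}
};

// Compile-time evidence that the initializer used below is a constant expression.
static_assert(Point{3, 4}.x == 3, "Point{3, 4} must be usable in a constant expression");

const Point& origin_offset()
{
    // Constant initialization: the object can be set up before the function
    // is ever entered, so no "already initialized?" check is required here.
    static const Point p{3, 4};
    return p;
}

// Small self-check that could be called from main().
void check_constant_initialized_static()
{
    assert(origin_offset().x == 3);
    assert(origin_offset().y == 4);
    assert(&origin_offset() == &origin_offset());  // still a single object
}
```

Whether a guard is actually emitted for a given type is best confirmed in the generated assembly; this sketch only shows a case where the language rules do not require one.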

answered Oct 19 '22 22:10

Jarod42

Tags: c++, global-variables, static