I was solving an exercise online, and at one point I needed to delete the quotation marks ("") from the beginning and end of a string. This was my code:
void static inline process_value(std::string &value) {
    if (value.back() != '>') {
        value = value.substr(1, value.size() - 2);
    }
}
Called from this benchmark loop:
static void UsingStatic(benchmark::State& state) {
    // Code inside this loop is measured repeatedly
    for (auto _ : state) {
        std::string valor("\"Hola\"");
        process_valueS(valor);
        // Make sure the variable is not optimized away by compiler
        benchmark::DoNotOptimize(valor);
    }
}
Just out of curiosity I did a benchmark. While I was at it I decided to remove static from process_value, making it void inline process_value, which was otherwise the same. To my surprise it was slower.
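For anyone who wants to reproduce this, here is a minimal sketch of the two variants side by side. The body is the one shown above; the names process_value_static, process_value_plain, strip_with_static and strip_with_plain are hypothetical and only there so both versions can live in one file. It assumes the string is never empty, since back() on an empty string is undefined.

```cpp
#include <string>

// Variant with internal linkage (static inline), as in the first benchmark.
// Precondition (assumed): value is not empty.
static inline void process_value_static(std::string &value) {
    if (value.back() != '>') {
        // Drop the first and last character, e.g. the surrounding quotes.
        value = value.substr(1, value.size() - 2);
    }
}

// Same body without static: external linkage, still declared inline.
inline void process_value_plain(std::string &value) {
    if (value.back() != '>') {
        value = value.substr(1, value.size() - 2);
    }
}

// Hypothetical helpers that make the result easy to inspect.
std::string strip_with_static(std::string s) {
    process_value_static(s);
    return s;
}

std::string strip_with_plain(std::string s) {
    process_value_plain(s);
    return s;
}
```

Both variants compute the same result, so any timing difference comes from how the compiler treats the call, not from the work done.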
I thought that static only meant that the function was visible in just one file. But here it says that " 'static' means that the function should be inlined by the compiler if possible". In that case, though, I think removing static should not have changed the result. Now I'm confused: what else does static do besides limiting the function to a single .cpp, and how does that affect performance?
The disassembly on QuickBench shows that the NoUsingStatic loop actually calls process_value instead of inlining it, despite the inline keyword making it legal for the compiler to do so. But UsingStatic does inline the call to process_valueS. That difference in compiler decision-making presumably explains the difference in performance, but why would clang choose not to inline a simple function declared void inline process_value(std::string &value){ ... }?
EDIT: Because the question was closed for not being clear enough, I deleted the parts that were not related to the question. If I'm missing some information, please tell me in the comments.
Regarding the static keyword: if it's applied to a global variable, then the variable has file scope (as you've mentioned), provided the file is compiled as a separate compilation unit. So static global variables can even be accessible from other files if those files are compiled together as a single compilation unit.
Static variables are fast to access. Small constant data may not need any allocation at all, as the compiler may use the value directly in code when it needs it. Stack data is allocated directly on the stack when the function is called. This is very fast, but not as fast as pre-allocated static data.
Indexing a static variable ( Math["PI"] ) is far and away the slowest way you can access a field. Avoid it at all costs in performance-critical code.
Clang uses a cost-based decision to determine whether a function will be inlined or not. This cost is affected by a lot of things, and static is one of them.
Fortunately, clang has an output, where we can observe this. Check out this godbolt link:
void call();
inline void a() {
call();
}
static inline void b() {
call();
}
void foo() {
a();
b();
}
In this little example, a() and b() are the same; the only difference is that b() is static.
If you move the mouse over the calls a() or b() on godbolt (in the OptViewer window), you can read:
a(): cost=0, threshold=487
b(): cost=-15000, threshold=487
(clang will inline a call if the cost is less than the threshold.)
clang gave b() a much lower cost because it is static. It seems that clang gives this -15000 cost reduction to a static function only once. If b() is called several times, the cost of all calls to b() will be zero, except one.
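If you would rather check the "only once" behaviour locally than on godbolt, a sketch like the following may help. It relies on clang's optimization remarks (the -Rpass=inline and -Rpass-missed=inline options), which in the clang versions I have tried print the cost and threshold for each call site; the exact numbers and wording vary between versions, so treat this as an experiment rather than a guarantee. The file name and the names calls, once and twice are made up for this example, and call() is given a noinline body here (an assumption, so the file links on its own) instead of being defined elsewhere.

```cpp
// Assumed build line (clang only; the file name is hypothetical):
//   clang++ -O2 -c inline_once.cpp -Rpass=inline -Rpass-missed=inline

int calls = 0;

// noinline keeps call() opaque to the inliner, standing in for the
// externally defined call() of the example above.
__attribute__((noinline)) void call() { ++calls; }

static inline void once() { call(); }   // static, one call site
static inline void twice() { call(); }  // static, two call sites

void foo() {
    once();   // only call to once()
    twice();  // first of two calls to twice()
    twice();  // second of two calls to twice()
}
```

Comparing the remark for once() with the two remarks for twice() should show whether the large cost reduction is applied to every call site or only to one of them.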
Here are the numbers for your case, link:
process_value(): cost=400, threshold=325 -> it is just above the threshold, won't be inlined
process_valueS(): cost=-14600, threshold=325 -> OK to inline
So, apparently, static can have a lot of impact if the function is only called once. This makes sense, because inlining a static function at its only call site doesn't increase code size (the out-of-line copy is no longer needed).
Tip: if you want to force clang to inline a function, use __attribute__((always_inline)) on it.
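As a rough sketch of that tip applied to the function from the question: the attribute spelling below is the GCC/Clang one, and the wrapper unquote is a hypothetical helper added only to show a call site. As in the question, the string is assumed to be non-empty.

```cpp
#include <string>

// always_inline asks GCC/Clang to inline every call regardless of the cost
// model. The function is no longer static, so it keeps external linkage.
__attribute__((always_inline)) inline void process_value(std::string &value) {
    if (value.back() != '>') {
        // Drop the first and last character (the surrounding quotes).
        value = value.substr(1, value.size() - 2);
    }
}

// Hypothetical call site; the body of process_value is expanded here.
std::string unquote(std::string s) {
    process_value(s);
    return s;
}
```

This trades the compiler's judgement for yours, so it is worth benchmarking again after adding it rather than assuming it helps.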