I have the following code:
struct Storage
{
static int GetData()
{
static int global_value;
return global_value++;
}
};
int free_func()
{
static int local_value;
return local_value++;
}
int storage_test_func()
{
return Storage::GetData();
}
Compiling it on OSX:
$ clang++ 1.cpp -shared
And running nm:
$ nm | c++filt
I am getting strange results:
0000000000000f50 T storage_test_func()
0000000000000f30 T free_func()
0000000000000f60 unsigned short Storage::GetData()
0000000000001024 bool free_func()::local_value
0000000000001020 D Storage::GetData()::global_value
U dyld_stub_binder
Two symbols (local_value and global_value) have different linkage! One visible difference is that global_value is defined in a static member function and local_value is defined in a free function.
Can somebody explain why it happens?
UPD:
After reading the comments, it looks like I should clarify things.
Maybe using c++filt was a bad idea. Without it, nm shows:
$ nm
0000000000000f50 T __Z17storage_test_funcv
0000000000000f30 T __Z9free_funcv
0000000000000f60 t __ZN7Storage7GetDataEv
0000000000001024 b __ZZ9free_funcvE11local_value
0000000000001020 D __ZZN7Storage7GetDataEvE5value
U dyld_stub_binder
So yes: __ZZ9free_funcvE11local_value goes to BSS and __ZZN7Storage7GetDataEvE5value goes to the data section.
man nm says that:
If the symbol is local (non-external), the symbol's type is instead represented by the corresponding lowercase letter.
And that is what I see: __ZZ9free_funcvE11local_value is marked with a lowercase b and __ZZN7Storage7GetDataEvE5value is marked with a capital D.
And this is the main part of the question. Why does it happen like that?
UPD2 One more way:
$ clang++ -c -emit-llvm 1.cpp
$ llvm-dis 1.bc
shows how these variables are represented internally:
@_ZZ9free_funcvE11local_value = internal global i32 0, align 4
@_ZZN7Storage7GetDataEvE5value = global i32 0, align 4
UPD3
There were also some concerns about the different sections the symbols belong to. Moving __ZZ9free_funcvE11local_value into the data section (by giving it a non-zero initializer) does not change its visibility:
struct Storage
{
static int GetData()
{
static int value;
return value++;
}
};
int free_func()
{
static int local_value = 123;
return local_value++;
}
int storage_test_func()
{
return Storage::GetData();
}
Compiling:
$ clang++ 1.cpp -shared
Checking:
$ nm
Gives:
0000000000000f50 T __Z17storage_test_funcv
0000000000000f30 T __Z9free_funcv
0000000000000f60 t __ZN7Storage7GetDataEv
0000000000001020 d __ZZ9free_funcvE11local_value
0000000000001024 D __ZZN7Storage7GetDataEvE5value
Now both symbols are in the data section, but one of them is still local and the other is global. So the question remains: why does this happen? Can somebody explain the logic behind the compiler's decision?
A variable declared static within a module (but outside the body of a function) is accessible by all functions within that module. However, it is not accessible by functions from other modules. Static members exist as members of the class rather than as an instance in each object of the class.
Inside a function, a normal variable is destroyed when the function exits. A static variable in a function retains its value even after the function exits.
When a global variable is declared with the static keyword, it is known as a static global variable. It is declared at file scope, and it is visible only within the file (translation unit) that declares it. A function declared with the static keyword is known as a static function and is likewise visible only within its translation unit. The lifetime of a static variable is the whole run of the program.
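The different uses of static described above can be sketched in one small file. This is only an illustration: all names in it (file_counter, bump_file_counter, Widget, plain_local, static_local, check_static_uses) are hypothetical and do not come from the question.

```cpp
#include <cassert>

// File-scope static variable: visible only inside this translation unit.
static int file_counter = 0;

// File-scope static function: also local to this translation unit.
static int bump_file_counter()
{
    return ++file_counter;
}

struct Widget
{
    // Static data member: one object shared by all Widget instances.
    static int instances;
    Widget() { ++instances; }
};
int Widget::instances = 0;

// Ordinary local variable: recreated on every call.
int plain_local()
{
    int n = 0;
    return ++n;
}

// Static local variable: keeps its value between calls.
int static_local()
{
    static int n = 0;
    return ++n;
}

// Small self-check that exercises each case once.
void check_static_uses()
{
    int p1 = plain_local();
    int p2 = plain_local();
    int s1 = static_local();
    int s2 = static_local();
    assert(p1 == 1 && p2 == 1);   // no memory between calls
    assert(s1 == 1 && s2 == 2);   // value is retained
    int f = bump_file_counter();
    assert(f == 1);
}
```

None of this explains the symbol difference in the question by itself; it only shows what each use of the keyword means at the language level.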
There is no difference in a local static variable in static member function, and a local static variable in a free function.
Two symbols (local_value and global_value) have different linkage!
In standard nomenclature, and from the point of view of the standard, both of the variables have no linkage.
The relevant difference between the functions is not that one is a static member function and the other is not. The relevant difference is that the former is an inline function, but the latter is not.
Non-inline functions can only be defined in one translation unit, and therefore their local static variables do not need to be accessible from other translation units.
Inline functions, on the other hand, must be defined in every translation unit that uses them. And, since the local static must refer to the same object globally, that object must be visible to multiple translation units.
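One way to test the inline/non-inline explanation is to vary only that property. In the sketch below, the names InClass, OutOfClass, inline_free_func and check_counters are hypothetical. The expectation that the out-of-class definition yields a local symbol while the two inline functions yield external ones is an assumption to verify with nm on your own toolchain, not a guaranteed result.

```cpp
#include <cassert>

// Defined inside the class body, so implicitly inline (as in the question).
struct InClass
{
    static int GetData()
    {
        static int value;   // must be one object across translation units
        return value++;
    }
};

// Same static member function, but defined outside the class without
// `inline`: an ordinary function that lives in a single translation unit.
struct OutOfClass
{
    static int GetData();
};

int OutOfClass::GetData()
{
    static int value;       // only this translation unit ever needs it
    return value++;
}

// A free function explicitly marked inline, for comparison.
inline int inline_free_func()
{
    static int local_value;
    return local_value++;
}

// The observable behaviour is the same in all three cases; only the
// emitted symbols are expected to differ.
void check_counters()
{
    int a0 = InClass::GetData();
    int a1 = InClass::GetData();
    int b0 = OutOfClass::GetData();
    int b1 = OutOfClass::GetData();
    int c0 = inline_free_func();
    int c1 = inline_free_func();
    assert(a0 == 0 && a1 == 1);
    assert(b0 == 0 && b1 == 1);
    assert(c0 == 0 && c1 == 1);
}
```

Compiling this the same way as in the question and running nm on the result should show whether the local/global split follows inline-ness rather than membership in a class.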
The values will be linked in different storage sections.
0000000000001020 D Storage::GetData()::global_value
The D shows that your variable will be linked into a section that is initialized. From the nm man page:
"D"
"d" The symbol is in the initialized data section.
The local_value will not be initialized via the C startup code.
After linking I got:
0804a0d0 b free_func()::local_value
0804a0d8 u Storage::GetData()::global_value
Also from the nm man page:
"B"
"b" The symbol is in the uninitialized data section (known as BSS).
"U" The symbol is undefined.
"u" The symbol is a unique global symbol. This is a GNU extension to the standard set of ELF symbol bindings. For such a symbol the
dynamic linker will make sure that in the entire process there is just one symbol with this name and type in use.
The reason is simply whether the values are initialized via the standard startup initialization (data section) or not (bss section). The names of the sections are not generally specified, but they are used in common implementations.
You can find many Q&As on when and where each kind of variable is initialized, e.g. "When are static and global variables initialized?"
Clarification: ( I hope )
That a variable is not initialized by the startup code does not mean that it is not initialized. A static variable inside a method/function is typically initialized on the first pass through the containing code block. If you want to know how your compiler does this, have a look at the assembly output.
For gcc you will find some __cxa_ labels around the value initialization, which protect your variables from being initialized more than once.
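The first-use initialization described above can be observed without reading assembly. The sketch below uses hypothetical names (init_calls, make_initial_value, counter, check_first_use_init) and gives the static a run-time initializer. A constant initializer such as the 123 in the question is typically stored directly in the data section instead and needs no guard.

```cpp
#include <cassert>

// Counts how often the initializer actually runs.
static int init_calls = 0;

// Run-time (dynamic) initializer for the static local below.
static int make_initial_value()
{
    ++init_calls;
    return 123;
}

int counter()
{
    // Initialized the first time control passes through this declaration,
    // not by the startup code; later calls skip the initializer.
    static int value = make_initial_value();
    return value++;
}

void check_first_use_init()
{
    assert(init_calls == 0);   // nothing has run before the first call
    int first = counter();
    int second = counter();
    assert(first == 123);
    assert(second == 124);
    assert(init_calls == 1);   // the initializer ran exactly once
}
```

Looking at the assembly generated for counter() is where you would expect to find the guard code mentioned above.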