Where are static local variables stored in memory? Local variables can be accessed only inside the function in which they are declared.
Global static variables go into the .data segment.
If both the name of the static global and static local variable are same, how does the compiler distinguish them?
As mentioned by dasblinken, GCC 4.8 puts local statics on the same place as globals.
More precisely:
static int i = 0
goes on .bss
static int i = 1
goes on .data
Let's analyze one Linux x86-64 ELF example to see it ourselves:
#include <stdio.h>
int f() {
static int i = 1;
i++;
return i;
}
int main() {
printf("%d\n", f());
printf("%d\n", f());
return 0;
}
To reach conclusions, we need to understand the relocation information. If you've never touched that, consider reading this post first.
Compile it:
gcc -ggdb -c main.c
Decompile the code with:
objdump -S main.o
f
contains:
int f() {
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
static int i = 1;
i++;
4: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # a <f+0xa>
a: 83 c0 01 add $0x1,%eax
d: 89 05 00 00 00 00 mov %eax,0x0(%rip) # 13 <f+0x13>
return i;
13: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 19 <f+0x19>
}
19: 5d pop %rbp
1a: c3 retq
Which does 3 accesses to i
:
4
moves to the eax
to prepare for the incrementd
moves the incremented value back to memory13
moves i
to the eax
for the return value. It is obviously unnecessary since eax
already contains it, and -O3
is able to remove that.So let's focus just on 4
:
4: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # a <f+0xa>
Let's look at the relocation data:
readelf -r main.o
which says how the text section addresses will be modified by the linker when it is making the executable.
It contains:
Relocation section '.rela.text' at offset 0x660 contains 9 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000006 000300000002 R_X86_64_PC32 0000000000000000 .data - 4
We look at .rela.text
and not the others because we are interested in relocations of .text
.
Offset 6
falls right into the instruction that starts at byte 4:
4: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # a <f+0xa>
^^
This is offset 6
From our knowledge of x86-64 instruction encoding:
8b 05
is the mov
part00 00 00 00
is the address part, which starts at byte 6
AMD64 System V ABI Update tells us that R_X86_64_PC32
acts on 4 bytes (00 00 00 00
) and calculates the address as:
S + A - P
which means:
S
: the segment pointed to: .data
A
: the Added
: -4
P
: the address of byte 6 when loaded-P
is needed because GCC used RIP
relative addressing, so we must discount the position in .text
-4
is needed because RIP
points to the following instruction at byte 0xA
but P
is byte 0x6
, so we need to discount 4.
Conclusion: after linking it will point to the first byte of the .data
segment.
Static variables go into the same segment as global variables. The only thing that's different between the two is that the compiler "hides" all static variables from the linker: only the names of extern (global) variables get exposed. That is how compilers allow static variables with the same name to exist in different translation units. Names of static variables remain known during the compilation phase, but then their data is placed into the .data
segment anonymously.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With