
Where do static local variables go

Tags: c, static

Where are static local variables stored in memory? Local variables can be accessed only inside the function in which they are declared.

Global static variables go into the .data segment.

If a static global variable and a static local variable have the same name, how does the compiler distinguish them?

Angus asked May 23 '13 00:05


2 Answers

As mentioned by dasblinken, GCC 4.8 puts local statics in the same place as globals.

More precisely:

  • static int i = 0 goes in .bss
  • static int i = 1 goes in .data

Let's analyze a Linux x86-64 ELF example to see this for ourselves:

#include <stdio.h>

int f() {
    static int i = 1;
    i++;
    return i;
}

int main() {
    printf("%d\n", f());
    printf("%d\n", f());
    return 0;
}

To reach conclusions, we need to understand the relocation information. If you've never touched that, consider reading this post first.

Compile it:

gcc -ggdb -c main.c

Disassemble the object file with:

objdump -S main.o

f contains:

int f() {
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
    static int i = 1;
    i++;
   4:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # a <f+0xa>
   a:   83 c0 01                add    $0x1,%eax
   d:   89 05 00 00 00 00       mov    %eax,0x0(%rip)        # 13 <f+0x13>
    return i;
  13:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 19 <f+0x19>
}
  19:   5d                      pop    %rbp
  1a:   c3                      retq   

which accesses i three times:

  • offset 4 loads i into eax to prepare for the increment
  • offset d stores the incremented value back to memory
  • offset 13 loads i into eax again for the return value. This load is redundant, since eax already holds the value, and -O3 removes it.

So let's focus on the instruction at offset 4:

4:  8b 05 00 00 00 00       mov    0x0(%rip),%eax        # a <f+0xa>

Let's look at the relocation data:

readelf -r main.o

which describes how the linker will patch addresses in the text section when it builds the executable.

It contains:

Relocation section '.rela.text' at offset 0x660 contains 9 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000006  000300000002 R_X86_64_PC32     0000000000000000 .data - 4

We look at .rela.text and not the others because we are interested in relocations of .text.

Offset 6 falls right into the instruction that starts at byte 4:

4:  8b 05 00 00 00 00       mov    0x0(%rip),%eax        # a <f+0xa>
          ^^
          This is offset 6

From our knowledge of x86-64 instruction encoding:

  • 8b 05 is the opcode and ModRM byte, which encode a mov into eax from a RIP-relative address
  • 00 00 00 00 is the 32-bit displacement, which starts at byte 6

The AMD64 System V ABI Update tells us that R_X86_64_PC32 acts on 4 bytes (00 00 00 00) and calculates the value to store there as:

S + A - P

which means:

  • S: the value of the symbol the relocation refers to, here the start of the .data section
  • A: the addend: -4
  • P: the address of the relocated bytes (byte 6) once loaded

-P is needed because GCC used RIP-relative addressing, so we must subtract the position within .text.

-4 is needed because RIP points to the following instruction at byte 0xA but P is byte 0x6, so we need to discount 4.

Conclusion: after linking, the instruction will refer to the first byte of the .data section.


Static variables go into the same segment as global variables. The only thing that's different between the two is that the compiler "hides" all static variables from the linker: only the names of extern (global) variables get exposed. That is how compilers allow static variables with the same name to exist in different translation units. Names of static variables remain known during the compilation phase, but then their data is placed into the .data segment anonymously.

Sergey Kalinichenko answered Sep 22 '22 15:09