
Why is the address of static variables relative to the Instruction Pointer?

I am following this tutorial about assembly.

According to the tutorial (which I also tried locally, and got similar results), the following source code:

int natural_generator()
{
        int a = 1;
        static int b = -1;
        b += 1;              /* (1, 2) */
        return a + b;
}

Compiles to these assembly instructions:

$ gdb static
(gdb) break natural_generator
(gdb) run
(gdb) disassemble
Dump of assembler code for function natural_generator:
push   %rbp
mov    %rsp,%rbp
movl   $0x1,-0x4(%rbp)
mov    0x177(%rip),%eax        # (1)
add    $0x1,%eax
mov    %eax,0x16c(%rip)        # (2)
mov    -0x4(%rbp),%eax
add    0x163(%rip),%eax        # 0x100001018 <natural_generator.b>
pop    %rbp
retq   
End of assembler dump.

(Line number comments (1), (2) and (1, 2) added by me.)

Question: why does the compiled code address the static variable b relative to the instruction pointer (RIP), which changes constantly (see lines (1) and (2)) and so makes the assembly more complicated, rather than relative to the section of the executable where such variables are stored?

According to the mentioned tutorial, there is such a section:

This is because the value for b is hardcoded in a different section of the sample executable, and it’s loaded into memory along with all the machine code by the operating system’s loader when the process is launched.

(Emphasis mine.)

Attilio asked Oct 30 '16 12:10


1 Answer

There are two main reasons why RIP-relative addressing is used to access the static variable b. The first is that it makes the code position-independent: if the code ends up in a shared library or a position-independent executable, it can be relocated more easily. The second is that it allows the code to be loaded anywhere in the 64-bit address space without encoding 8-byte (64-bit) addresses in the instruction, which x86-64 doesn't support for general memory operands anyway (only a few special forms of mov accept a 64-bit absolute address). A RIP-relative operand needs only a signed 32-bit displacement from the address of the next instruction.

You mention that the compiler could instead generate code that references the variable relative to the beginning of the section it lives in. While it's true that doing this would have the same advantages as given above, it wouldn't make the assembly any less complicated. In fact it would make it more complicated. The generated assembly code would first have to calculate the address of the section the variable lives in, since it would only know that location relative to the instruction pointer. It would then have to store that address in a register, so that accesses to b (and any other variables in the section) can be made relative to it.

Since 32-bit x86 code doesn't support RIP-relative addressing, your alternative solution is in fact essentially what the compiler does when generating 32-bit position-independent code. It computes the address of the global offset table (GOT) into a register and then accesses b at a fixed offset from that base (the @GOTOFF operands below); b itself still lives in a data section, not in the GOT. Here's the assembly generated by your code when compiled with gcc -m32 -O3 -fPIC -S test.c:

natural_generator:
        call    __x86.get_pc_thunk.cx
        addl    $_GLOBAL_OFFSET_TABLE_, %ecx
        movl    b.1392@GOTOFF(%ecx), %eax
        leal    1(%eax), %edx
        addl    $2, %eax
        movl    %edx, b.1392@GOTOFF(%ecx)
        ret

The initial call to __x86.get_pc_thunk.cx places the address of the instruction that follows it in ECX. The next instruction turns that into the address of the GOT by adding the GOT's offset from that instruction, a constant fixed at link time. The register ECX now contains the address of the GOT and is used as the base when accessing the variable b in the rest of the code.

Compare that to 64-bit code generated by gcc -m64 -O3 -S test.c:

natural_generator:
        movl    b.1745(%rip), %eax
        leal    1(%rax), %edx
        addl    $2, %eax
        movl    %edx, b.1745(%rip)
        ret

(The code is different from the example in your question because optimization is turned on. In general it's a good idea to look only at optimized output, as without optimization the compiler often generates terrible code that does a lot of useless things. Also note that the -fPIC flag isn't needed here: for this function the compiler generates position-independent 64-bit code regardless.)

Notice how there are two fewer assembly instructions in the 64-bit version, making it the less complicated of the two. You can also see that the 32-bit code ties up an extra register (ECX) to hold the GOT address. While that doesn't make much of a difference in your code, in a more complicated example that's a register that could've been used for something else, which makes the 32-bit code even more complicated because the compiler has to do more register juggling.

Ross Ridge answered Dec 04 '22 03:12