I am following this tutorial about assembly.
According to the tutorial (which I also tried locally, and got similar results), the following source code:
int natural_generator() { int a = 1; static int b = -1; b += 1; /* (1, 2) */ return a + b; }
Compiles to these assembly instructions:
$ gdb static (gdb) break natural_generator (gdb) run (gdb) disassemble Dump of assembler code for function natural_generator: push %rbp mov %rsp,%rbp movl $0x1,-0x4(%rbp) mov 0x177(%rip),%eax # (1) add $0x1,%eax mov %eax,0x16c(%rip) # (2) mov -0x4(%rbp),%eax add 0x163(%rip),%eax # 0x100001018 <natural_generator.b> pop %rbp retq End of assembler dump.
(Line number comments (1)
, (2)
and (1, 2)
added by me.)
Question: why is, in the compiled code, the address of the static variable b
relative to the instruction pointer (RIP), which constantly changes (see lines (1)
and (2)
), and thus generates more complicated assembly code, rather than being relative to the specific section of the executable, where such variables are stored?
According to the mentioned tutorial, there is such a section:
This is because the value for
b
is hardcoded in a different section of the sample executable, and it’s loaded into memory along with all the machine code by the operating system’s loader when the process is launched.
(Emphasis mine.)
There are two main reasons why RIP-relative addressing is used to access the static variable b
. The first is that it makes the code position independent, meaning if it's used in a shared library or position independent executable the code can be more easily relocated. The second is that it allows the code to be loaded anywhere in the 64-bit address space without requiring huge 8 byte (64-bit) displacements to be encoded in the instruction, which aren't supported by 64-bit x86 CPUs anyways.
You mention that the compiler could instead generate code that referenced the variable relative to the beginning of the section it lives in. While its true doing this would also have the same advantages as given above, it wouldn't make the assembly any less complicated. In fact it will make it more complicated. The generated assembly code would first have to calculate the address of the section the variable lives in, since it would only know its location relative to the instruction pointer. It would then have to store it in a register, so accesses to b
(and any other variables in the section) can be made relative to that address.
Since 32-bit x86 code doesn't support RIP-relative addressing, your alternate solution is fact what the compiler does when generating 32-bit position independent code. It places the variable b
in the global offset table (GOT), and then accesses the variable relative to the base of the GOT. Here's the assembly generated by your code when compiled with gcc -m32 -O3 -fPIC -S test.c
:
natural_generator:
call __x86.get_pc_thunk.cx
addl $_GLOBAL_OFFSET_TABLE_, %ecx
movl b.1392@GOTOFF(%ecx), %eax
leal 1(%eax), %edx
addl $2, %eax
movl %edx, b.1392@GOTOFF(%ecx)
ret
The first function call places the address of the following instruction in ECX. The next instruction calculates the address of the GOT by adding the relative offset of the GOT from the start of the instruction. The variable ECX now contains the address of the GOT and is used as a base when accessing the variable b
in the rest of the code.
Compare that to 64-bit code generated by gcc -m64 -O3 -S test.c
:
natural_generator:
movl b.1745(%rip), %eax
leal 1(%rax), %edx
addl $2, %eax
movl %edx, b.1745(%rip)
ret
(The code is different than the example in your question because optimization is turned on. In general its a good idea to only look at optimized output, as without optimization the compiler often generates terrible code that does a lot of useless things. Also note that the -fPIC
flag doesn't need to be used, as the compiler generates 64-bit position independent code regardless.)
Notice how there's two fewer assembly instructions in the 64-bit version making it the less complicated version. You can also see that the code uses one less register (ECX). While it doesn't make much of a difference in your code, in a more complicated example that's a register that could've been used for something else. That makes the code even more complicated as the compiler needs to do more juggling of registers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With