
How does a linker work exactly (microcontroller context)?

I've been programming in C and C++ for quite a long time now, so I'm familiar with the linking process as a user: the preprocessor expands all prototypes and macros in each .c file which is then compiled separately into its own object file, and all object files together with static libraries are linked into an executable.

However, I'd like to know more about this process: how does the linker link the object files, and what do they contain anyway? How does it match declared but undefined functions with their definitions in other files? How does it translate all of that into the exact contents of the program memory (context: microcontrollers)?

Application example

Ideally, I'm looking for a detailed step-by-step description of what the process is doing, based on the following simplistic example. Since it doesn't appear to be said anywhere, fame and glory to whoever answers in this way.

main.c

#include "otherfile.h"

int main(void) {
   otherfile_print("Foo");

   return 0;
}

otherfile.h

void otherfile_print(char const *);

otherfile.c

#include "otherfile.h"
#include <stdio.h>

void otherfile_print(char const *str) {
   printf(str);
}
Mister Mystère asked Dec 25 '22 03:12



1 Answer

printf is insanely complicated and a very bad choice for a microcontroller hello-world example. Blinking LEDs are better, but that gets specific to the microcontroller. The following will suffice to demonstrate linking.

two.c

unsigned int glob;
unsigned int two ( unsigned int a, unsigned int b )
{
    glob=5;
    return(a+b+7);
}

one.c

extern unsigned int glob;
unsigned int two ( unsigned int, unsigned int );
unsigned int one ( void )
{
    return(two(5,6)+glob);
}

start.s

.globl _start
_start:
    bl one
    b .

Build everything.

% arm-none-eabi-gcc -Wall -O2 -nostdlib -nostartfiles -ffreestanding -c one.c -o one.o
% arm-none-eabi-gcc -Wall -O2 -nostdlib -nostartfiles -ffreestanding -c two.c -o two.o
% arm-none-eabi-as start.s -o start.o
% arm-none-eabi-ld -Ttext=0x10000000 start.o one.o two.o -o onetwo.elf

Now let's look...

arm-none-eabi-objdump -D start.o
...
00000000 <_start>:
   0:   ebfffffe    bl  0 <one>
   4:   eafffffe    b   4 <_start+0x4>

It is not the compiler's or assembler's job to deal with external references, so the branch-and-link to one is left incomplete. They chose to encode it as a bl of 0, but they could have left it unencoded altogether; it is up to the authors of the toolchain how the compiler, assembler, and linker communicate via object files.

The same goes for one.o:

00000000 <one>:
   0:   e92d4008    push    {r3, lr}
   4:   e3a00005    mov r0, #5
   8:   e3a01006    mov r1, #6
   c:   ebfffffe    bl  0 <two>
  10:   e59f300c    ldr r3, [pc, #12]   ; 24 <one+0x24>
  14:   e5933000    ldr r3, [r3]
  18:   e0800003    add r0, r0, r3
  1c:   e8bd4008    pop {r3, lr}
  20:   e12fff1e    bx  lr
  24:   00000000    andeq   r0, r0, r0

Both the function two and the address of the global variable glob are unknown. Note that for the unknown variable the compiler generates code that loads the explicit address of the global from a nearby word (offset 0x24 above), so the linker simply needs to fill in that address. Also, glob belongs to a data section (.data or .bss), not to .text.

00000000 <two>:
   0:   e59f3010    ldr r3, [pc, #16]   ; 18 <two+0x18>
   4:   e2811007    add r1, r1, #7
   8:   e3a02005    mov r2, #5
   c:   e0810000    add r0, r1, r0
  10:   e5832000    str r2, [r3]
  14:   e12fff1e    bx  lr
  18:   00000000    andeq   r0, r0, r0

Here too the global lives in a data section, not in this code, so the linker will have to place that section and the things in it, and then fill in the addresses.

Here we have linked it all together. The gnu linker expects an entry point label named _start (main is an external address required by the standard bootstrap, which I am not using, so we don't get a main-not-found error). Because I am not using a linker script, the gnu linker places items in the binary in the order they were given on the command line. That is what I want: on a microcontroller I need start.o first, since I am controlling the boot. I also used a non-zero base address here for demonstration purposes...

10000000 <_start>:
10000000:   eb000000    bl  10000008 <one>
10000004:   eafffffe    b   10000004 <_start+0x4>

10000008 <one>:
10000008:   e92d4008    push    {r3, lr}
1000000c:   e3a00005    mov r0, #5
10000010:   e3a01006    mov r1, #6
10000014:   eb000005    bl  10000030 <two>
10000018:   e59f300c    ldr r3, [pc, #12]   ; 1000002c <one+0x24>
1000001c:   e5933000    ldr r3, [r3]
10000020:   e0800003    add r0, r0, r3
10000024:   e8bd4008    pop {r3, lr}
10000028:   e12fff1e    bx  lr
1000002c:   1000804c    andne   r8, r0, ip, asr #32

10000030 <two>:
10000030:   e59f3010    ldr r3, [pc, #16]   ; 10000048 <two+0x18>
10000034:   e2811007    add r1, r1, #7
10000038:   e3a02005    mov r2, #5
1000003c:   e0810000    add r0, r1, r0
10000040:   e5832000    str r2, [r3]
10000044:   e12fff1e    bx  lr
10000048:   1000804c    andne   r8, r0, ip, asr #32

Disassembly of section .bss:

1000804c <__bss_start>:
1000804c:   00000000    andeq   r0, r0, r0

The linker starts by placing the first item, start.o. It figures out roughly how big that needs to be from what is there: those two instructions. They take 8 bytes, so in theory the second item, one.o, goes next at 0x10000008. That means the encoding for the bl one in start.s can be completed with the correct relative address: the target is _start + 8, which is also the value of the pc when the bl executes, so the offset is zero and pc+0 is the encoding.
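The pc+0 arithmetic above can be checked with a few lines of C. This is only a sketch of the A32 bl encoding, not how the gnu linker is written; the function name encode_bl is made up for illustration, and it assumes word-aligned addresses and a target within reach of a single branch.

```c
#include <assert.h>
#include <stdint.h>

/* Encode an ARM (A32) "bl" located at insn_addr that calls target.
 * The low 24 bits hold a signed word offset relative to pc, and pc
 * reads as insn_addr + 8 while the instruction executes. */
uint32_t encode_bl(uint32_t insn_addr, uint32_t target)
{
    /* A32 instructions are word aligned. */
    assert((insn_addr & 3u) == 0 && (target & 3u) == 0);

    int32_t byte_offset = (int32_t)(target - (insn_addr + 8u));
    int32_t word_offset = byte_offset / 4;

    /* This sketch does not handle targets beyond the 24-bit reach. */
    assert(word_offset >= -(1 << 23) && word_offset < (1 << 23));

    /* 0xeb000000 is "bl" with the always condition and a zero offset. */
    return 0xeb000000u | ((uint32_t)word_offset & 0x00ffffffu);
}
```

Calling encode_bl(0x10000000, 0x10000008) gives 0xeb000000, the word at _start in the linked listing, and encode_bl(0x10000014, 0x10000030) gives 0xeb000005, the bl to two inside one().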

Having roughly placed one.o into the binary it is building, the linker has to resolve the addresses of two and of the global, so it has to place two.o and then figure out where that ends in order to place, in this case, .bss rather than .data, since I didn't pre-initialize the variable.
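As a rough sketch of that placement step, assume the linker simply packs the code blobs back to back in command-line order, ignoring alignment and everything a linker script could change. The function name place_in_order is hypothetical, and the sizes used afterwards are read off the disassembly listings above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Place blobs back to back starting at base, in the order given.
 * sizes[i] is the byte size of blob i; addrs[i] receives its address.
 * Returns the first free address after the last blob. */
uint32_t place_in_order(uint32_t base, const uint32_t *sizes,
                        uint32_t *addrs, size_t count)
{
    assert(sizes != NULL && addrs != NULL);

    uint32_t next = base;
    for (size_t i = 0; i < count; i++) {
        addrs[i] = next;   /* this blob starts where the last one ended */
        next += sizes[i];
    }
    return next;
}
```

With base 0x10000000 and sizes 8, 0x28, and 0x1c bytes for start.o, one.o, and two.o, this yields 0x10000000, 0x10000008, and 0x10000030, matching the labels in the linked listing, and the code ends at 0x1000004c.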

The label for two is at 0x10000030, so it encodes the bl two in one() for that relative offset. It has also placed glob at 0x1000804c for some reason (I didn't completely define where ram was, so the gnu linker will do things like this). Whatever the reason, that is where the linker defined the home for that global variable, and the linker fills in that address wherever the address of glob is needed; both one() and two() needed it filled in.
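Filling in the address of glob is the simple case: the linker writes an absolute 32-bit value into a reserved word. A minimal sketch, assuming a little-endian target and using the hypothetical helper names patch_abs32 and read_le32:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value, little-endian, at byte offset off inside blob. */
void patch_abs32(uint8_t *blob, size_t blob_size, size_t off, uint32_t value)
{
    assert(blob != NULL && off + 4 <= blob_size);
    blob[off + 0] = (uint8_t)(value & 0xffu);
    blob[off + 1] = (uint8_t)((value >> 8) & 0xffu);
    blob[off + 2] = (uint8_t)((value >> 16) & 0xffu);
    blob[off + 3] = (uint8_t)((value >> 24) & 0xffu);
}

/* Read back a little-endian 32-bit value from byte offset off. */
uint32_t read_le32(const uint8_t *blob, size_t off)
{
    assert(blob != NULL);
    return (uint32_t)blob[off + 0]
         | ((uint32_t)blob[off + 1] << 8)
         | ((uint32_t)blob[off + 2] << 16)
         | ((uint32_t)blob[off + 3] << 24);
}
```

For the code of one(), the reserved word sits at offset 0x24, so patch_abs32(one_text, 0x28, 0x24, 0x1000804c) on a buffer holding those 0x28 bytes reproduces the 1000804c word shown at 1000002c in the linked listing (one_text is a made-up buffer name).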

So the compiler (assembler) and linker together have to end up with a usable binary. The compiler (assembler) tends to worry about making position-independent machine code and leaving enough information for the linker, so that the linker has the machine code and a list of unresolved externs that it has to fill in. Compilers have improved over time. A simple model would be to have an address location like the one used above for the global variable's address, where the linker computes the absolute address and just fills it in. Clearly, above they did not encode the function calls to one and two in a way that can use an absolute address; instead they use pc-relative addressing. This means that the linker has to know the machine code encoding of the bl instruction. The current generation of the gnu linker knows quite a bit more and can do some cool things resolving arm to thumb and back, stuff it didn't use to know (you don't need to compile for thumb interwork anymore, the linker takes care of it).
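Matching an unresolved extern to its definition is, at its core, a lookup by name. The sketch below uses a plain array and a linear search; struct symbol and resolve_symbol are illustrative names, and real object formats carry much more per symbol (section, size, binding and so on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One defined symbol: a name and the address the linker gave it. */
struct symbol {
    const char *name;
    uint32_t addr;
};

/* Look up name among the defined symbols.
 * Returns 1 and stores the address in *out when found, 0 otherwise. */
int resolve_symbol(const struct symbol *defs, size_t count,
                   const char *name, uint32_t *out)
{
    assert(defs != NULL && name != NULL && out != NULL);

    for (size_t i = 0; i < count; i++) {
        if (strcmp(defs[i].name, name) == 0) {
            *out = defs[i].addr;
            return 1;
        }
    }
    return 0; /* a real linker would report an undefined reference here */
}
```

Given a table holding _start, one, two, and glob with the addresses from the linked listing, looking up "two" yields 0x10000030, while looking up "main" fails because nothing in this example defines it.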

So the linker takes binary blobs, including data, and...links them together into one binary. It first needs to know the actual addresses for the various things in the binary. How you tell the linker this is linker specific and not a global thing for all C/C++ toolchains; gnu linker scripts are a programming language in and of themselves. These are not necessarily physical nor virtual addresses; it is simply the address space of the code in whatever mode it runs in (virtual or physical).

Once the linker knows the addresses, it starts placing these various binary blobs into those address spaces based on linker rules (again linker specific). Then it goes through and resolves the external/global addresses.

It was not the case above, but this can be an iterative process. If, for example, the function two() were at an address that cannot be reached with a single pc-relative instruction (say we put one near zero and two near 0xF0000000), then those who wrote the linker have two choices. The simple choice is to state that it cannot encode that far of a branch and bail out; the gnu linker did or still does do that. The other solution is for the linker to fix the problem. The linker could add a few words of data within the range of the pc-relative branch link, and those few words are a trampoline: for example, an absolute address that is loaded into a register followed by a register-based branch, or perhaps, if clever, a pc-relative branch if the trampoline is within range (in the case of 0x10000000 to 0xF0000000 that wouldn't work).

If the linker has to add these few words, then some of the binary blobs may have to move to make room for them, and now all of the addresses in those binary blobs have to move as well. So you have to make another pass across all the binary blobs, resolving all of the new addresses, filling in the answers, and, for pc-relative references, determining whether you can still reach everything. Adding those few words might have made something that was reachable with a pc-relative instruction unreachable, and that now requires a solution (error or patch).
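The "can I still reach everything" check can be sketched as a range test on the branch offset. This assumes the A32 b/bl encoding with its 24-bit signed word offset and treats addresses as plain numbers without wraparound; branch_in_range is a made-up name, not a gnu linker function.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if an A32 b/bl at insn_addr can reach target directly,
 * 0 if the linker would need a trampoline (or would have to give up). */
int branch_in_range(uint32_t insn_addr, uint32_t target)
{
    assert((insn_addr & 3u) == 0 && (target & 3u) == 0);

    /* Signed distance from pc (insn_addr + 8), computed in 64 bits so
     * that far-apart addresses do not wrap around. */
    int64_t offset = (int64_t)target - ((int64_t)insn_addr + 8);

    /* A 24-bit signed word offset covers -2^25 .. 2^25 - 4 bytes. */
    return offset >= -(INT64_C(1) << 25) &&
           offset <= (INT64_C(1) << 25) - 4;
}
```

The bl from 0x10000014 to 0x10000030 passes this test, while a branch from 0x10000000 to 0xF0000000 does not, which is the situation where the linker has to error out or insert a trampoline. An iterative linker would rerun a check like this over every pc-relative reference after each change in layout.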

The assembler itself, for a single source file, has to go through even more of these gyrations, especially for a variable-length instruction set like x86 where the addressing is a bit vague. I recommend trying for yourself to make a simple assembler that supports only a few instructions, some of them branches: parse and encode the instructions and compare the result to an existing, debugged assembler like the gnu assembler.

test.s

   ldr r1,locdat
   nop
   nop
   nop
   nop
   nop
   b over
locdat: .word 0x12345678
top:
    nop
    nop
    nop
    nop
    nop
    nop
over:
    b top

The right answer is

00000000 <locdat-0x1c>:
   0:   e59f1014    ldr r1, [pc, #20]   ; 1c <locdat>
   4:   e1a00000    nop         ; (mov r0, r0)
   8:   e1a00000    nop         ; (mov r0, r0)
   c:   e1a00000    nop         ; (mov r0, r0)
  10:   e1a00000    nop         ; (mov r0, r0)
  14:   e1a00000    nop         ; (mov r0, r0)
  18:   ea000006    b   38 <over>

0000001c <locdat>:
  1c:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000

00000020 <top>:
  20:   e1a00000    nop         ; (mov r0, r0)
  24:   e1a00000    nop         ; (mov r0, r0)
  28:   e1a00000    nop         ; (mov r0, r0)
  2c:   e1a00000    nop         ; (mov r0, r0)
  30:   e1a00000    nop         ; (mov r0, r0)
  34:   e1a00000    nop         ; (mov r0, r0)

00000038 <over>:
  38:   eafffff8    b   20 <top>
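The listing above can be reproduced with a very small sketch of such an assembler. It skips parsing entirely and takes test.s as an already tokenized array, and it supports only the four things test.s uses. All names here (struct item, label_addr, assemble) are made up for illustration, and the label lookup simply rescans the program, standing in for a proper first pass that builds a label table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum kind { LABEL, NOP, WORD, B, LDR_R1_LIT };

struct item {
    enum kind kind;
    const char *label; /* defined (LABEL) or referenced (B, LDR_R1_LIT) */
    uint32_t value;    /* data for WORD */
};

/* "Pass 1": address of a label, counting 4 bytes per non-label item. */
static uint32_t label_addr(const struct item *prog, size_t n, const char *name)
{
    uint32_t addr = 0;
    for (size_t i = 0; i < n; i++) {
        if (prog[i].kind == LABEL) {
            if (strcmp(prog[i].label, name) == 0)
                return addr;
        } else {
            addr += 4;
        }
    }
    assert(!"undefined label");
    return 0;
}

/* "Pass 2": emit one 32-bit word per non-label item; returns the count. */
size_t assemble(const struct item *prog, size_t n, uint32_t *out)
{
    size_t words = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t pc = (uint32_t)(words * 4) + 8; /* pc reads 8 bytes ahead */
        switch (prog[i].kind) {
        case LABEL:
            continue; /* labels take no space */
        case NOP:
            out[words] = 0xe1a00000u; /* mov r0, r0 */
            break;
        case WORD:
            out[words] = prog[i].value;
            break;
        case B: {
            /* Signed word offset from pc, truncated to 24 bits. */
            int32_t off = (int32_t)(label_addr(prog, n, prog[i].label) - pc) / 4;
            out[words] = 0xea000000u | ((uint32_t)off & 0x00ffffffu);
            break;
        }
        case LDR_R1_LIT: {
            /* Only forward references within 4095 bytes are handled. */
            uint32_t off = label_addr(prog, n, prog[i].label) - pc;
            assert(off <= 0xfffu);
            out[words] = 0xe59f1000u | off;
            break;
        }
        }
        words++;
    }
    return words;
}
```

Fed the items of test.s in order, this produces 15 words, including e59f1014 for the ldr, ea000006 for b over, and eafffff8 for b top, which can be compared against the gnu assembler output above.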

There are parallels between that activity and the job of a linker. You could also fashion a simple linker based on the above files or something similar: extract the binary blobs and other info and start placing them in whatever address space you want.

Either one is a fairly simple programming task, yet fairly educational. With an existing toolchain that can produce the answer, you can figure out where you are going wrong or how to get to the right answer.

old_timer answered Dec 27 '22 17:12
