To study how a program is loaded and run on Linux, I wrote the simplest possible C program, in a file named simple.c.
int main(){}
Next, I compiled it and saved the disassembly of the resulting executable as a text file.
$gcc ./simple.c
$objdump -xD ./a.out > simple.text
From many articles online, I gathered that gcc arranges for startup functions such as _start, _init, and __libc_start_main@plt to be linked in and resolved dynamically. So I started reading the disassembly, with help from http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html .
Here is part of the disassembly.
080482e0 <__libc_start_main@plt>:
80482e0: ff 25 10 a0 04 08 jmp *0x804a010
80482e6: 68 08 00 00 00 push $0x8
80482eb: e9 d0 ff ff ff jmp 80482c0 <_init+0x2c>
Disassembly of section .text:
080482f0 <_start>:
80482f0: 31 ed xor %ebp,%ebp
80482f2: 5e pop %esi
80482f3: 89 e1 mov %esp,%ecx
80482f5: 83 e4 f0 and $0xfffffff0,%esp
80482f8: 50 push %eax
80482f9: 54 push %esp
80482fa: 52 push %edx
80482fb: 68 70 84 04 08 push $0x8048470
8048300: 68 00 84 04 08 push $0x8048400
8048305: 51 push %ecx
8048306: 56 push %esi
8048307: 68 ed 83 04 08 push $0x80483ed
804830c: e8 cf ff ff ff call 80482e0 <__libc_start_main@plt>
8048311: f4 hlt
8048312: 66 90 xchg %ax,%ax
8048314: 66 90 xchg %ax,%ax
8048316: 66 90 xchg %ax,%ax
8048318: 66 90 xchg %ax,%ax
804831a: 66 90 xchg %ax,%ax
804831c: 66 90 xchg %ax,%ax
804831e: 66 90 xchg %ax,%ax
080483ed <main>:
80483ed: 55 push %ebp
80483ee: 89 e5 mov %esp,%ebp
80483f0: b8 00 00 00 00 mov $0x0,%eax
80483f5: 5d pop %ebp
80483f6: c3 ret
80483f7: 66 90 xchg %ax,%ax
80483f9: 66 90 xchg %ax,%ax
80483fb: 66 90 xchg %ax,%ax
80483fd: 66 90 xchg %ax,%ax
80483ff: 90 nop
...
Disassembly of section .got:
08049ffc <.got>:
8049ffc: 00 00 add %al,(%eax)
...
Disassembly of section .got.plt:
0804a000 <_GLOBAL_OFFSET_TABLE_>:
804a000: 14 9f adc $0x9f,%al
804a002: 04 08 add $0x8,%al
...
804a00c: d6 (bad)
804a00d: 82 (bad)
804a00e: 04 08 add $0x8,%al
804a010: e6 82 out %al,$0x82
804a012: 04 08 add $0x8,%al
My question is:
At 0x804830c, 0x80482e0 is called (I already understand the preceding instructions).
At 0x80482e0, the process jumps to 0x804a010.
At 0x804a010, the instruction is < out %al,$0x82 >
...wait, just out? What was in %al, and where is 0x82? I got stuck on this line.
Please help.
*p.s. I'm a beginner with Linux and operating systems. I'm studying operating system concepts in a class at school, but I still haven't found a good way to learn Linux assembly language. I've downloaded the Intel processor manual, but it is too large to read through. Can anyone recommend good material? Thanks.
The __libc_start_main() function shall perform any necessary initialization of the execution environment, call the main function with appropriate arguments, and handle the return from main(). If the main() function returns, the return value shall be passed to the exit() function.
Both __libc_csu_init and call_init do basically the same thing: They run all constructors registered in the dynamic table entries INIT and INIT_ARRAY .
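As a rough illustration of the constructors mentioned above: with GCC or Clang, a function marked with the constructor attribute is typically registered so that the startup code runs it before main. This is a minimal sketch that relies on that compiler extension; early_setup, constructor_ran, and ran_before_main are made-up names for illustration.

```c
/* Flag set by the constructor; it is still 0 if the constructor never ran. */
static int constructor_ran = 0;

/* GCC/Clang extension (an assumption about the toolchain):
 * ask the startup code to call this function before main(). */
__attribute__((constructor))
static void early_setup(void)
{
    constructor_ran = 1;
}

/* Helper to query the flag from main() or from a test. */
static int ran_before_main(void)
{
    return constructor_ran;
}
```

Calling ran_before_main() at the top of main should report 1 on toolchains that honor the attribute, since the startup code has already run the constructor by then.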
The _start function is defined in the sysdeps/x86_64/start.S assembly file and does preparation, such as getting argc/argv from the stack and setting up the stack, before the __libc_start_main function is called. The __libc_start_main function from the csu/libc-start.
80482e0: ff 25 10 a0 04 08 jmp *0x804a010
This means "retrieve the 4-byte address stored at 0x804a010 and jump to it."
804a010: e6 82 out %al,$0x82
804a012: 04 08 add $0x8,%al
Those 4 bytes will be treated as an address, 0x80482e6, not as instructions.
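The address comes out as 0x80482e6 because x86 stores multi-byte values in little-endian order, so the bytes e6 82 04 08 are read from least significant to most significant. A minimal sketch of that decoding in C, where read_le32 and got_slot are illustrative names rather than anything from glibc or objdump:

```c
#include <stdint.h>

/* Combine four bytes stored in little-endian order into one 32-bit value. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* The four bytes objdump printed at 0x804a010, in memory order. */
static const unsigned char got_slot[4] = { 0xe6, 0x82, 0x04, 0x08 };
```

Passing got_slot to read_le32 yields 0x080482e6, which is the value the jmp *0x804a010 instruction uses as its target. objdump only printed those bytes as out and add because -D disassembles data sections as if they were code.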
80482e0: ff 25 10 a0 04 08 jmp *0x804a010
80482e6: 68 08 00 00 00 push $0x8
80482eb: e9 d0 ff ff ff jmp 80482c0 <_init+0x2c>
So we've just executed an instruction that has moved us exactly one instruction forward. At this point, you're probably wondering if there's a good reason for this.
There is. This is a typical PLT/GOT implementation. Much more detail, including a diagram, is at Position Independent Code in shared libraries: The Procedure Linkage Table.
The real code for __libc_start_main is in a shared library, glibc. The compiler and compile-time linker don't know where the code will be at run-time, so they place in your compiled program a short __libc_start_main function (the PLT entry listed above) which contains just three instructions: an indirect jump through a GOT entry, a push of $0x8, and a jump toward the resolver code.
The first time you call __libc_start_main, the resolver code will run. It will find the actual location of __libc_start_main in a shared library and will patch the GOT entry at 0x804a010 to be that address. If your program calls __libc_start_main again, the jmp *0x804a010 instruction will take the program directly to the code in the shared library.
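The patch-on-first-call idea can be imitated in plain C with a function pointer standing in for the GOT slot. This is only a sketch of the control flow, not how the dynamic linker is implemented; real_function, resolver_stub, got_slot, call_via_plt, and resolver_runs are all hypothetical names.

```c
typedef int (*func_t)(int);

/* Counts how many times the slow (resolver) path was taken. */
static int resolver_runs = 0;

/* Stand-in for the real function that lives in the shared library. */
static int real_function(int x)
{
    return x + 1;
}

static int resolver_stub(int x);

/* Stand-in for one GOT slot: it initially points back into the stub,
 * just as 0x804a010 initially holds 0x80482e6. */
static func_t got_slot = resolver_stub;

/* Stand-in for the push/jmp half of the PLT entry plus the resolver:
 * look up the real target, patch the slot, then complete the call. */
static int resolver_stub(int x)
{
    resolver_runs++;
    got_slot = real_function;
    return got_slot(x);
}

/* Stand-in for the first PLT instruction: an indirect jump through the slot. */
static int call_via_plt(int x)
{
    return got_slot(x);
}
```

The first call_via_plt goes through resolver_stub and patches got_slot; every later call reaches real_function directly, so resolver_runs stays at 1.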
Can anyone recommend good material?
The x86 Assembly book at Wikibooks might be one place to start.