 

What is global _start in assembly language?

This is my assembly level code ...

    section .text
            global _start
    _start: mov eax, 4              ; syscall 4 = sys_write
            mov ebx, 1              ; fd 1 = stdout
            mov ecx, mesg           ; buffer address
            mov edx, size           ; buffer length
            int 0x80
    exit:   mov eax, 1              ; syscall 1 = sys_exit
            int 0x80

    section .data
    mesg    db      'KingKong',0xa
    size    equ     $-mesg

Output:

    root@bt:~/Arena# nasm -f elf a.asm -o a.o
    root@bt:~/Arena# ld -o out a.o
    root@bt:~/Arena# ./out
    KingKong

My question is: what is global _start used for? From searching, I found that it is used to tell the starting point of my program. Why can't we just have the _start label on its own to mark where the program starts, as in the version below, which produces a warning on the screen?

    section .text
    _start: mov eax, 4
            mov ebx, 1
            mov ecx, mesg
            mov edx, size
            int 0x80
    exit:   mov eax, 1
            int 0x80

    section .data
    mesg    db      'KingKong',0xa
    size    equ     $-mesg

    root@bt:~/Arena# nasm -f elf a.asm
    root@bt:~/Arena# ld -e _start -o out a.o
    ld: warning: cannot find entry symbol _start; defaulting to 0000000008048080
    root@bt:~/Arena# ld -o out a.o
    ld: warning: cannot find entry symbol _start; defaulting to 0000000008048080

Asked Jul 27 '13 at 14:07 by vikkyhacks



2 Answers

The global directive, as spelled here, is NASM syntax (GAS has the equivalent .global). It exports a symbol from your code: the symbol's name and the location it points to are recorded in the generated object code. Here you mark the _start symbol as global, so its name is added to the object file (a.o). The linker (ld) can then read that symbol and its value from the object file, so it knows which address to mark as the entry point in the output executable. When you run the executable, execution starts at the location labelled _start in the code.

If the global directive is missing for a symbol, that symbol is not placed in the object file's export table, so the linker has no way of knowing about it.

If you want to use an entry point name other than _start (which is the default), you can pass the -e option to ld:

ld -e my_entry_point -o out a.o 
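The lookup described in this answer can be modelled in a few lines of Python. This is only a toy sketch, not how ld is implemented: the function name resolve_entry, the (address, is_global) tuple layout, and the text_start parameter are all hypothetical, and the fallback address is simply the one printed in the question's warning.

```python
def resolve_entry(symbols, entry="_start", text_start=0x08048080):
    """Toy model of entry-point lookup (hypothetical, not ld's real code).

    symbols maps a symbol name to (address, is_global).
    Only symbols marked global are visible to the lookup.
    Returns (entry_address, warning_or_None).
    """
    info = symbols.get(entry)
    if info is not None and info[1]:
        return info[0], None
    # Fall back to a default address and warn, like the output in the question.
    warning = "warning: cannot find entry symbol %s; defaulting to %016x" % (
        entry,
        text_start,
    )
    return text_start, warning


if __name__ == "__main__":
    with_global = {"_start": (0x08048080, True), "exit": (0x08048096, False)}
    without_global = {"_start": (0x08048080, False), "exit": (0x08048096, False)}
    print(resolve_entry(with_global))
    print(resolve_entry(without_global))
```

In the question's case the fallback address is the same place where _start happens to sit, which is presumably why the program still runs despite the warning.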
Answered Oct 04 '22 at 22:10 by Sedat Kapanoglu


_start is used by the default Binutils' ld linker script as the entry point

We can see the relevant part of that linker script with:

 ld -verbose a.o | grep ENTRY 

which outputs:

ENTRY(_start) 

The ELF file format (and other object formats, I suppose) explicitly states the address at which the program will start running, through the e_entry header field.

ENTRY(_start) tells the linker to set that field to the address of the symbol _start when generating the ELF file from object files.

Then when the OS starts running the program (exec system call on Linux), it parses the ELF file, loads the executable code into memory, and sets the instruction pointer to the specified address.
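To make the e_entry field concrete, here is a minimal Python sketch that reads it from the raw bytes of an ELF header. The function names read_elf_entry and fake_elf_header are hypothetical, and the demo builds a stripped-down header in memory rather than reading a real executable; on a real file you would pass open("out", "rb").read() instead.

```python
import struct


def read_elf_entry(data):
    """Return the e_entry field from the raw bytes of an ELF file."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = data[4]  # 1 = ELF32, 2 = ELF64
    ei_data = data[5]   # 1 = little-endian, 2 = big-endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported ELF class or byte order")
    endian = "<" if ei_data == 1 else ">"
    # e_entry sits at offset 24: after e_ident (16 bytes), e_type (2),
    # e_machine (2) and e_version (4). It is 4 bytes wide in ELF32, 8 in ELF64.
    width = "I" if ei_class == 1 else "Q"
    (entry,) = struct.unpack_from(endian + width, data, 24)
    return entry


def fake_elf_header(entry, bits=32):
    """Build just the first header fields of a little-endian ELF (demo helper)."""
    ei_class = 1 if bits == 32 else 2
    e_ident = b"\x7fELF" + bytes([ei_class, 1, 1]) + bytes(9)  # 16 bytes total
    e_machine = 3 if bits == 32 else 62  # EM_386 or EM_X86_64
    width = "I" if bits == 32 else "Q"
    # e_type = 2 (ET_EXEC), e_version = 1
    return e_ident + struct.pack("<HHI" + width, 2, e_machine, 1, entry)


if __name__ == "__main__":
    print(hex(read_elf_entry(fake_elf_header(0x08048080))))
    print(hex(read_elf_entry(fake_elf_header(0x400078, bits=64))))
```

The value read this way is the address the instruction pointer is set to once the loader has mapped the program.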

The -e flag mentioned by Sedat overrides the default _start symbol.

You can also replace the entire default linker script with the -T <script> option, which is how some bare-metal assembly setups are built.

.global is an assembler directive that marks the symbol as global in the ELF file

The ELF file contains some metadata for every symbol, indicating its visibility.

The easiest way to observe this is with the nm tool.

For example in a Linux x86_64 GAS freestanding hello world:

main.S

    .text
    .global _start
    _start:
    asm_main_after_prologue:
        /* write */
        mov $1, %rax   /* syscall number */
        mov $1, %rdi   /* stdout */
        lea msg(%rip), %rsi  /* buffer */
        mov $len, %rdx /* len */
        syscall

        /* exit */
        mov $60, %rax   /* syscall number */
        mov $0, %rdi    /* exit status */
        syscall
    msg:
        .ascii "hello\n"
        len = . - msg

Compile and run:

    gcc -ffreestanding -static -nostdlib -o main.out main.S
    ./main.out

nm gives:

    00000000006000ac T __bss_start
    00000000006000ac T _edata
    00000000006000b0 T _end
    0000000000400078 T _start
    0000000000400078 t asm_main_after_prologue
    0000000000000006 a len
    00000000004000a6 t msg

and man nm tells us that:

If lowercase, the symbol is usually local; if uppercase, the symbol is global (external).

so we see that _start is visible externally (upper case T), but msg, which we didn't mark as .global, isn't (lower case t).
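The upper/lower case rule can be applied mechanically. The following Python sketch classifies the nm listing above; the function name classify_nm is hypothetical, and the rule is deliberately simplified, since the man page only says lowercase is "usually" local and the sketch ignores any exceptions.

```python
NM_OUTPUT = """\
00000000006000ac T __bss_start
00000000006000ac T _edata
00000000006000b0 T _end
0000000000400078 T _start
0000000000400078 t asm_main_after_prologue
0000000000000006 a len
00000000004000a6 t msg
"""


def classify_nm(output):
    """Map each symbol name to 'global' or 'local' by the case of its type letter.

    Simplified: treats every lowercase letter as local, which the nm man page
    says is only usually true.
    """
    result = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and entries without an address column
        _address, kind, name = parts
        result[name] = "global" if kind.isupper() else "local"
    return result


if __name__ == "__main__":
    for name, visibility in classify_nm(NM_OUTPUT).items():
        print(name, visibility)
```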

The linker then knows how to blow up if multiple global symbols with the same name are seen, or do smarter things if more exotic symbol types are seen.
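As a rough illustration of that "blow up" behaviour, here is a toy Python model of merging symbol tables from several object files. It is an assumption-laden sketch, not ld's algorithm: link_symbols, the table layout, and the error text are all made up for the example, and exotic symbol types are not modelled at all.

```python
def link_symbols(objects):
    """Toy merge of per-object symbol tables (hypothetical model).

    objects maps an object-file name to {symbol: (address, is_global)}.
    Local symbols stay private to their object file, so the same local name
    may appear in several objects. A global name defined twice is an error.
    Returns {symbol: (object_name, address)} for all global symbols.
    """
    merged = {}
    for obj_name, table in objects.items():
        for symbol, (address, is_global) in table.items():
            if not is_global:
                continue  # locals never take part in cross-object resolution
            if symbol in merged:
                raise ValueError(
                    "multiple definition of %s (in %s and %s)"
                    % (symbol, merged[symbol][0], obj_name)
                )
            merged[symbol] = (obj_name, address)
    return merged


if __name__ == "__main__":
    ok = {
        "a.o": {"_start": (0x400078, True), "msg": (0x4000A6, False)},
        "b.o": {"helper": (0x400100, True), "msg": (0x400120, False)},
    }
    print(link_symbols(ok))
```

Two objects can both carry a local msg without conflict in this model, while a second global _start would raise.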

If we don't mark _start as global, ld becomes sad and says:

cannot find entry symbol _start