I'm learning asm on Linux (noobuntu 10.04). I got the following code from: http://asm.sourceforge.net/intro/hello.html
section .text
global _start ;must be declared for linker (ld)
_start: ;tell linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db 'Hello, world!',0xa ;our dear string
len equ $ - msg ;length of our dear string
It's a simple hello world. It runs on Linux and calls the kernel directly (apparently). Can anyone please explain what is really going on here? I think it reads the integers in the eax and ebx registers and the data in ecx and edx, and that defines the system call when the kernel is called. If so, do different combinations of integers define different system calls when int 0x80 is called?
I'm not good with man pages, but I have read every related one I can find. Does any man page tell me which combinations define which syscalls?
ANY help is appreciated. A line by line explanation would be amazing... -Thanks in advance Jeremy
Example of assembly language: "EAX," "EBX" and "ECX" are registers that act as the variables. The first line of code loads "3" into the register "eax." The second line of code loads "4" into the register "ebx." Finally, the last line of code adds "eax" and "ebx" and stores the result of the addition, which is seven, in "ecx."
When you call int 0x80, the kernel looks at the value of the eax register to determine the function you want to call (this is the "syscall number"). Depending on that number, the rest of the registers are interpreted to mean specific things. The sys_write call expects the registers to be set up as follows:
eax contains 4
ebx contains the file descriptor
ecx contains the address of the data to write
edx contains the number of bytes
For further extensive information, see Linux System Calls.
section .text
global _start ;must be declared for linker (ld)
This is just header material. The "text" section of an assembly program holds the machine instructions (versus the data, read-only data, and BSS sections). The global line is akin to saying that the _start function is "public."
_start: ;tell linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
From the comments we know that we are looking at the sys_write function, so we can man 2 write to get the details. The C prototype gives the following parameters: fd, *buf, and count. Starting with ebx we see that those match (ebx = fd, ecx = string to write, and edx = length of string). Then, since we are a user process, we must ask the kernel to perform the output. This is done through the system call interface, and the write() function is (apparently) given the number 4. int 0x80 is a software interrupt that calls the Linux kernel's system call routine.
You can find the actual numbers of all the syscalls in the Linux header files (assuming you have them installed). On my system, I checked /usr/include/sys/syscall.h, which led to /usr/include/asm/unistd.h and then on to /usr/include/asm-i386/unistd.h, where I see #define __NR_write 4.
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
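As with the last two lines of the previous segment, this just loads the syscall id and does the software interrupt to exit the program (remove its memory mapping and clean up). sys_exit takes the exit status in ebx; the code never changes ebx after the write, so it still holds 1 and the program exits with status 1.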
section .data
msg db 'Hello, world!',0xa ;our dear string
len equ $ - msg ;length of our dear string
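This is the data section; it just defines the data used in our program. msg is the string's bytes followed by 0xa (a newline character). len is not stored in memory: equ defines an assembly-time constant, and $ - msg is the current address minus the address of msg, which is the length of the string.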