Hello all.
So I'm learning assembly.
And as per my usual learning steps with any new language I pick up I've arrived at networking with assembly.
Which, sadly, isn't going that well, as I've pretty much failed at step 0: getting a socket through which communication can begin.
The assembly code should be roughly equal to the following C code:
#include <stdio.h>
#include <sys/socket.h>
int main(){
    int sock;
    sock = socket(AF_INET, SOCK_STREAM, 0);
}
(Let's ignore the fact that it's not closing the socket for now.)
So here's what I did thus far:
I found socketcall(), which is all well and good. The problem is that it needs an int that describes what sort of socket call it should make. The call's manpage isn't helping much with this either, as it only says: "On some architectures, for example x86-64 and ARM, there is no socketcall() system call; instead socket(2), accept(2), bind(2), and so on really are implemented as separate system calls."
Yet there are no such calls in the original list of syscalls, and as far as I know socket(), accept(), bind(), listen(), etc. are calls from libnet and not from the kernel. This got me utterly confused, so I decided to compile the above C code and check up on it with strace. This yielded the following:
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
While that didn't get me any closer to knowing what socket() is, it did explain its arguments, for which I can't seem to find the proper documentation (again). I thought that PF_INET, SOCK_STREAM, and IPPROTO_IP would be defined in <sys/socket.h>, but my grep-ing for them didn't find anything of use. So I decided to just wing it by using gdb in tandem with disass main to find the values. This gave the following output:
Dump of assembler code for function main:
0x00000000004004fd <+0>: push rbp
0x00000000004004fe <+1>: mov rbp,rsp
0x0000000000400501 <+4>: sub rsp,0x10
0x0000000000400505 <+8>: mov edx,0x0
0x000000000040050a <+13>: mov esi,0x1
0x000000000040050f <+18>: mov edi,0x2
0x0000000000400514 <+23>: call 0x400400
0x0000000000400519 <+28>: mov DWORD PTR [rbp-0x4],eax
0x000000000040051c <+31>: leave
0x000000000040051d <+32>: ret
End of assembler dump.
In my experience this would imply that socket() gets its parameters from EDI (PF_INET), ESI (SOCK_STREAM), and EDX (IPPROTO_IP). Which would be odd for a syscall (as the convention with Linux syscalls would be to use EAX/RAX for the call number and other registers for the parameters in increasing order, e.g. RBX, RCX, RDX...). The fact that this is being CALL-ed and not INT 0x80'd would also imply that this is not in fact a system call but rather something that's being called from a shared object. Or something.
But then again, passing arguments in registers is very odd for something that's CALL-ed. Normally, as far as I know, arguments for called things should be PUSH-ed onto the stack, as the compiler can't know what registers they would try to use.
This behavior becomes even more curious when checking the produced binary with ldd:
linux-vdso.so.1 (0x00007fff4a7fc000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f56b0c61000)
/lib64/ld-linux-x86-64.so.2 (0x00007f56b1037000)
There appear to be no networking libraries linked.
And that's the point where I've run out of ideas.
So I'm asking for the following:
- A list of the x86-64 Linux kernel's actual syscalls and their associated numbers. (Preferably as a header file for C.)
- The definitions of PF_INET, SOCK_STREAM, and IPPROTO_IP, as it really bugs me that I wasn't able to find them on my own system.
- Material on networking in assembly for x86-64 Linux. (For x86-32 it's easy to find material, but for some reason I came up empty with the 64-bit stuff.)
Thanks!
The 64-bit calling convention does use registers to pass arguments, both in user space and to system calls. As you have seen, the user space convention is rdi, rsi, rdx, rcx, r8, r9. For system calls, r10 is used instead of rcx, which is clobbered by the syscall instruction. See Wikipedia or the ABI documentation for more details.
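To make the register convention concrete, here is a minimal sketch of the socket call issued directly through the syscall instruction, written as GCC/Clang inline assembly inside C. It assumes x86-64 Linux; the helper names raw_syscall3 and raw_socket are made up for illustration, and SYS_socket is taken from <sys/syscall.h> rather than hard-coded.

```c
#include <sys/socket.h>   /* AF_INET, SOCK_STREAM */
#include <sys/syscall.h>  /* SYS_socket, SYS_getpid */
#include <unistd.h>       /* close, getpid */

/* Hypothetical helper: issue a three-argument system call on x86-64 Linux.
 * rax carries the call number in and the result out; the arguments go in
 * rdi, rsi, rdx. The syscall instruction itself clobbers rcx and r11. */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3)
        : "rcx", "r11", "memory"
    );
    return ret;   /* on failure the kernel returns a negative errno value */
}

/* Hypothetical helper: same effect as socket(AF_INET, SOCK_STREAM, 0),
 * but without going through the libc wrapper. */
static int raw_socket(void)
{
    return (int)raw_syscall3(SYS_socket, AF_INET, SOCK_STREAM, 0);
}
```

In pure assembly the same thing boils down to loading rax, rdi, rsi, and rdx and executing syscall. A call with more than three arguments would continue with r10, r8, r9.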
The definitions of the various constants are hidden in header files, which are nevertheless easily found via a file system search, assuming you have the necessary development packages installed. You should look in /usr/include/x86_64-linux-gnu/bits/socket.h and /usr/include/linux/in.h.
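If you only need the numeric values rather than the header that defines them, a quicker route is to let the compiler resolve the macros and print them. The sketch below assumes a Linux system with the usual libc headers; print_socket_constants is a made-up name.

```c
#include <stdio.h>
#include <sys/socket.h>   /* PF_INET, SOCK_STREAM */
#include <netinet/in.h>   /* IPPROTO_IP */

/* Hypothetical helper: print the values the preprocessor/compiler
 * substitutes for the three constants passed to socket(). */
static void print_socket_constants(void)
{
    printf("PF_INET     = %d\n", (int)PF_INET);
    printf("SOCK_STREAM = %d\n", (int)SOCK_STREAM);
    printf("IPPROTO_IP  = %d\n", (int)IPPROTO_IP);
}
```

On Linux these should come out as 2, 1, and 0, which matches the values loaded into edi, esi, and edx in your disassembly.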
As for a system call list, it's trivial to google one, such as this. You can also always look in the kernel source of course.
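Since you asked for the numbers as a C header: <sys/syscall.h> provides SYS_* macros for the system calls of the architecture you compile for. The following is a small sketch that prints a few of the socket-related ones, assuming Linux with glibc-style headers; print_socket_syscall_numbers is a made-up name.

```c
#include <stdio.h>
#include <sys/syscall.h>  /* SYS_socket, SYS_connect, SYS_accept, ... */

/* Hypothetical helper: print the numbers of a few socket-related
 * system calls for the architecture this file is compiled for. */
static void print_socket_syscall_numbers(void)
{
    printf("socket  = %ld\n", (long)SYS_socket);
    printf("connect = %ld\n", (long)SYS_connect);
    printf("accept  = %ld\n", (long)SYS_accept);
    printf("bind    = %ld\n", (long)SYS_bind);
    printf("listen  = %ld\n", (long)SYS_listen);
}
```

On x86-64 this should report 41, 42, 43, 49, and 50. The numbers differ on other architectures, which is why it is safer to use the macros than to hard-code the values.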