
Linux x86_64 assembly socket programming

Hello all.

So I'm learning assembly.
And as per my usual learning steps with any new language I pick up, I've arrived at networking with assembly.

Which, sadly, isn't going that well, as I've pretty much failed at step 0: getting a socket through which communication can begin.

The assembly code should be roughly equal to the following C code:

#include <stdio.h>
#include <sys/socket.h>

int main(){
        int sock;
        sock = socket(AF_INET, SOCK_STREAM, 0);
}

(Let's ignore the fact that it's not closing the socket for now.)

So here's what I did thus far:

  • Checked the manual, which would imply that I need to make a socketcall(). This is all well and good. The problem is that it needs an int describing what sort of socketcall it should make. The call's manpage isn't much help with this either, as it only says:

On a some architectures—for example, x86-64 and ARM—there is no socketcall() system call; instead socket(2), accept(2), bind(2), and so on really are implemented as separate system calls.

  • Yet there are no such calls in the original list of syscalls, and as far as I know socket(), accept(), bind(), listen(), etc. are calls from libnet and not from the kernel. This got me utterly confused, so I decided to compile the above C code and check it with strace. This yielded the following:

    socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3

  • While that didn't get me any closer to knowing what socket() is, it did explain its arguments, for which I can't seem to find the proper documentation (again). I thought that PF_INET, SOCK_STREAM, and IPPROTO_IP would be defined in <sys/socket.h>, but grepping for them didn't turn up anything of use. So I decided to just wing it by using gdb in tandem with disass main to find the values. This gave the following output:

    Dump of assembler code for function main:
    0x00000000004004fd <+0>:  push rbp
    0x00000000004004fe <+1>:  mov rbp,rsp
    0x0000000000400501 <+4>:  sub rsp,0x10
    0x0000000000400505 <+8>:  mov edx,0x0
    0x000000000040050a <+13>: mov esi,0x1
    0x000000000040050f <+18>: mov edi,0x2
    0x0000000000400514 <+23>: call 0x400400
    0x0000000000400519 <+28>: mov DWORD PTR [rbp-0x4],eax
    0x000000000040051c <+31>: leave
    0x000000000040051d <+32>: ret
    End of assembler dump.

  • In my experience this would imply that socket() gets its parameters from EDI (PF_INET, 0x2), ESI (SOCK_STREAM, 0x1), and EDX (IPPROTO_IP, 0x0). Which would be odd for a syscall (as the convention with Linux syscalls would be to use EAX/RAX for the call number and other registers for the parameters in increasing order, e.g. RBX, RCX, RDX ...). The fact that this is being CALL-ed and not INT 0x80'd would also imply that this is not in fact a system call but rather something that's being called from a shared object. Or something.

  • But then again, passing arguments in registers is very odd for something that's CALL-ed. As far as I know, arguments for called things should normally be PUSH-ed onto the stack, as the compiler can't know what registers they would try to use.

  • This behavior becomes even more curious when checking the produced binary with ldd:

    linux-vdso.so.1 (0x00007fff4a7fc000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f56b0c61000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f56b1037000)

  • There appear to be no networking libraries linked.

And that's the point where I've run out of ideas.

So I'm asking for the following:

  • Documentation that describes the x86-64 Linux kernel's actual syscalls and their associated numbers. (Preferably as a header file for C.)
  • The header files that define PF_INET, SOCK_STREAM, and IPPROTO_IP, as it really bugs me that I wasn't able to find them on my own system.
  • Maybe a tutorial for networking in assembly on x86-64 Linux. (For x86-32 it's easy to find material, but for some reason I came up empty with the 64-bit stuff.)

Thanks!

Wolfer asked Nov 12 '14 16:11




1 Answer

The 64-bit calling convention does use registers to pass arguments, both in user space and for system calls. As you have seen, the user-space order is rdi, rsi, rdx, rcx, r8, r9. For system calls, the call number goes in rax and r10 is used instead of rcx, because the syscall instruction clobbers rcx (and r11). See Wikipedia or the ABI documentation for more details.
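
To make the register convention concrete, here is a minimal sketch in C with GCC/Clang inline assembly. It assumes x86-64 Linux and takes the call numbers from <sys/syscall.h>. The names raw_syscall3 and raw_socket are hypothetical helpers made up for this illustration, not part of any library.

```c
#include <sys/socket.h>   /* AF_INET, SOCK_STREAM */
#include <sys/syscall.h>  /* SYS_socket, SYS_close, SYS_getpid */

/* x86-64 Linux only (assumption): issue a 3-argument system call directly.
 * rax holds the call number, rdi/rsi/rdx hold the arguments.
 * The syscall instruction itself overwrites rcx and r11. */
long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3)
        : "rcx", "r11", "memory"
    );
    return ret; /* >= 0 on success, -errno on failure */
}

/* Hypothetical helper: the raw counterpart of socket(2),
 * bypassing the libc wrapper that the disassembly CALLs. */
long raw_socket(int domain, int type, int protocol)
{
    return raw_syscall3(SYS_socket, domain, type, protocol);
}
```

Calling raw_socket(AF_INET, SOCK_STREAM, 0) should hand back a small non-negative descriptor, in line with the strace output in the question. In pure assembly the same thing amounts to four register loads followed by a syscall instruction.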

The definitions of the various constants are hidden in header files, which are nevertheless easily found via a file system search assuming you have the necessary development packages installed. You should look in /usr/include/x86_64-linux-gnu/bits/socket.h and /usr/include/linux/in.h.
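
If you would rather let the compiler confirm the values than grep for them, a small sketch like the following should work. It assumes Linux with glibc headers, and socket_args is a hypothetical helper name used only here. The checks encode the immediates seen in the disassembly (0x2, 0x1, 0x0).

```c
#include <netinet/in.h>   /* IPPROTO_IP */
#include <sys/socket.h>   /* AF_INET, PF_INET, SOCK_STREAM */

/* Compile-time checks (assumption: Linux with glibc): these are the
 * values loaded into edi, esi and edx before the call to socket(). */
_Static_assert(PF_INET == AF_INET, "PF_INET is an alias of AF_INET");
_Static_assert(AF_INET == 2, "edi = 0x2");
_Static_assert(SOCK_STREAM == 1, "esi = 0x1");
_Static_assert(IPPROTO_IP == 0, "edx = 0x0");

/* Hypothetical helper: fill out[] with the three arguments in the
 * order they are passed to socket(domain, type, protocol). */
void socket_args(int out[3])
{
    out[0] = PF_INET;
    out[1] = SOCK_STREAM;
    out[2] = IPPROTO_IP;
}
```

If one of the assumed values were wrong for your system, the file would simply fail to compile and name the offending constant.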

As for a system call list, it's trivial to find one with a web search. You can also always look in the kernel source, of course.
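
Since a C header was requested: <sys/syscall.h> exposes the numbers as SYS_* macros, so a lookup table can be built without hard-coding anything. The sketch below assumes x86-64 Linux with glibc. The table and the function socket_call_number are hypothetical names for illustration.

```c
#include <string.h>
#include <sys/syscall.h>  /* SYS_socket, SYS_connect, ... */

struct sys_entry {
    const char *name;
    long nr;
};

/* A few socket-related calls; the numbers come from the header,
 * so they follow whatever architecture this is compiled for. */
static const struct sys_entry socket_calls[] = {
    { "socket",  SYS_socket  },
    { "connect", SYS_connect },
    { "accept",  SYS_accept  },
    { "bind",    SYS_bind    },
    { "listen",  SYS_listen  },
};

/* Hypothetical helper: return the call number for name, or -1. */
long socket_call_number(const char *name)
{
    size_t n = sizeof socket_calls / sizeof socket_calls[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(socket_calls[i].name, name) == 0)
            return socket_calls[i].nr;
    }
    return -1;
}
```

On x86-64, socket_call_number("socket") should come out as 41, which is the value to load into rax before the syscall instruction.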

Jester answered Oct 11 '22 18:10
