Can ptrace tell if an x86 system call used the 64-bit or 32-bit ABI?

I'm trying to use ptrace to trace all syscalls made by a separate process, be it 32-bit (IA-32) or 64-bit (x86-64). My tracer would run on a 64-bit x86 installation with IA-32 emulation enabled, but ideally would be able to trace both 64-bit and 32-bit applications, including if a 64-bit application forks and execs a 32-bit process.

The issue is that, since 32-bit and 64-bit syscall numbers differ, the syscall number alone is not enough: I also need to know whether the call was made through the 32-bit or the 64-bit ABI to determine which syscall it was. There seem to be imperfect methods, like checking /proc/<pid>/exe or (as strace does) the size of the registers struct, but nothing reliable.
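To make the ambiguity concrete, here is a minimal sketch that looks up the same numbers in a tiny excerpt of each syscall table. The helper names (syscall_name, same_syscall) and the excerpt arrays are hypothetical, chosen only for illustration; a real tracer would need the full tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Which syscall table a number has to be looked up in. */
enum abi { ABI_I386, ABI_X86_64 };

struct sys_entry { long nr; const char *name; };

/* Small excerpts of the two tables, for illustration only. */
static const struct sys_entry i386_excerpt[] = {
    { 1, "exit" }, { 3, "read" }, { 4, "write" }, { 60, "umask" },
};
static const struct sys_entry x86_64_excerpt[] = {
    { 0, "read" }, { 1, "write" }, { 4, "stat" }, { 60, "exit" },
};

/* Hypothetical helper: name of syscall `nr` under `abi`, or "?" if it
   is not part of the excerpt above. */
const char *syscall_name(enum abi abi, long nr)
{
    assert(abi == ABI_I386 || abi == ABI_X86_64);
    const struct sys_entry *t = (abi == ABI_I386) ? i386_excerpt : x86_64_excerpt;
    size_t n = (abi == ABI_I386)
        ? sizeof i386_excerpt / sizeof i386_excerpt[0]
        : sizeof x86_64_excerpt / sizeof x86_64_excerpt[0];
    for (size_t i = 0; i < n; i++)
        if (t[i].nr == nr)
            return t[i].name;
    return "?";
}

/* Hypothetical helper: do two (ABI, number) pairs name the same call? */
int same_syscall(enum abi a, long nr_a, enum abi b, long nr_b)
{
    const char *x = syscall_name(a, nr_a);
    const char *y = syscall_name(b, nr_b);
    return strcmp(x, "?") != 0 && strcmp(x, y) == 0;
}
```

With these excerpts, number 4 decodes as write under the 32-bit ABI but as stat under the 64-bit ABI, and number 60 is umask or exit respectively, so decoding with the wrong table silently produces a different, plausible-looking call.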

Complicating this is the fact that 64-bit processes can switch out of long mode to execute 32-bit code directly. They can also make 32-bit int $0x80 syscalls, which, of course, use the 32-bit syscall numbers. I don't "trust" the processes I trace to not use these tricks, so I want to detect them correctly. And I've independently verified that in at least the latter case, ptrace sees the 32-bit syscall numbers and argument register assignments, not the 64-bit ones.

I poked around in the kernel source and came across the TS_COMPAT flag in arch/x86/include/asm/processor.h, which appears to be set whenever a 32-bit syscall is made by a 64-bit process. The only problem is that I have no idea how to access this flag from userland, or if it is even possible.

I also thought about reading %cs and comparing it to $0x23 or $0x33, inspired by this method for switching bitness in a running process. But this only detects 32-bit code, not necessarily 32-bit syscalls (those made with int $0x80) from 64-bit code. It's also fragile since it relies on undocumented kernel behavior.
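For reference, the %cs heuristic could be sketched as below. This assumes the selector values 0x23 and 0x33 mentioned above, which are an implementation detail of current Linux x86-64 kernels rather than a documented interface; the function and enum names are hypothetical. As noted, it cannot spot int $0x80 issued from 64-bit code, because %cs is still the 64-bit selector at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

enum code_seg { SEG_COMPAT32, SEG_LONG64, SEG_OTHER };

/* Assumption: 0x23 / 0x33 are the user code-segment selectors Linux
   x86-64 uses for 32-bit and 64-bit code. Not a stable ABI. */
enum code_seg classify_cs(unsigned long long cs)
{
    switch (cs & 0xffff) {
    case 0x23: return SEG_COMPAT32;
    case 0x33: return SEG_LONG64;
    default:   return SEG_OTHER;
    }
}

#if defined(__linux__) && defined(__x86_64__)
#include <sys/ptrace.h>
#include <sys/user.h>

/* Read %cs from a tracee that is currently in a ptrace stop.
   Returns SEG_OTHER if the registers cannot be read. */
enum code_seg tracee_code_seg(pid_t pid)
{
    assert(pid > 0);
    struct user_regs_struct regs;
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return SEG_OTHER;
    return classify_cs(regs.cs);
}
#endif
```

A tracer would call tracee_code_seg(pid) after waitpid() reports a stop. A result of SEG_LONG64 only says the code is running in 64-bit mode; it says nothing about which syscall entry instruction was used.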

Finally, I noticed that the x86 architecture has a bit for long mode in the Extended Feature Enable Register MSR. But ptrace has no way of reading the MSR from a tracee, and I feel like reading it from within my tracer will be inadequate because my tracer is always running in long mode.

I'm at a loss. Perhaps I could try to use one of those hacks (at this point I'm leaning towards %cs or the /proc/<pid>/exe method), but I want something durable that will actually distinguish between 32-bit and 64-bit syscalls. How can a process using ptrace under x86-64, which has detected that its tracee made a syscall, reliably determine whether that syscall was made with the 32-bit (int $0x80) or 64-bit (syscall) ABI? Is there some other way for a user process to gain this information about another process that it is authorized to ptrace?

asked Nov 24 '18 07:11 by ameed


1 Answer

Interesting, I hadn't realized that strace had no obviously smarter way to correctly decode int 0x80 from 64-bit processes. (This is being worked on; see this answer for links to a proposed kernel patch that adds PTRACE_GET_SYSCALL_INFO to the ptrace API. strace 4.26 already supports it on patched kernels.)

Update: strace now supports per-syscall detection. ptrace(2) documents PTRACE_GET_SYSCALL_INFO as available since Linux 5.3; I tested on Arch Linux with kernel version 5.5 and strace version 5.5.

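For example, this NASM source, assembled into a static executable:

mov eax, 4      ; 4 = write in the 32-bit ABI
int 0x80        ; 32-bit ABI entry, issued from 64-bit mode
mov eax, 60     ; 60 = exit in the 64-bit ABI
syscall         ; 64-bit ABI entry

gives this trace when built and run with nasm -felf64 foo.asm && ld -o foo foo.o && strace ./foo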

execve("./foo", ["./foo"], 0x7ffcdc233180 /* 51 vars */) = 0
strace: [ Process PID=1262249 runs in 32 bit mode. ]
write(0, NULL, 0)                       = 0
strace: [ Process PID=1262249 runs in 64 bit mode. ]
exit(0)                                 = ?
+++ exited with 0 +++

strace prints a message every time a system call uses a different ABI bitness than the previous one. Note that the "runs in 32 bit mode" wording is wrong: the process is still in 64-bit mode and is merely using the 32-bit ABI. "Mode" has a specific technical meaning for x86-64, and this is not it.
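A tracer can ask the kernel the same question directly. The following is a minimal sketch, assuming kernel headers and a kernel new enough to provide PTRACE_GET_SYSCALL_INFO as described in ptrace(2); the function names abi_of_arch and syscall_entry_info are hypothetical. The arch field of the returned structure holds an AUDIT_ARCH_* value that identifies the ABI of the current syscall.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ptrace.h>
#include <linux/ptrace.h>   /* struct ptrace_syscall_info, PTRACE_GET_SYSCALL_INFO */
#include <linux/audit.h>    /* AUDIT_ARCH_I386, AUDIT_ARCH_X86_64 */

/* Name the ABI reported in ptrace_syscall_info.arch. */
const char *abi_of_arch(unsigned int arch)
{
    switch (arch) {
    case AUDIT_ARCH_X86_64: return "x86-64";
    case AUDIT_ARCH_I386:   return "i386";
    default:                return "other";
    }
}

/* For a tracee stopped at a syscall-entry stop, fetch the syscall
   number and the ABI it was issued through.
   Returns 0 on success, -1 if the request failed or the tracee is
   not at a syscall-entry stop. */
int syscall_entry_info(pid_t pid, unsigned long long *nr, const char **abi)
{
    assert(nr != NULL && abi != NULL);

    struct ptrace_syscall_info info;
    memset(&info, 0, sizeof info);

    /* The addr argument carries the buffer size for this request. */
    long n = ptrace(PTRACE_GET_SYSCALL_INFO, pid, (void *)sizeof(info), &info);
    if (n <= 0 || info.op != PTRACE_SYSCALL_INFO_ENTRY)
        return -1;

    *nr = info.entry.nr;
    *abi = abi_of_arch(info.arch);
    return 0;
}
```

In a PTRACE_SYSCALL loop, calling syscall_entry_info(pid, &nr, &abi) at each entry stop should give what strace needs to pick the right syscall table, without guessing from register layout or %cs.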


With older kernels

As a workaround, I think you could look at the machine code just before RIP and check whether it was the syscall instruction (0F 05) or int 0x80 (CD 80). At a syscall stop the saved RIP points just past the two-byte instruction, and ptrace does let you read the target process's memory.

But for a security use-case like disallowing some system calls, this would be vulnerable to a race condition: another thread in the traced process could rewrite the syscall bytes to int 0x80 after they execute, but before you can peek at them with ptrace.


You only need that check when the process is running in 64-bit mode; in 32-bit mode only the 32-bit ABI is available, so there is nothing to disambiguate. (The vDSO page can potentially use the 32-bit-mode syscall instruction on AMD CPUs that support it but not sysenter. Not checking in the first place for 32-bit processes avoids this corner case.) I think you're saying you at least have a reliable way to detect the mode.
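The instruction-byte check described above could be sketched like this. It is only an illustration under the stated assumptions (tracee stopped at a syscall stop, running 64-bit code), the function names are hypothetical, and it inherits the race condition already mentioned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

enum entry_insn { INSN_SYSCALL, INSN_INT80, INSN_UNKNOWN };

/* Classify the two opcode bytes that precede the saved RIP. */
enum entry_insn classify_entry_insn(const unsigned char b[2])
{
    if (b[0] == 0x0f && b[1] == 0x05) return INSN_SYSCALL;  /* syscall  */
    if (b[0] == 0xcd && b[1] == 0x80) return INSN_INT80;    /* int 0x80 */
    return INSN_UNKNOWN;
}

#if defined(__linux__) && defined(__x86_64__)
#include <sys/ptrace.h>
#include <sys/user.h>

/* For a tracee stopped at a syscall stop: read the two bytes just
   before RIP and classify them. Returns INSN_UNKNOWN on any error. */
enum entry_insn tracee_entry_insn(pid_t pid)
{
    assert(pid > 0);
    struct user_regs_struct regs;
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return INSN_UNKNOWN;

    /* PEEKTEXT returns the data in-band, so errno is the error signal. */
    errno = 0;
    long word = ptrace(PTRACE_PEEKTEXT, pid,
                       (void *)(uintptr_t)(regs.rip - 2), NULL);
    if (word == -1 && errno != 0)
        return INSN_UNKNOWN;

    /* x86 is little-endian: the low byte of the word is the first byte. */
    unsigned char bytes[2] = {
        (unsigned char)(word & 0xff),
        (unsigned char)((word >> 8) & 0xff),
    };
    return classify_entry_insn(bytes);
}
#endif
```

An INSN_INT80 result would mean the number in the registers should be decoded with the 32-bit table, and INSN_SYSCALL with the 64-bit one. INSN_UNKNOWN is best treated as "cannot tell" rather than mapped to either ABI.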

(I haven't used the ptrace API directly, just the tools like strace that use it. So I hope this answer makes sense.)

answered Sep 25 '22 13:09 by Peter Cordes