I am trying to write a program with ptrace that tracks all system calls made by a child.
Now I have a list of system calls which are forbidden for the child. I am able to track all system calls using ptrace but I just don't know how to skip a particular system call.
Currently my tracking (parent) process gets a signal every time the child enters or exits a system call (PTRACE_SYSCALL). But if the child is trying to enter a prohibited system call, I want to make it skip that call and move on to the next step. When I do this, I also want the child to know that there was a permission denied error, so I will be setting errno = 13. Will that be enough?
Update: gdb provides a feature for skipping one line. What mechanism does gdb use?
How to achieve that?
UPDATE: The best way to achieve this with ptrace is to redirect the original system call to some other system call, for example nanosleep(). That call will fail because it receives illegal arguments. Then you just have to change the return value in EAX to -EACCES to pretend that the call failed with a permission denied error.
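A minimal sketch of that redirect-and-patch idea, assuming x86-64 Linux (syscall number in orig_rax, return value in rax) and a single-threaded child. deny_syscall is a hypothetical helper name, not something from a library. It swaps the forbidden call for getpid(), which takes no arguments and has no side effects, rather than nanosleep(), because the original call's arguments are not guaranteed to be invalid for nanosleep. It expects the child to have called ptrace(PTRACE_TRACEME, ...) and raise(SIGSTOP) already.

```c
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: trace `child` until it terminates and make every call
 * to syscall number `forbidden_nr` fail with EACCES.
 * Returns the child's final wait status, or -1 if ptrace/waitpid fails. */
int deny_syscall(pid_t child, long forbidden_nr)
{
    int status;
    int sig = 0;       /* signal to pass on when resuming the child */
    int entering = 1;  /* syscall stops alternate: entry, exit, entry, ... */
    int denied = 0;    /* set while a redirected call is in flight */

    /* Wait for the initial SIGSTOP, then ask for syscall stops to be
     * reported as SIGTRAP|0x80 so they can be told apart from signals. */
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) < 0)
        return -1;

    for (;;) {
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)(long)sig) < 0)
            return -1;
        if (waitpid(child, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return status;

        if (WSTOPSIG(status) != (SIGTRAP | 0x80)) {
            /* Ordinary signal: forward it, except the SIGTRAP sent on exec. */
            sig = (WSTOPSIG(status) == SIGTRAP) ? 0 : WSTOPSIG(status);
            continue;
        }
        sig = 0;

        struct user_regs_struct regs;
        if (ptrace(PTRACE_GETREGS, child, NULL, &regs) < 0)
            return -1;

        if (entering && (long)regs.orig_rax == forbidden_nr) {
            /* Entry stop: run a harmless call instead of the forbidden one. */
            regs.orig_rax = SYS_getpid;
            if (ptrace(PTRACE_SETREGS, child, NULL, &regs) < 0)
                return -1;
            denied = 1;
        } else if (!entering && denied) {
            /* Exit stop: overwrite the result so the child sees -EACCES. */
            regs.rax = (unsigned long long)-(long)EACCES;
            if (ptrace(PTRACE_SETREGS, child, NULL, &regs) < 0)
                return -1;
            denied = 0;
        }
        entering = !entering;
    }
}
```

The parent would fork, have the child do PTRACE_TRACEME and raise(SIGSTOP) before exec'ing the target, and then call something like deny_syscall(pid, SYS_openat). Because only the return register is patched, the libc wrapper in the child is what turns -EACCES into errno = EACCES.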
I found two college lectures that mention the inability to abort an initiated system call as a disadvantage of ptrace (the manpage mentions a PTRACE_SYSEMU macro that looks like it could do it, but the newer headers don't have it). Theoretically, you could use the ptrace entry and exit stops to counteract the calls you don't want, either by swapping in bogus arguments that will cause the system call to fail or do nothing, or by injecting code that undoes a previous system call, but that seems extremely hacky.
On Linux, you should be able to achieve your goal with seccomp:
#include <fcntl.h>
#include <seccomp.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
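static int set_security(void)
{
    int rc = -1;
    scmp_filter_ctx ctx;
    /* Argument filter: first argument (the fd) must equal 2, i.e. stderr. */
    struct scmp_arg_cmp arg_cmp[] = { SCMP_A0(SCMP_CMP_EQ, 2) };

    /* Default action: any system call not allowed below fails with ENOSYS. */
    ctx = seccomp_init(SCMP_ACT_ERRNO(ENOSYS));
    /*ctx = seccomp_init(SCMP_ACT_ALLOW);*/
    if (ctx == NULL)
        goto out;

    rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit), 0);
    if (rc < 0)
        goto out;
    /* glibc's exit() normally ends the process with exit_group. */
    rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit_group), 0);
    if (rc < 0)
        goto out;
    rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(close), 0);
    if (rc < 0)
        goto out;
    /* Allow write() to stdout (fd 1)... */
    rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 1,
                          SCMP_CMP(0, SCMP_CMP_EQ, 1));
    if (rc < 0)
        goto out;
    /* ...and to stderr (fd 2), using the array form of the same API. */
    rc = seccomp_rule_add_array(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 1,
                                arg_cmp);
    if (rc < 0)
        goto out;

    rc = seccomp_load(ctx);
    if (rc < 0)
        goto out;
    /* ... */
out:
    seccomp_release(ctx);
    return rc; /* 0 on success, negative on failure (main checks for < 0) */
}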
int main(int argc, char *argv[])
{
    int fd;
    const char out_msg[] = "stdout test\n";
    const char err_msg[] = "stderr test\n";

    if (0 > set_security())
        return 1;
    /* sizeof - 1 so the terminating NUL byte is not written */
    if (0 > write(1, out_msg, sizeof(out_msg) - 1))
        perror("Write stdout");
    if (0 > write(2, err_msg, sizeof(err_msg) - 1))
        perror("Write stderr");
    /* This should fail with ENOSYS */
    if (0 > (fd = open("/dev/zero", O_RDONLY)))
        perror("open");
    exit(0);
}
If you want to disable a system call, it's probably easiest to use symbol interposition instead of ptrace. (This assumes you're not aiming for security against malicious binaries. If this is for security reasons, PSKocik's answer shows how to use seccomp.)
Make a shared library that provides a gettimeofday function which just sets errno and returns without making a system call. Use LD_PRELOAD=./my_library.so ./a.out to get it loaded before libc.
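As a rough sketch, the source for such a library could be as small as this. The choice of EACCES is an assumption taken from the question's "permission denied" requirement, and the forward declaration of struct timeval is there so the file does not depend on how <sys/time.h> spells the second parameter, which may differ between libc versions.

```c
#include <errno.h>

struct timeval; /* only used through a pointer, so no full definition needed */

/* Replacement for libc's gettimeofday: always fails with EACCES and never
 * enters the kernel. */
int gettimeofday(struct timeval *tv, void *tz)
{
    (void)tv;
    (void)tz;
    errno = EACCES;
    return -1;
}
```

Something like gcc -shared -fPIC -o my_library.so my_library.c should build it with gcc, after which the LD_PRELOAD line above applies.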
This won't work on binaries that statically link libc, or that use inline system calls instead of the libc wrappers (e.g. mov eax, SYS_gettimeofday / syscall). You can disassemble a binary and look for syscall (x86-64) or int 0x80 (i386 ABI) to check for that.
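If you only want a quick first pass before reaching for a disassembler, a crude byte scan can flag candidates. This is a heuristic sketch, not real disassembly: the byte pairs 0F 05 (syscall) and CD 80 (int 0x80) can also appear inside immediates or data, so hits still need checking by hand. count_syscall_opcodes is a hypothetical name.

```c
#include <stddef.h>

/* Count byte sequences in buf[0..len) that encode `syscall` (0F 05) or
 * `int 0x80` (CD 80). May report false positives from data or operands. */
size_t count_syscall_opcodes(const unsigned char *buf, size_t len)
{
    size_t hits = 0;

    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == 0x0f && buf[i + 1] == 0x05)
            hits++; /* x86-64 syscall instruction */
        else if (buf[i] == 0xcd && buf[i + 1] == 0x80)
            hits++; /* i386 int 0x80 */
    }
    return hits;
}
```

You would feed it the bytes of the executable's code section, for instance after reading or mmap'ing the file.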
Note that glibc's gettimeofday and clock_gettime implementations normally don't make a real system call at all; instead they use RDTSC and the VDSO page exported by the kernel to find out how to scale the timestamp counter into a real time. They only fall back to a real system call when the kernel's clock source can't be read from user space. (So intercepting the library function is your only hope; a strace-style method wouldn't catch them anyway.)
BTW, failed system calls return negative error values, e.g. on x86-64, rax = -EPERM. The glibc syscall wrappers take care of detecting negative values and setting the errno global variable. So if you are intercepting syscall instructions with ptrace, that's what you need to do: put the negative error code in the return-value register rather than trying to write to errno directly.
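To illustrate the convention, here is a small sketch of the translation a libc wrapper performs on the raw value coming back from the kernel. The function name is made up for this example; the -4095..-1 range is the Linux kernel's reserved range for error returns.

```c
#include <errno.h>

/* Hypothetical helper mirroring what a libc syscall wrapper does:
 * raw values in [-4095, -1] are error codes; anything else is a result. */
long syscall_result_to_errno(long raw)
{
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw; /* e.g. -EACCES becomes errno = EACCES */
        return -1;
    }
    return raw;
}
```

So a tracer that writes -EACCES into rax at the syscall-exit stop ends up with the child seeing a return value of -1 and errno == EACCES (13 on Linux), with no need to touch the child's errno.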
gdb can skip a line by using ptrace to resume execution in a different place. That only works if you're already stopped there, though. So to use this to "skip" system calls, you'd have to set breakpoints at every system call site you want to block in the whole process.
That doesn't sound like a useful approach. If someone's actively trying to defeat it, they can just JIT-compile some code that makes a system call directly. You could prevent processes from mapping memory that's both writable and executable, and scan a page for system calls every time you detect a fault from the process jumping into memory that it requested as executable but that your mechanism left writable only. (So behind the scenes you catch the hardware-generated exception, then either flip the page from writable to executable and scan it, or flip it back to writable but not executable.)
This sounds like a lot of kernel hacking to implement correctly, when you could just use seccomp (see the other answer) if you need something that's resistant to workarounds and static binaries.