I am trying to modify the default behavior of Linux system calls. At the moment I am trying to hook them and add a simple print statement before they are actually invoked. I know about the standard '--wrap' option of the GNU linker (see the GCC linker options) and how it can be used to hook wrappers. This works perfectly for open(), fstat(), fwrite() etc. (where I am actually hooking the libc wrappers).
UPDATE:
The limitation is that NOT all system calls get hooked with this approach. To illustrate that, let us take a simple statically compiled binary. When we add wrappers, they only take effect for the calls that we make ourselves once main() is running (please see the strace output shown below).
> strace ./sample
execve("./sample", ["./sample"], [/* 72 vars */]) = 0
uname({sys="Linux", node="kumar", ...}) = 0
brk(0) = 0x71f000
brk(0x7201c0) = 0x7201c0
arch_prctl(ARCH_SET_FS, 0x71f880) = 0
readlink("/proc/self/exe", "/home/admin/sample"..., 4096) = 41
brk(0x7411c0) = 0x7411c0
brk(0x742000) = 0x742000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbcc54d1000
write(1, "Hello from the wrapped readlink "..., 36Hello from the wrapped readlink :з
) = 36
readlink("/usr/bin/gnome-www-browser", "/etc/alternatives/gnome-www-brow"..., 255) = 35
write(1, "/etc/alternatives/gnome-www-brow"..., 36/etc/alternatives/gnome-www-browser
) = 36
exit_group(36) = ?
+++ exited with 36 +++
If we look at the binary carefully, the first "un-intercepted" readlink() call (system call 89, i.e. 0x59) comes from the lines below: some linker-related code (i.e. _dl_get_origin) does a readlink() as part of its work. These implicit syscalls (though present in the binary code) are never hooked by our "wrap" approach.
000000000051875c <_dl_get_origin>:
51875c: b8 59 00 00 00 mov $0x59,%eax
518761: 55 push %rbp
518762: 53 push %rbx
518763: 48 81 ec 00 10 00 00 sub $0x1000,%rsp
51876a: 48 89 e6 mov %rsp,%rsi
51876d: 0f 05 syscall
How can the wrapping idea be extended to system calls like readlink(), including all the implicit ones being invoked?
ld has an option for wrapping; to quote the manual:
--wrap symbol
Use a wrapper function for symbol. Any undefined reference to symbol will be resolved to __wrap_symbol. Any undefined reference to __real_symbol will be resolved to symbol. This can be used to provide a wrapper for a system function. The wrapper function should be called __wrap_symbol. If it wishes to call the system function, it should call __real_symbol.
It works fine with the libc wrappers of system calls too. Here's an example with readlink:
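#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Resolved by the linker to the real readlink when --wrap=readlink is given. */
ssize_t __real_readlink(const char *path, char *buf, size_t bufsiz);

/* Every undefined reference to readlink ends up here. */
ssize_t __wrap_readlink(const char *path, char *buf, size_t bufsiz) {
    puts("Hello from the wrapped readlink :з");
    /* Forward to the real function and propagate its result. */
    return __real_readlink(path, buf, bufsiz);
}

int main(void) {
    const char testLink[] = "/usr/bin/gnome-www-browser";
    char buf[256];
    /* readlink does not add a terminating NUL, so zero the buffer first. */
    memset(buf, 0, sizeof(buf));
    readlink(testLink, buf, sizeof(buf) - 1);
    puts(buf);
    return 0;
}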
To pass the option to the linker through the compiler, use the -Wl option:
$ gcc test.c -o a -Wl,--wrap=readlink
$ ./a
Hello from the wrapped readlink :з
/etc/alternatives/gnome-www-browser
The idea is that __wrap_func is your wrapper function. The linker resolves __real_func to the real function func, and every undefined reference to func in the code is resolved to __wrap_func.
UPD: One may notice that a statically compiled binary makes another readlink call, which isn't intercepted. To understand the reason, do a little experiment: compile the code to an object file and list its symbols, like:
$ gcc test.c -c -o a.o -Wl,--wrap=readlink
$ nm a.o
0000000000000037 T main
U memset
U puts
U readlink
U __real_readlink
U __stack_chk_fail
0000000000000000 T __wrap_readlink
The interesting thing here is that you won't see references to the bunch of functions that strace shows before the main function is entered, e.g. uname(), brk(), access(), etc. That is because the main function isn't the first code that is called in your binary. A bit of research with objdump will show you that the first function called is _start.
Now, let's do another example and override the _start function:
$ cat test2.c
#include <stdio.h>
#include <unistd.h>
void _start() {
puts("Hello");
_exit(0);
}
$ gcc test2.c -o a -nostartfiles
$ strace ./a
execve("./a", ["./a"], [/* 69 vars */]) = 0
brk(0) = 0x150c000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3ece55d000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=177964, ...}) = 0
mmap(NULL, 177964, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3ece531000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\37\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1840928, ...}) = 0
mmap(NULL, 3949248, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3ecdf78000
mprotect(0x7f3ece133000, 2093056, PROT_NONE) = 0
mmap(0x7f3ece332000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ba000) = 0x7f3ece332000
mmap(0x7f3ece338000, 17088, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3ece338000
close(3) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3ece530000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3ece52e000
arch_prctl(ARCH_SET_FS, 0x7f3ece52e740) = 0
mprotect(0x7f3ece332000, 16384, PROT_READ) = 0
mprotect(0x600000, 4096, PROT_READ) = 0
mprotect(0x7f3ece55f000, 4096, PROT_READ) = 0
munmap(0x7f3ece531000, 177964) = 0
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 10), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3ece55c000
write(1, "Hello\n", 6Hello
) = 6
exit_group(0) = ?
+++ exited with 0 +++
$
What was that?! We just overrode the first function in the binary, and we still see the system calls. Why?
Actually it is because those calls are not made by your code. strace only records system calls that the traced process makes from user space, so the kernel is not the caller either. This binary is dynamically linked, so the kernel maps the dynamic linker (ld.so) into the process and runs it first. It is the one that opens /etc/ld.so.cache, maps libc.so.6 and sets up TLS with arch_prctl, and only then jumps to your _start.
UPD: for static binaries there is no dynamic linker, but the libc startup code that gets linked into the binary (it runs between _start and main) does similar work: brk for the heap, arch_prctl for TLS, and the readlink from _dl_get_origin shown in the question. That code executes the syscall instruction inline instead of calling the readlink symbol, so there is no undefined reference for --wrap to redirect.
Anyway, you surely cannot modify this behavior that easily; you will need some hacking. If you could easily override a function for code that is executed before your binary runs, i.e. code from another binary, it would be a big security hole. Just imagine: when you need root rights, you override a function with one that executes your code, and wait a bit until some daemon with root rights happens to execute a script and thus triggers your code.