For example, if I want to override malloc(), what's the best way to do it?
Currently the simplest way I know of is:
malloc.h
#include <stdlib.h>
#define malloc my_malloc
void* my_malloc (size_t size);
foobar.c
#include "malloc.h"
void foobar(void)
{
void* leak = malloc(1024);
}
The problem with this approach is that we now have to use "malloc.h" and can never use "stdlib.h". Is there a way around this? I'm particularly interested in importing 3rd party libraries without modifying them at all, but forcing them into calling my custom libc functions (like malloc).
Compile and link the file with your reimplementation ( override. c ). This allows you to override a single function from any source file, without having to modify the code. The downside is that you must use a separate header file for each file you want to override.
The term "libc" is commonly used as a shorthand for the "standard C library", a library of standard functions that can be used by all C programs (and sometimes by programs in other languages). Because of some history (see below), use of the term "libc" to refer to the standard C library is somewhat ambiguous on Linux.
The short answer is you probably want to use the LD_PRELOAD trick: What is the LD_PRELOAD trick?
That approach basically inserts your own custom shared library on runtime before any other shared library is loaded, exporting the functions you want to override, such as malloc(). By the time the other shared libraries are loaded your symbol is already there and gets preference when resolving calls to that symbol name from other libraries. From within your malloc() wrapper/replacement you can even chose to call the next malloc symbol, which typically would be the actual libc symbol.
This blog post has a lot of comprehensive information about this method:
http://samanbarghi.com/blog/2014/09/05/how-to-wrap-a-system-call-libc-function-in-linux/
Note that example is overriding libc's write() and puts() functions, but the same logic applies for malloc():
LD_PRELOAD allows a shared library to be loaded before any other libraries. So all I need to do is to write a shared library that overrides write and puts functions. If we wrap these functions, we need a way to call the real functions to perform the system call. dlsym just do that for us [man 3 dlsym]: > The function dlsym() takes a “handle” of a dynamic library returned by dlopen() and the null-terminated symbol name, returning the address where that symbol is loaded into memory. If the symbol is not found, in the specified library or any of the libraries that were automatically loaded by dlopen() when that library was loaded, dlsym() returns NULL…
So inside the wrapper function we can use dlsym to get the address of the related symbol in memory and call the glibc function. Another approach can be calling the syscall directly, both approaches will work.
That blog post also describes a compile-time method I did not know about that involves passing a linker flag to ld, "--wrap":
Another way of wrapping functions is by using linker at the link time. GNU linker provides an option to wrap a function for a symbol [man 1 ld]: > Use a wrapper function for symbol. Any undefined reference to symbol will be resolved to “__wrap_symbol”. Any undefined reference to “__real_symbol” will be resolved to symbol.
The handy thing about LD_PRELOAD is that might allow you to change the malloc() implementation on production applications for quick testing, or even allow the user to select (I do this in some server applications) which implementation to use. The 'tcmalloc' library for example can be easily inserted into an application to evaluate performance gains in heavily threaded applications (where tcmalloc tends to perform a lot better than libc's malloc implementation).
Finally if you're on Windows, perhaps try this: LD_PRELOAD equivalent for Windows to preload shared libraries
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With