How to intercept file system access inside dlopen()?

Tags: glibc, dlopen

I want to intercept all file system access that occurs inside of dlopen(). At first, it would seem like LD_PRELOAD or -Wl,-wrap, would be viable solutions, but I have had trouble making them work for a few technical reasons:

- ld.so has already mapped its own symbols by the time LD_PRELOAD is processed. It's not critical for me to intercept the initial loading, but the _dl_* worker functions are resolved at this time, so future calls go through them. I think LD_PRELOAD is too late.
- Somehow malloc circumvents the issue above because the malloc() inside of ld.so does not have a functional free(); it just calls memset().
- The file system worker functions contained in ld.so, e.g. __libc_read(), are static, so I can't intercept them with -Wl,-wrap,__libc_read.

This might all mean that I need to build my own ld.so directly from source instead of linking it into a wrapper. The challenge there is that both libc and rtld-libc are built from the same source. I know that the macro IS_IN_rtld is defined when building rtld-libc, but how can I guarantee that there is only one copy of static data structures while still exporting a public interface function? (This is a glibc build system question, but I haven't found documentation of these details.)

Are there any better ways to get inside dlopen()?

Note: I can't use a Linux-specific solution like FUSE because this is for minimal "compute-node" kernels that do not support such things.

asked Oct 08 '11 20:10

Jed

1 Answer

> it would seem like LD_PRELOAD or -Wl,-wrap, would be viable solutions

The --wrap solution cannot work: it takes effect only at (static) link time, and your ld.so, libc.so.6 and libdl.so.2 have all already been linked, so it is too late to apply --wrap to them.

LD_PRELOAD could have worked, except that ld.so treats the fact that dlopen() calls open() as an internal implementation detail. It therefore calls the internal __open function directly, bypassing the PLT and, with it, your ability to interpose open.
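
To make the interposition half of this concrete, here is a minimal sketch of an open() wrapper of the kind one would build into a preloaded shared object. The names open_calls_seen and needs_mode are hypothetical, and forwarding through a raw openat system call (instead of looking up the next open with dlsym) is an assumption made only to keep the sketch free of extra link dependencies; it presumes a Linux-style syscall interface. The wrapper only sees callers that reach open through the PLT, which ld.so's internal __open calls do not.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdarg.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical counter: number of open() calls that reached this wrapper.
   Not thread-safe; good enough for a demonstration. */
int open_calls_seen = 0;

/* The optional mode argument is only passed when a file may be created. */
static int needs_mode(int flags)
{
#ifdef O_TMPFILE
    if ((flags & O_TMPFILE) == O_TMPFILE)
        return 1;
#endif
    return (flags & O_CREAT) != 0;
}

/* Same signature as libc's open(). When this definition comes earlier in
   symbol search order than libc (preloaded, or linked into the executable),
   PLT-routed calls to open() land here first. */
int open(const char *path, int flags, ...)
{
    mode_t mode = 0;

    if (needs_mode(flags)) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    open_calls_seen++;

    /* Assumption: forward with a raw system call instead of
       dlsym(RTLD_NEXT, "open"), so nothing here calls back into open(). */
    return (int)syscall(SYS_openat, AT_FDCWD, path, flags, mode);
}
```

Compiled with gcc -shared -fPIC into a shared object and listed in LD_PRELOAD, this would count the application's own open() calls, while the count would stay unchanged for the files that dlopen() itself opens.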

> Somehow malloc circumvents the issue

That's because libc supports users who implement their own malloc (e.g. for debugging purposes). So a call to e.g. calloc from dlopen does go through the PLT, and is interposable via LD_PRELOAD.
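
As a rough illustration of that difference, the sketch below interposes calloc and forwards to glibc's allocator. It assumes glibc, which exports its allocator entry points under __libc_* aliases; no public header declares __libc_calloc, so it is declared by hand here. The counter name calloc_calls_seen is hypothetical.

```c
#include <stddef.h>

/* Assumption: glibc. It exports __libc_calloc as an alias for its own
   calloc, but no public header declares it. */
extern void *__libc_calloc(size_t nmemb, size_t size);

/* Hypothetical counter: number of calloc() calls that reached this wrapper.
   Not thread-safe; good enough for a demonstration. */
unsigned long calloc_calls_seen = 0;

/* Interposed calloc(): callers that resolve calloc through the PLT,
   including the dlopen() path described above, end up here. */
void *calloc(size_t nmemb, size_t size)
{
    calloc_calls_seen++;
    return __libc_calloc(nmemb, size);
}
```

Memory returned this way still comes from glibc's heap, so the regular free() can release it. A complete allocator replacement would normally interpose malloc, realloc and free as well.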

> This might all mean that I need to build my own ld.so directly from source instead of linking it into a wrapper.

What will the rebuilt ld.so do? I think you want it to call __libc_open (in libc.so.6), but that can't possibly work, for an obvious reason: it is ld.so that opens libc.so.6 in the first place (at process startup).

You could rebuild ld.so with the call to __open replaced with a call to open. That will cause ld.so to go through the PLT, and expose the call to LD_PRELOAD interposition.

If you go that route, I suggest that you don't overwrite the system ld.so with your new copy (the chance of making a mistake and rendering the system unbootable is just too great). Instead, install it to e.g. /usr/local/my-ld.so, and then link your binaries with -Wl,--dynamic-linker=/usr/local/my-ld.so.
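
One way to check that a binary really requests the alternate loader is to read its own PT_INTERP entry at run time. The sketch below is only that, a sketch: current_interp is a hypothetical helper name, and it assumes glibc's getauxval() and an ELF executable whose program headers are mapped, which is the normal case.

```c
#include <elf.h>
#include <link.h>      /* ElfW() */
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>  /* getauxval() */

/* Hypothetical helper: returns the PT_INTERP path recorded in the running
   executable, or NULL if there is none (e.g. a statically linked binary). */
const char *current_interp(void)
{
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)getauxval(AT_PHDR);
    size_t phnum = (size_t)getauxval(AT_PHNUM);
    uintptr_t bias = 0;
    size_t i;

    if (phdr == NULL)
        return NULL;

    /* PT_PHDR holds the link-time address of the header table itself, so
       its distance from the run-time address is the load bias (non-zero
       for position-independent executables). */
    for (i = 0; i < phnum; i++)
        if (phdr[i].p_type == PT_PHDR)
            bias = (uintptr_t)phdr - (uintptr_t)phdr[i].p_vaddr;

    for (i = 0; i < phnum; i++)
        if (phdr[i].p_type == PT_INTERP)
            return (const char *)(bias + (uintptr_t)phdr[i].p_vaddr);

    return NULL;
}
```

For a binary linked with -Wl,--dynamic-linker=/usr/local/my-ld.so, this should return that path; for a default build it returns the system loader's path.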

Another alternative: runtime patching. This is a bit of a hack, but you can (once you gain control in main) simply scan the .text of ld.so, and look for CALL __open instructions. If ld.so is not stripped, then you can find both the internal __open, and the functions you want to patch (e.g. open_verify in dl-load.c). Once you find the interesting CALL, mprotect the page that contains it to be writable, and patch in the address of your own interposer (which can in turn call __libc_open if it needs to), then mprotect it back. Any future dlopen() will now go through your interposer.
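
The patch step itself can be sketched as follows, under loud assumptions: x86-64, where a direct call is encoded as opcode 0xE8 followed by a 32-bit displacement relative to the next instruction, and an interposer that lies within ±2 GiB of the call site. The function names retarget_call_rel32 and patch_text_call are hypothetical, and locating the call site (by scanning ld.so's .text with the help of its symbol table) is not shown.

```c
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Rewrites the rel32 operand of an x86-64 `call rel32` at `site` so that
   it targets `target`. Returns 0 on success, or -1 if `site` is not an
   0xE8 call or the target is out of rel32 range. */
int retarget_call_rel32(uint8_t *site, const void *target)
{
    int64_t disp;
    int32_t rel;

    if (site[0] != 0xE8)
        return -1;

    /* The displacement is relative to the instruction after the call. */
    disp = (int64_t)(intptr_t)target - ((int64_t)(intptr_t)site + 5);
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    rel = (int32_t)disp;
    memcpy(site + 1, &rel, sizeof rel);
    return 0;
}

/* Same operation for a call site inside read-only, executable text:
   make the page(s) writable, patch, then restore read+execute.
   Assumption: no other thread is executing in those pages meanwhile. */
int patch_text_call(uint8_t *site, const void *target)
{
    uintptr_t pagesz = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)site & ~(pagesz - 1);
    size_t len = (size_t)((uintptr_t)site + 5 - start); /* may span 2 pages */
    int rc;

    if (mprotect((void *)start, len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    rc = retarget_call_rel32(site, target);
    if (mprotect((void *)start, len, PROT_READ | PROT_EXEC) != 0)
        return -1;
    return rc;
}
```

Checking the opcode before writing guards against patching the wrong bytes if the scan misidentified the call site. If the interposer is farther away than rel32 allows, a nearby trampoline would be needed instead.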

answered Sep 21 '22 18:09

Employed Russian
