
When / How does Linux load shared libraries into address space?

My question is the following:

When is the address of shared objects specified in programs? During linking? Loading? If I wanted to find the memory address of the system function inside libc as mapped into my program, I could find it easily in gdb, but what if I don't want to bring the program into a debugger?

Could this address change from run to run? Are there any other static analysis tools that will allow me to view where libraries or functions will be loaded into this program's memory space when run?

EDIT: I want this information outside of the program (i.e. using utilities like objdump to gather information)

Ryan asked Feb 27 '11 00:02





1 Answer

Libraries are loaded by ld.so, the dynamic linker or run-time linker (also called rtld; on Linux it is ld-linux.so.2 or ld-linux.so.*, and it is part of glibc). It is declared as the "interpreter" (INTERP; the .interp section) of every dynamically linked ELF binary. So when you start a program, the kernel maps the program and ld.so into memory and jumps to ld.so's entry point; ld.so then loads the needed libraries, prepares the program, and runs it. You can also start a dynamically linked program explicitly through ld.so with

 /lib/ld-linux.so.2 ./your_program your_prog_params 
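The interpreter recorded in a binary can be read without running it, which may help with the "outside of the program" part of the question. A minimal sketch, assuming readelf from binutils is installed and that /bin/echo is dynamically linked on your system; the helper name interp_path is made up for this example:

```shell
# Pull the interpreter path out of `readelf -l` output read from stdin.
# (interp_path is a hypothetical helper name, not a standard tool.)
interp_path() {
    sed -n 's/.*Requesting program interpreter: \(.*\)\]$/\1/p'
}

# LC_ALL=C keeps readelf's messages in English so the pattern matches.
# /bin/echo is only a convenient example binary; any dynamic ELF will do.
LC_ALL=C readelf -l /bin/echo | interp_path
```

The path it prints is the ld.so that the kernel will start for that binary, so it is also the right file to use in the explicit invocation above.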

ld.so does the actual open and mmap of the needed ELF files (the libraries, and the program itself when ld.so is invoked directly). It also fills the GOT and PLT tables and resolves relocations (it writes the addresses of library functions to the call sites, in many cases via indirect calls).

You can get the typical load address of a library with the ldd utility. It is actually a bash script that sets a debug environment variable of ld.so (LD_TRACE_LOADED_OBJECTS=1 in the case of glibc's rtld) and starts the program. You can also do this yourself without the script, e.g. by setting the environment variable for a single run in bash:

 LD_TRACE_LOADED_OBJECTS=1 /bin/echo 

ld.so will see this variable, resolve all needed libraries, and print their load addresses. With this variable set, ld.so will not actually start the program (not sure about static constructors of the program or libraries). If ASLR is disabled, the load address will be the same most of the time. With ASLR enabled, which is common on modern Linux systems, the address changes from run to run; to disable it, use echo 0 | sudo tee /proc/sys/kernel/randomize_va_space.
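To see whether the address really changes from run to run, one option is to trace the same binary twice and compare the reported libc base. This is only a sketch, assuming a glibc system where LD_TRACE_LOADED_OBJECTS is honored; libc_base is a made-up helper name:

```shell
# Extract the load address of the first libc line from ldd-style
# output on stdin. (libc_base is a hypothetical helper name.)
libc_base() {
    awk '/libc\.so/ { gsub(/[()]/, "", $NF); print $NF; exit }'
}

# Trace the same binary twice; nothing is actually executed.
first=$(LD_TRACE_LOADED_OBJECTS=1 /bin/echo | libc_base)
second=$(LD_TRACE_LOADED_OBJECTS=1 /bin/echo | libc_base)

echo "run 1: $first"
echo "run 2: $second"
if [ "$first" = "$second" ]; then
    echo "same base in both runs (ASLR is probably off)"
else
    echo "different bases (ASLR is on)"
fi
```

Two identical addresses are a strong hint but not proof that ASLR is off; the value in /proc/sys/kernel/randomize_va_space is the authoritative setting.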

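You can find the offset of the system function inside libc.so with the nm utility from binutils. I think you should use nm -D /lib/libc.so or objdump -T /lib/libc.so (substitute the libc path that ldd prints on your system) and grep the output. Adding that offset to the load address reported for libc gives the address of system in the running program.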
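The address of a function in the running program is the library's load address plus the function's offset inside the library. A small sketch of that arithmetic follows; the helper names sym_offset and runtime_addr are invented for this example, and the base and offset values are illustrative, not taken from a real system:

```shell
# Read `nm -D` output on stdin and print the offset of one symbol.
# Strips a version suffix such as @@GLIBC_2.2.5 before comparing.
sym_offset() {
    awk -v sym="$1" '$2 ~ /^[TtWw]$/ {
        name = $3; sub(/@.*/, "", name)
        if (name == sym) { print "0x" $1; exit }
    }'
}

# Add a load base and an offset, print the sum in hex.
runtime_addr() {
    printf '0x%x\n' "$(( $1 + $2 ))"
}

# Illustrative values only: a base as ldd might print it,
# and an offset as nm -D might list it for system.
runtime_addr 0x00007f1c2a400000 0x50d70   # prints 0x7f1c2a450d70
```

With real data the two pieces would be combined along the lines of nm -D on the libc path printed by ldd, piped into sym_offset system, and the result passed to runtime_addr together with the base from the traced run. The result only holds for a process that maps libc at that same base, so with ASLR on it differs per run.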

osgx answered Oct 07 '22 21:10
