Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Do shared libraries use the same heap as the application?

Tags:

c

linux

gcc

x86-64

Say I have an application in Linux that uses shared libraries (.so files). My question is whether the code in those libraries will allocate memory in the same heap as the main application or do they use their own heap?

So for example, some function in the .so file calls malloc, would it use the same heap manager as the application or another one? Also, what about the global data in those shared memories. Where does it lie? I know for the application it lies in the bss and data segment, but don't know where it is for those shared object files.

like image 725
MetallicPriest Avatar asked Jan 15 '12 01:01

MetallicPriest


People also ask

How does shared library work?

Simply put, A shared library/ Dynamic Library is a library that is loaded dynamically at runtime for each application that requires it. Dynamic Linking doesn't require the code to be copied, it is done by just placing name of the library in the binary file.

Is shared library same as dynamic library?

Dynamic and shared libraries are usually the same. But in your case, it looks as if you are doing something special. In the shared library case, you specify the shared library at compile-time. When the app is started, the operating system will load the shared library before the application starts.

Are shared libraries shared between processes?

In short, only code is shared among processes, not data.

Are shared libraries binary?

A shared object (also called a library) is a binary (usually not directly executable) used by multiple programs/applications on a Linux instance.


2 Answers

My question is whether the code in those libraries will allocate memory in the same heap as the main application or do they use their own heap?

If the library uses the same malloc/free as the application (e.g. from glibc) - then yes, program and all libraries will use the single heap.

If library uses mmap directly, it can allocate memory which is not the memory used by program itself.

So for example, some function in the .so file calls malloc, would it use the same heap manager as the application or another one?

If function from .so calls malloc, this malloc is the same as malloc called from program. You can see symbol binding log in Linux/glibc (>2.1) with

 LD_DEBUG=bindings ./your_program 

Yes, several instances of heap managers (with default configuration) can't co-exist without knowing about each other (the problem is with keeping brk-allocated heap size synchronized between instances). But there is a configuration possible when several instances can co-exist.

Most classic malloc implementations (ptmalloc*, dlmalloc, etc) can use two methods of getting memory from the system: brk and mmap. Brk is the classic heap, which is linear and can grow or shrink. Mmap allows to get lot of memory in anywhere; and you can return this memory back to the system (free it) in any order.

When malloc is builded, the brk method can be disabled. Then malloc will emulate linear heap using only mmaps or even will disable classic linear heap and all allocations will be made from discontiguous mmaped fragmens.

So, some library can have own memory manager, e.g. malloc compiled with brk disabled or with non-malloc memory manager. This manager should have function names other than malloc and free, for example malloc1 and free1 or should not to show/export this names to dynamic linker.

Also, what about the global data in those shared memories. Where does it lie? I know for the application it lies in the bss and data segment, but don't know where it is for those shared object files.

You should think both about program and .so just as ELF files. Every ELF file has "program headers" (readelf -l elf_file). The way how data is loaded from ELF into memory depends on program header's type. If the type is "LOAD", corresponding part of file will be privately mmaped (Sic!) to memory. Usually, there are 2 LOAD segments; first one for code with R+X (read+execute) flags and second is for data with R+W (read+write) flags. Both .bss and .data (global data) sections are placed in the segment of type LOAD with Write enabled flag.

Both executable and shared library has LOAD segments. Some of segments has memory_size > file_size. It means that segment will be expanded in memory; first part of it will be filled with data from ELF file, and the second part of size (memory_size-file_size) will be filled with zero (for *bss sections), using mmap(/dev/zero) and memset(0)

When Kernel or Dynamic linker loads ELF file into memory, they will not think about sharing. For example, you want to start same program twice. First process will load read-only part of ELF file with mmap; second process will do the same mmap (if aslr is active - second mmap will be into different virtual address). It is task of Page cache (VFS subsystem) to keep single copy of data in physical memory (with COPY-on-WRITE aka COW); and mmap will just setup mappings from virtual address in each process into single physical location. If any process will change a memory page; it will be copied on write to unique private physical memory.

Loading code is in glibc/elf/dl-load.c (_dl_map_object_from_fd) for ld.so and linux-kernel/fs/binfmt_elf.c for kernel's ELF loader (elf_map, load_elf_binary). Do a search for PT_LOAD.

So, global data and bss data is always privately mmaped in each process, and they are protected with COW.

Heap and stack are allocated in run-time with brk+mmap (heap) and by OS kernel automagically in brk-like process (for stack of main thread). Additional thread's stacks are allocated with mmap in pthread_create.

like image 164
osgx Avatar answered Oct 14 '22 11:10

osgx


Symbol tables are shared across an entire process in Linux. malloc() for any part of the process is the same as all the other parts. So yes, if all parts of a process access the heap via malloc() et alia then they will share the same heap.

like image 26
Ignacio Vazquez-Abrams Avatar answered Oct 14 '22 10:10

Ignacio Vazquez-Abrams