
Address space for shared libraries loaded multiple times in the same process

First off, I've already found a few references which might answer my question. While I plan on reading them soon (i.e. after work), I'm still asking here in case the answer is trivial and does not require too much supplementary knowledge.

Here is the situation: I am writing a shared library (let's call it libA.so) which needs to maintain a coherent internal (as in, static variables declared in the .c file) state within the same process. This library will be used by program P (i.e. P is compiled with -lA). If I understand everything so far, the address space for P will look something like this:

 ______________
| Program P    |
| <            |
|  variables,  |
|  functions   |
|  from P      |
| >            |
|              |
| <            |
|  libA:       |
|  variables,  |
|  functions   |
|  loaded (ie  |
|  *copied*)   |
|  from shared |
|  object      |
| >            |
| <            |
|  stuff from  |
|  other       |
|  libraries   |
| >            |
|______________|

Now P will sometimes call dlopen("libQ.so", ...). libQ.so also uses libA.so (i.e. was compiled with -lA). Since everything happens within the same process, I need libA to somehow hold the same state whether the calls come from P or Q.

What I do not know is how this will translate in memory. Will it look like this:

 ______________
| Program P    |
| <            |
|  P stuff     |
| >            |
|              |
| <            |
|  libA stuff, |
|  loaded by P |
| >            | => A's code and variables are duplicated
|              |
| <            |
|  libQ stuff  |
|  <           |
|   libA stuff,|
|   loaded by Q|
|  >           |
| >            |
|______________|

... or like this?

 ______________
| Program P    |
| <            |
|  P stuff     |
|  *libA       |
|  *libQ       |
| >            |
|              |
| <            |
|  libA stuff, |
|  loaded by P |
| >            | => A's code is loaded once, Q holds some sort of pointer to it
|              |
| <            |
|  libQ stuff  |
|  *libA       |
| >            |
|______________|

In the second case, keeping a consistent state for a single process is trivial; in the first case, it will require some more effort (e.g. some shared memory segment, using the process id as the second argument to ftok()).

Of course, since I have limited knowledge of how linking and loading work, the diagrams above may be completely wrong. For all I know, the shared libraries could be at a fixed place in memory, and every process accesses the same data and code. The behaviour could also depend on how A and/or P and/or Q were compiled. And this behaviour is probably not platform independent.

asked Mar 20 '23 00:03 by Peniblec


1 Answer

The code segment of a shared library exists in physical memory in a single instance per system. Yet it can be mapped at different virtual addresses in different processes, so different processes may see the same function at different addresses (that's why code that goes into a shared library must be compiled as position-independent code, PIC).

The data segment of a shared library is created in one copy for each process, and initialized with whatever initial values were specified in the library.

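This means that the callers of a library do not need to know if it is shared or not: all callers in one process see the same copy of the functions and the same copy of the variables defined in the library. Applied to the question, this is the second diagram: on a typical ELF system such as Linux, when P calls dlopen("libQ.so", ...), the dynamic loader notices that libA.so is already loaded in the process and resolves Q's references against that same instance, so calls from P and from Q operate on the same libA state.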
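The "loaded once per process" behaviour can be checked without building libA or libQ. The following is only a minimal sketch, assuming a Linux system with glibc: it opens one library twice and compares the handles and the address of one symbol. The helper name loaded_once is hypothetical and not part of any library.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: open the same library twice and report whether
 * both opens refer to one loaded instance.
 * Returns 1 if the handles and the symbol addresses match,
 *         0 if they differ,
 *        -1 if the library could not be opened. */
int loaded_once(const char *lib, const char *sym)
{
    void *h1 = dlopen(lib, RTLD_NOW);
    void *h2 = dlopen(lib, RTLD_NOW);

    if (h1 == NULL || h2 == NULL) {
        if (h1 != NULL) dlclose(h1);
        if (h2 != NULL) dlclose(h2);
        return -1;
    }

    /* Look the symbol up through each handle separately. */
    void *s1 = dlsym(h1, sym);
    void *s2 = dlsym(h2, sym);
    int same = (h1 == h2) && (s1 != NULL) && (s1 == s2);

    /* Each successful dlopen() took a reference; drop both. */
    dlclose(h2);
    dlclose(h1);
    return same;
}
```

Calling loaded_once("libm.so.6", "cos") from a main() is one way to try it; the library name is an assumption about a glibc system, and with glibc older than 2.34 the program has to be linked with -ldl.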

Different processes execute the same code, but have their individual copies of data, one copy per process.
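The "one copy of data per process" part can be illustrated with fork(). This is a rough sketch only: it uses a static variable in the program itself as a stand-in for a library's static state, on the assumption that library data is mapped privately in the same copy-on-write fashion. The function name is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a static variable that would live in libA's .c file. */
static int counter = 0;

/* Hypothetical helper: fork a child that modifies counter, then report
 * what each process ended up with.
 * Stores the child's final value in *child_value and returns the
 * parent's value, or -1 on error. */
int parent_counter_after_child(int *child_value)
{
    counter = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write lands in the child's private copy only. */
        counter += 41;
        _exit(counter);          /* 42 fits in the 8-bit exit status */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    *child_value = WEXITSTATUS(status);
    return counter;              /* the child's write is not visible here */
}
```

If the two processes shared the variable, the parent would also see the child's update; with per-process data it keeps its own value.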

answered Apr 27 '23 02:04 by crosser