Replacing shared object (.so file) while main program is running

I have a shared object gateway.so (in Linux/C). And a.out application is using it.

QUESTION A

My guess: when the process a.out starts, the loader loads gateway.so (I am not using dl functions like dlopen). So all runtime symbol resolution against gateway.so happens in memory, and the process no longer needs to access gateway.so on disk.

Am I right?

So I cannot replace the gateway.so with an updated version, while a.out is running, right?

QUESTION B

Another related question: once, when I substituted an outdated version of the gateway.so file, I got the message

"a.out: can't resolve symbol 'Test_OpenGateway'"

Which program component (loader, linker, ...) prints this output? Does that component execute as part of the same process context?

Lunar Mushrooms asked Oct 14 '11 12:10


1 Answer

Question A

You can replace the library while an application is using it, if you do it the right way.

Before we get there, let's have a look at the main program binary. Here is an example program, example.c:

#include <stdio.h>
#include <unistd.h>

void justsit(void) {
  for (;;) {
    sleep(1);
  }
}

int main(int argc, char **argv) {
  printf("My PID is %d\n", getpid());
  justsit();
  return 0;
}

Compile and start it:

$ gcc -Wall -o example example.c
$ ./example
My PID is 4339

Now it will just sit there, so open a new terminal to do this:

$ gcc -Wall -o example-updated example.c
$ cp example-updated example
cp: cannot create regular file `example': Text file busy

What happened? The kernel refused to let cp write to the file example because a running process is executing it (this is the ETXTBSY error).

Now let's try to remove it:

$ rm example

What? That worked? Why can the file be removed, but not overwritten? The file was not really removed, only its name. The kernel tells the filesystem to keep the contents for as long as something still has the file open, and the contents are freed once the last user is gone. (As filesystem people would say: the dentry is removed immediately, but the inode is freed only when it has no users left.)

This can be seen in /proc (which is why the program prints its PID, so you can easily check):

$ readlink /proc/4339/exe
/tmp/t/example (deleted)
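The same behaviour can be reproduced from C. What follows is only a minimal sketch: read_after_unlink is a hypothetical helper name and the path is whatever scratch file you pass in. It creates a file, removes its name while keeping the descriptor open, and then reads the contents back through that descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path`, keep it open, unlink it, and read
 * the contents back through the still-open descriptor.
 * Returns the number of bytes read back, or -1 on error.
 */
ssize_t read_after_unlink(const char *path, char *buf, size_t buflen)
{
    static const char msg[] = "still here";
    size_t len = strlen(msg);
    ssize_t n = -1;

    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, msg, len) == (ssize_t)len &&
        unlink(path) == 0 &&          /* the name is gone from the directory */
        access(path, F_OK) != 0 &&    /* so a lookup by name now fails */
        lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, buf, buflen);    /* but the inode is still readable */

    close(fd);                        /* last user gone: contents are freed */
    return n;
}
```

Calling read_after_unlink("/tmp/demo.tmp", buf, sizeof buf) should hand back the bytes that were written, even though the name no longer exists at the time of the read.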

Anyhow, the fact that it works like this means that one can safely upgrade a program by removing the old binary and putting the new one in the same place. There is a program to handle this: install(1).
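If you need to do the replacement from your own code, a closely related variant is to write the new contents to a temporary file and rename(2) it over the old name. This is a sketch under assumptions, not a description of how install(1) is implemented: replace_by_rename is a hypothetical helper, and the 0755 mode is just a plausible choice for a binary.

```c
#define _POSIX_C_SOURCE 200809L   /* expose mkstemp() and fchmod() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: publish `len` bytes of `data` under the name `dst`
 * without ever writing into the file that currently has that name.
 * Returns 0 on success, -1 on error.
 */
int replace_by_rename(const char *dst, const void *data, size_t len)
{
    char tmp[4096];
    const char *p = data;

    assert(dst != NULL && data != NULL);

    /* Temp file next to dst, so rename() stays within one filesystem. */
    int n = snprintf(tmp, sizeof tmp, "%s.XXXXXX", dst);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    int fd = mkstemp(tmp);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0)
            goto fail;
        p += w;
        len -= (size_t)w;
    }
    if (fchmod(fd, 0755) != 0)
        goto fail;
    if (close(fd) != 0 || rename(tmp, dst) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;   /* dst now names the new inode; the old one lives on while in use */

fail:
    close(fd);
    unlink(tmp);
    return -1;
}
```

A process that already has the old file open or mapped keeps seeing the old contents; only new lookups of the name get the new file.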

Ok, back to your question - shared objects.

Let's split the example into two parts, main.c and shared.c:

/* main.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

void justsit(void);

int main(int argc, char **argv) {
  printf("My PID is %d\n", getpid());
  justsit();
  return 0;
}

and

/* shared.c */
#include <stdio.h>
#include <unistd.h>

void justsit(void) {
  for (;;) {
    sleep(1);
  }
}

Compile them like this:

$ gcc -Wall -shared -fPIC -o libshared.so shared.c
$ gcc -Wall -L. -o main main.c -lshared

Hopefully, trying to replace libshared.so will give a similar "Text file busy" error? Let's see. First start the main program. The current directory is not in the library search path, so tell the dynamic linker to search there:

$ LD_LIBRARY_PATH=. ./main 
My PID is 5697

Go to a different terminal and replace the library with something obviously broken:

$ echo "junk" > libshared.so 
$

First, the write was not refused the way overwriting the program binary was. Second, something interesting happened in the other terminal: the program stopped running with the following error message:

Segmentation fault
$

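So it is NOT forbidden to overwrite a library that a program is using! But as the example shows, it can have disastrous consequences: the shell redirection truncates and rewrites the existing file in place, so the pages the running process has mapped from it no longer hold the original code.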
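The difference between overwriting in place and removing first can be checked by comparing inode numbers. The following is a sketch only; same_file, overwrite_in_place and remove_then_create are hypothetical helper names, and the open descriptor stands in for the running process that has the library mapped.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: does the name `path` still refer to the file open
 * on `fd`? Returns 1 if yes, 0 if no, -1 on error. */
int same_file(int fd, const char *path)
{
    struct stat a, b;

    assert(fd >= 0 && path != NULL);
    if (fstat(fd, &a) != 0 || stat(path, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Overwrite in place, as `echo junk > file` does: same inode, new bytes. */
int overwrite_in_place(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    ssize_t w = write(fd, "junk\n", 5);
    close(fd);
    return w == 5 ? 0 : -1;
}

/* Remove the name first, then create a fresh file: a new inode. */
int remove_then_create(const char *path)
{
    if (unlink(path) != 0)
        return -1;
    return overwrite_in_place(path);  /* O_CREAT now creates a new file */
}
```

With a descriptor held open on the file, same_file should report 1 after overwrite_in_place (the holder sees the junk) and 0 after remove_then_create (the holder still has the old, untouched inode).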

Luckily, the same "trick" that was used to replace a running binary can be used to replace a library that is in use. Restart the main program (don't forget to recompile libshared.so first, as it was replaced by junk) and see that it is safe to rm the library. /proc/PID/maps can be inspected to see which shared objects the process is using:

$ cat /proc/5733/maps  | grep libshared.so
008a8000-008a9000 r-xp 00000000 08:01 2097292    /tmp/t/libshared.so
008a9000-008aa000 r--p 00000000 08:01 2097292    /tmp/t/libshared.so
008aa000-008ab000 rw-p 00001000 08:01 2097292    /tmp/t/libshared.so
$ rm libshared.so 
$ cat /proc/5733/maps  | grep libshared.so
008a8000-008a9000 r-xp 00000000 08:01 2097292    /tmp/t/libshared.so (deleted)
008a9000-008aa000 r--p 00000000 08:01 2097292    /tmp/t/libshared.so (deleted)
008aa000-008ab000 rw-p 00001000 08:01 2097292    /tmp/t/libshared.so (deleted)

The main program continues to run fine. Again this is because just the name (dentry) was removed from disk, not the actual contents (inode). After the removal it is safe to create a new file with the name libshared.so without affecting the running program.
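A process can run the same check on itself by reading /proc/self/maps. This is a Linux-specific sketch; count_mappings is a hypothetical helper, and it does a plain substring match on each line, so a needle such as "libshared.so (deleted)" would pick out only the mappings of a removed file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: count the lines of /proc/self/maps that contain
 * `needle`. Returns -1 if the file cannot be opened (e.g. not Linux).
 * Lines longer than the buffer are read in pieces, which is good enough
 * for a sketch but could count one mapping twice.
 */
int count_mappings(const char *needle)
{
    char line[4096];
    int count = 0;

    assert(needle != NULL);

    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        if (strstr(line, needle) != NULL)
            count++;
    }
    fclose(f);
    return count;
}
```

To inspect another process instead, the path would be /proc/PID/maps with that process's PID, as in the shell example above.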

So, to summarize: just use the install command to install programs and libraries.

Question B

Yes, that message is printed by the dynamic linker, which runs in userspace as part of the same process. A small test program, execit.c, shows this:

#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv) {
    execl("./main", "main", (char *)NULL);
    printf("exec failed?\n");
    return 0;
}

Compile it with gcc -Wall -o execit execit.c. Remember that execl replaces the current process with the specified command.

$ ./execit 
main: error while loading shared libraries: libshared.so: cannot open shared object file: No such file or directory
$ rm main
$ ./execit 
exec failed?

What happened, and what does it tell us? In the first run, libshared.so cannot be found, and the "error while loading shared libraries" message appears without "exec failed?". The missing "exec failed?" shows that the process was successfully replaced: the kernel transferred control to the dynamic linker, which then failed inside the new process image. After main was removed, exec fails early in the kernel, the process is not replaced, and execit goes on to print "exec failed?".
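The two failure modes can also be told apart programmatically. One common idiom, sketched here with a hypothetical helper spawn_and_classify, is a close-on-exec pipe: if exec succeeds, the write end closes silently and the parent reads end-of-file; if exec fails in the kernel, the child reports errno through the pipe. The exit code 127 used for the failed-exec case is this sketch's own choice.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `path` in a child process.
 * Returns -1 on setup error, otherwise 0 with:
 *   *exec_errno == 0  exec succeeded; *status is the wait status of what
 *                     ran afterwards (dynamic linker, then the program)
 *   *exec_errno != 0  exec itself failed with that errno
 */
int spawn_and_classify(const char *path, int *exec_errno, int *status)
{
    int pfd[2];

    assert(path != NULL && exec_errno != NULL && status != NULL);

    if (pipe(pfd) != 0)
        return -1;
    /* Close-on-exec: the write end disappears only if exec succeeds. */
    if (fcntl(pfd[1], F_SETFD, FD_CLOEXEC) != 0) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }
    if (pid == 0) {                       /* child */
        close(pfd[0]);
        execl(path, path, (char *)NULL);
        int e = errno;                    /* only reached if exec failed */
        if (write(pfd[1], &e, sizeof e) < 0) {
            /* nothing more the child can do */
        }
        _exit(127);
    }

    close(pfd[1]);                        /* parent */
    int e = 0;
    ssize_t n = read(pfd[0], &e, sizeof e);
    close(pfd[0]);
    *exec_errno = (n == (ssize_t)sizeof e) ? e : 0;

    if (waitpid(pid, status, 0) < 0)
        return -1;
    return 0;
}
```

For a missing binary, exec_errno should come back as ENOENT. For a binary whose shared library is missing, exec_errno should be 0 and the nonzero exit status comes from the dynamic linker running inside the new process.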

Anders Waldenborg answered Oct 10 '22 18:10