
What are the ELF header differences between an ELF object file and shared object?


First of all, I'm asking this from a technical perspective, not from the perspective of a user of library code. One example of a difference is that shared objects contain program headers and ordinary object files don't. What are the other differences?
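To make the header-level difference concrete, here is a minimal sketch that reads the `e_type` and `e_phnum` fields straight out of the ELF header. The function name `describe_elf_header` and the returned dictionary keys are made up for illustration; the field layout follows the standard ELF header definition.

```python
import struct

# e_type values from the ELF specification.
ET_NAMES = {0: "ET_NONE", 1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}


def describe_elf_header(data: bytes) -> dict:
    """Summarize the ELF header fields that differ between .o and .so files.

    `data` must hold at least the ELF header (52 bytes for 32-bit files,
    64 bytes for 64-bit files). The name and return shape are illustrative.
    """
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little, 2 = big endian
    # Fields following the 16-byte e_ident array, in order:
    # e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
    # e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
    fmt = endian + ("HHIQQQIHHHHHH" if is_64 else "HHIIIIIHHHHHH")
    fields = struct.unpack_from(fmt, data, 16)
    return {
        "type": ET_NAMES.get(fields[0], hex(fields[0])),
        "phoff": fields[4],   # file offset of the program header table
        "phnum": fields[9],   # number of program headers
        "shnum": fields[11],  # number of section headers
    }
```

Called as `describe_elf_header(open(path, "rb").read(64))`, a relocatable object reports `ET_REL` with `phnum` equal to 0, while a shared object reports `ET_DYN` with a non-zero `phnum`.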

As to the purpose of my question, I'm trying to figure out what content would need to be removed from a shared object file to have the linker treat it as an ordinary object file and attempt to relocate and static link it into the generated executable file, rather than identifying it as a shared library and generating a DT_NEEDED reference. This in turn is the first step to primitive "conversion" of a shared library to something that can be statically linked (further work to make the relocations satisfiable may be required, however).

R.. GitHub STOP HELPING ICE asked Jul 08 '11 13:07




1 Answer

One of the major differences you'll find is that, during the final link stage, a number of C runtime components are statically linked into the library, forming the _init and _fini functions among other things. These are referenced by DT_INIT and DT_FINI entries in the dynamic section (which is located through the PT_DYNAMIC program header, rather than being stored in the program headers themselves); you will need to transform these into static constructor/destructor entries. DT_NEEDED entries will be lost in a transformation into a .o; you will need to re-add them manually.
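The DT_NEEDED, DT_INIT and DT_FINI entries mentioned above are fixed-size tag/value pairs, so listing what a conversion would have to carry over is straightforward. The sketch below assumes a 64-bit file (16-byte `Elf64_Dyn` entries) and that the raw bytes of the dynamic section have already been extracted; `parse_dynamic` and `entries_to_preserve` are hypothetical helper names, not part of any existing tool.

```python
import struct

# Dynamic tag values from the ELF specification.
DT_NULL, DT_NEEDED, DT_INIT, DT_FINI = 0, 1, 12, 13
TAG_NAMES = {DT_NEEDED: "DT_NEEDED", DT_INIT: "DT_INIT", DT_FINI: "DT_FINI"}


def parse_dynamic(blob: bytes, little_endian: bool = True) -> list:
    """Decode Elf64_Dyn entries (d_tag, d_val) up to the DT_NULL terminator.

    Assumption: `blob` is the raw content of a 64-bit .dynamic section.
    """
    entry = struct.Struct(("<" if little_endian else ">") + "qQ")
    entries = []
    for offset in range(0, len(blob) - entry.size + 1, entry.size):
        tag, value = entry.unpack_from(blob, offset)
        if tag == DT_NULL:  # DT_NULL marks the end of the array
            break
        entries.append((tag, value))
    return entries


def entries_to_preserve(blob: bytes) -> list:
    """Return (name, value) pairs for the tags discussed in the answer.

    For DT_NEEDED the value is an offset into the dynamic string table;
    for DT_INIT and DT_FINI it is the address of the function.
    """
    return [(TAG_NAMES[t], v) for t, v in parse_dynamic(blob) if t in TAG_NAMES]
```

Each DT_NEEDED value still has to be resolved against the string table named by DT_STRTAB to recover the library name; that step is left out here to keep the sketch short.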

The PLT generated in the final link stage needs to be either merged with the final output file, or transformed back into ordinary relocations; this is non-trivial, as the PLT is just code. The GOT is also an issue: it's located at a fixed offset relative to the code in .text, and contains pointers to data objects. It also, however, contains a pointer to the _DYNAMIC structure, of which there can only be one per library or executable. And you can't change offsets in the GOT, because they're referenced directly from code.

So it's quite difficult to convert a .so to a true .o again; information has been lost in the conversion to PLT/GOTs. A better approach might be to alter the dynamic linker in the C library to support linking a shared library that's already mapped in memory as a static image. That is, you'd convert the .so to a .o simply by converting it to a page-aligned read-only section; then pass this to the dynamic linker to remap with appropriate permissions and perform normal shared library initialization. Then add a static constructor to call into the C library to initialize the shared library. Finally, add appropriate exported symbols to correspond to dynamic symbols in the shared library's .text segment.

One problem with this approach, though, is that other static constructors might run before the static constructor that initializes your fake solib. In this case, they must not attempt to call functions from the solib, or you'll probably crash, as the solib is not yet initialized. This could potentially be avoided by making the exported symbols point to trampoline functions that ensure the solib is initialized first (not so easy with data symbols, though!).
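The trampoline idea is essentially lazy initialization in front of every exported function. The following is only a conceptual model in Python, not linker or loader code: `FakeSolib`, `ensure_initialized` and `trampoline` are invented names, and the real thing would be small machine-code stubs calling into the C library.

```python
import threading


class FakeSolib:
    """Hypothetical stand-in for the embedded shared library image."""

    def __init__(self):
        self.initialized = False
        self.init_count = 0
        self._lock = threading.Lock()

    def ensure_initialized(self):
        # Double-checked locking: the common, already-initialized path
        # takes no lock; the first caller performs initialization once.
        if not self.initialized:
            with self._lock:
                if not self.initialized:
                    self.init_count += 1  # stands in for the real init work
                    self.initialized = True


def trampoline(solib, func):
    """Wrap an exported function so the solib is initialized before it runs."""
    def wrapper(*args, **kwargs):
        solib.ensure_initialized()
        return func(*args, **kwargs)
    return wrapper
```

A function wrapped this way is safe to call from a constructor that runs early, because the first call triggers initialization. Data symbols have no call to intercept, which is why the same trick does not carry over to them.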

You might also find this previous question useful.

bdonlan answered Oct 25 '22 12:10