
How does adding a private member variable break C++ ABI compatibility?

The pimpl idiom is commonly used to allow changing code in dynamically linked libraries without breaking ABI compatibility, so that code depending on the library does not have to be recompiled.
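For readers unfamiliar with the idiom, here is a minimal sketch of what it looks like. The class name Widget, its value() method and the member names are hypothetical and chosen only for illustration; the point is that the public class holds a single pointer, so its size as seen by callers does not depend on what the library keeps inside Impl.

```cpp
#include <cassert>
#include <memory>

// --- What would live in the public header (hypothetical Widget class) ---
class Widget {
public:
    Widget();
    ~Widget();
    int value() const;

private:
    struct Impl;                  // only declared here; defined inside the library
    std::unique_ptr<Impl> impl_;  // the only data member callers ever see
};

// --- What would live in the library's .cpp file ---
struct Widget::Impl {
    int value = 42;
    int added_later = 0;  // adding members here leaves sizeof(Widget) unchanged
};

Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;  // defined where Impl is a complete type
int Widget::value() const { return impl_->value; }

// --- What a caller would write ---
int use_widget() {
    Widget w;  // the caller reserves room for one pointer, nothing more
    assert(w.value() == 42);
    return w.value();
}
```

The destructor is defined after Impl on purpose: std::unique_ptr needs the complete type at the point where it deletes the object.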

Most of the explanations I see mention that adding a new private member variable changes the offsets of public and private members in the class. That makes sense to me. What I don't understand is how in practice this actually breaks the dependent libraries.

I've done a lot of reading on ELF files and how dynamic linking actually works, but I still don't see how changing the class size in the shared lib would break things.

For example, here is a test application (a.out) I wrote that uses code (Interface::some_method) from a test shared library (libInterface.so):

aguthrie@ana:~/pimpl$ objdump -d -j .text a.out 
08048874 <main>:
...
 8048891:   e8 b2 fe ff ff          call   8048748 <_ZN9Interface11some_methodEv@plt>

The call to some_method goes through the Procedure Linkage Table (PLT):

aguthrie@ana:~/pimpl$ objdump -d -j .plt a.out 

08048748 <_ZN9Interface11some_methodEv@plt>:
 8048748:   ff 25 1c a0 04 08       jmp    *0x804a01c
 804874e:   68 38 00 00 00          push   $0x38
 8048753:   e9 70 ff ff ff          jmp    80486c8 <_init+0x30>

which jumps indirectly through the Global Offset Table (GOT) entry stored at address 0x804a01c:

aguthrie@ana:~/pimpl$ readelf -x 24 a.out 

Hex dump of section '.got.plt':
  0x08049ff4 089f0408 00000000 00000000 de860408 ................
  0x0804a004 ee860408 fe860408 0e870408 1e870408 ................
  0x0804a014 2e870408 3e870408 4e870408 5e870408 ....>...N...^...
  0x0804a024 6e870408 7e870408 8e870408 9e870408 n...~...........
  0x0804a034 ae870408                            ....

And then this is where the dynamic linker works its magic: it searches the shared libraries the executable depends on (located via LD_LIBRARY_PATH, among other places), finds Interface::some_method in libInterface.so, and writes its address into the GOT entry, so that on subsequent calls to some_method the jump through the GOT goes straight to the code in the shared library.

Or something along those lines.

But given the above, I still don't understand how the shared lib's class size or its member offsets come into play here. As far as I can tell, the steps above are agnostic to the class size. It looks like only the symbol name of the method in the library is included in a.out. Any changes in class size should just be resolved at runtime when the dynamic linker fills in the GOT, no?

What am I missing here?

asked Oct 08 '11 19:10 by adg



1 Answer

The main problem is that when you allocate a new instance of a class (either on the stack or via new), the calling code needs to know the size of the object, and that size is compiled into the caller as a constant. If you later change the size of the object (by adding a private member), the space needed grows, but callers built against the old header still use the old size. So they do not allocate enough space to hold the object, and the object's constructor then proceeds to corrupt the stack (or heap), because it assumes it has enough space.
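A rough sketch of the size mismatch, assuming two hypothetical versions of the class placed side by side in namespaces v1 and v2 (in reality they would be the same class name in the old and new header, and some_method would be defined inside the library). The helper function names are made up for illustration. Nothing here actually corrupts memory; it only computes how far a new constructor would write past what an old caller reserved.

```cpp
#include <cassert>
#include <cstddef>

// Layout the application was compiled against (old header).
namespace v1 {
class Interface {
public:
    int some_method() const { return value_; }
private:
    int value_ = 0;
};
}  // namespace v1

// Layout the rebuilt library uses (new header, one extra private member).
namespace v2 {
class Interface {
public:
    int some_method() const { return value_ + cache_; }
private:
    int value_ = 0;
    int cache_ = 0;  // the newly added private member
};
}  // namespace v2

// The caller's compiler baked this constant into a.out when it emitted
// "Interface obj;" or "new Interface". No symbol lookup is involved.
constexpr std::size_t bytes_caller_reserves() { return sizeof(v1::Interface); }

// The library's constructor initialises every member of the new layout.
constexpr std::size_t bytes_library_writes() { return sizeof(v2::Interface); }

// How many bytes the new constructor would write past the caller's allocation.
inline std::size_t bytes_overrun() {
    assert(bytes_library_writes() > bytes_caller_reserves());
    return bytes_library_writes() - bytes_caller_reserves();
}
```

This is why the PLT/GOT mechanism does not help: it resolves the address of the constructor at runtime, but the number of bytes reserved for the object was fixed when the caller was compiled.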

Additionally, if you have any inline member functions, their code (including offsets to member variables) may be inlined into the calling code. If you add private members anywhere other than the end, these offsets will be incorrect, also leading to memory corruption (note: even if you add to the end, the size mismatch is still a problem).
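The stale-offset case can be sketched the same way. Again v1 and v2 are hypothetical stand-ins for the old and new header, with members made public only so that offsetof can be used, and old_inlined_get_id imitates what a previously inlined accessor would have been compiled down to: a read at a fixed offset from the object's address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Old header: id is the first member.
namespace v1 {
struct Interface {
    int id;
};
}  // namespace v1

// New header: a member was added in front of id, shifting its offset.
namespace v2 {
struct Interface {
    int flags;  // newly added member (would be private in a real class)
    int id;
};
}  // namespace v2

// Stand-in for an accessor that was inlined into the caller when it was
// compiled against v1: it reads an int at the offset id had back then.
inline int old_inlined_get_id(const void* object) {
    int result;
    std::memcpy(&result,
                static_cast<const char*>(object) + offsetof(v1::Interface, id),
                sizeof result);
    return result;
}

// An object created with the new layout, read by the old inlined code.
inline int demo_stale_read() {
    v2::Interface obj{};
    obj.flags = 7;
    obj.id = 42;
    int seen = old_inlined_get_id(&obj);
    assert(seen == obj.flags);  // the old offset now points at flags, not id
    return seen;
}
```

In this sketch the read merely returns the wrong member; an inlined setter using the old offset would overwrite flags instead of id, which is the memory corruption mentioned above.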

answered Oct 12 '22 14:10 by bdonlan