How does passing objects and calling member functions between libraries work?

Tags: c++, gcc, qt
I am trying to understand what happens when I add a class file of the core application to the compilation of a library (a Qt plugin). Assume I have a plugin (a handler) and a Query class (query.h and query.cpp, with a private implementation), which is the object to be handled.

Edit

query.h (from link)

class Query final
{
public:
    friend class ExtensionManager;

    Query(const QString &term);
    ~Query();
    void addMatch(shared_ptr<AlbertItem> item, short score = 0);
    void reset();
    void setValid(bool b = true);
    bool isValid();
private:
    QueryPrivate *impl;
};

I presumed that the compiler, or at least the linker, takes the object file and puts it into the shared object file. But the name query does not appear anywhere in the output of the cmake compilation and linking process (essentially the g++ commands executed), only the include path of its directory.

When I compile the plugin/library, does the compiler/linker do anything other than check the interface/header? How can the plugin know anything about the Query at runtime? How does the plugin call functions on an object at runtime?

ManuelSchneid3r asked Oct 19 '22 08:10



1 Answer

How can the plugin know about the query at runtime?

Sharing information between different binary modules (DLLs, shared objects, executables) is a problematic piece of design.

  • There is no standard C++ ABI. Different compiler vendors may therefore lay out their objects differently (e.g. where the vtable pointer is, where the virtual destructor sits in the vtable) and call methods differently, even on the same machine.
  • The .h file is a weak interface definition: #defines may differ between separate compilations of the same header (e.g. the Microsoft debug STL does not work with the release STL).
  • Inline functions stored in the .h may result in different implementations being called in the library and in the plugin.
  • Memory management may be compromised, as the code freeing an object may not know where and how it was allocated.
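
One common way around the memory management point is to let the module that allocates an object also export the function that frees it. The following is only a minimal sketch under stated assumptions: Widget, CreateWidget, DestroyWidget and LiveWidgets are hypothetical names, not part of the Query code in the question.

```cpp
// Hypothetical plugin-side type; the name is illustrative only.
struct Widget {
    int id;
};

// Counts live objects so the create/destroy pairing can be checked.
static int g_liveWidgets = 0;

// extern "C" keeps the exported names unmangled, which sidesteps
// name-mangling differences between compilers.
extern "C" Widget *CreateWidget(int id) {
    ++g_liveWidgets;
    return new Widget{id};   // allocated by this module's allocator
}

// The same module frees the object, so new and delete always match.
extern "C" void DestroyWidget(Widget *w) {
    if (w != nullptr) {
        --g_liveWidgets;
        delete w;
    }
}

extern "C" int LiveWidgets() { return g_liveWidgets; }
```

A caller in another module would hold only the Widget pointer and hand it back to DestroyWidget instead of calling delete itself.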

Modifying the data

Assuming a class has public members (and both modules share a compiler), these can be read and modified both in the module which created the object and in the module which receives it.

class Example1 {
    public:
      int value1;
};

in executable.

example1.value1 = 12;

in plugin

if( this->value1 == 12 ){
}

This does not work reliably for complex objects, e.g. std::string, whose internal layout depends on the standard library implementation and build settings.
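
A minimal sketch of the distinction, assuming C++11 or later: Example1 is a plain layout that two modules built with the same compiler can agree on, while std::string is not trivially copyable. BoundaryString, ToBoundary and FromBoundary are hypothetical helper names introduced here for illustration; they pass the characters as a pointer and a length, and the receiver builds its own std::string.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

struct Example1 {
    int value1;
};

// Plain data: the layout is fixed by the platform ABI.
static_assert(std::is_standard_layout<Example1>::value, "plain layout");
static_assert(std::is_trivially_copyable<Example1>::value, "plain copy");

// std::string owns memory and has non-trivial copy semantics.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string is not plain data");

// Hypothetical boundary type: pointer plus length only.
struct BoundaryString {
    const char *data;
    std::size_t size;
};

// The view is valid only while the source string is alive and unmodified.
inline BoundaryString ToBoundary(const std::string &s) {
    return BoundaryString{s.data(), s.size()};
}

// The receiving module copies the bytes into its own std::string.
inline std::string FromBoundary(BoundaryString b) {
    return std::string(b.data, b.size);
}
```

The design choice is that each module only ever constructs and destroys its own std::string objects; only the raw bytes cross the boundary.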

Calling functions

class Example2 {
      public:
      void AFunction();
};

Any caller of AFunction needs an implementation available at link or load time. The call is bound statically (not through a vtable), and the implementation may be shared between the binary and the shared object.

 +-------------------+          +-----------------------+
 | binary            |          | shared object         |
 | Query::AFunction()|          |                       |
 | {                 |          |  Process( Query &q )  |
 | }                 |          |  {                    |
 |                   |    o-->  |     q.AFunction();    | <<< may be in
 | main()            |    |     |                       | shared object
 | {                 |    |     |                       | could call binary
 |    Query q;       |    |     |                       |
 |    Process( q );  | ===o     |                       |
 +-------------------+          +-----------------------+

If the shared object had its own implementation (because it was an inline function, or because query.cpp was included in the shared object's makefile), then the implementation of AFunction it calls may be distinct from the one in the binary.

With the STL, both binaries will have their own implementation, which, if they are compiled at different times, may be different (and incompatible).

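A shared object on unix behaves such that if it has unresolved externals which are satisfied by the binary loading it, it will use that binary's implementation. For this to work, the executable generally has to export its symbols to the dynamic symbol table (e.g. by linking it with -rdynamic). This is not true on Windows, where all symbols must be resolved when the DLL is linked; the Windows behavior can be reproduced on unix by passing -z defs to the linker (-Wl,-z,defs with g++).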

In order to call a non-virtual function, the caller needs to know about the class at compile time. The call has a fixed target, with the this pointer (generally) passed as the first, hidden parameter. Thus the compiler generates a direct call to the function (or a call through a fix-up table).
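
The fixed call with a hidden this parameter can be pictured with a free function. This is a conceptual sketch only: the calls member and the Example2_AFunction name are assumptions added so the effect is observable, and the real calling convention is decided by the platform ABI.

```cpp
class Example2 {
public:
    int calls = 0;      // assumption: added so the calls can be counted
    void AFunction();
};

// Out-of-line definition: exactly one module has to provide this symbol.
void Example2::AFunction() { ++calls; }

// Roughly what a non-virtual member call amounts to: an ordinary
// function that receives the object as an explicit first argument.
void Example2_AFunction(Example2 *self) { ++self->calls; }

void Process(Example2 &q) {
    q.AFunction();            // target fixed at compile/link time
    Example2_AFunction(&q);   // the same idea, spelled out by hand
}
```

In both calls the target is known without looking inside the object, which is why the caller needs the symbol to be resolvable at link or load time.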

Calling virtual functions

Virtual functions are always called through the vtable pointer stored in the object, which means that the virtual function implementation for a class is 'chosen' by the code which constructs the object. This is used in Windows for COM implementations, and is a useful technique for object sharing: it allows new classes with different functionality to be delivered after a framework has been compiled, and the framework can still call the implementation object without any knowledge of it.

The vtable layout needs to be stable for this to work: the base class or interface must be the same when the caller and the callee are compiled.

When designing a library, it is possible to produce an interface object.

class ICallback {
public:
     virtual void Function1( class MyData * data ) = 0;
};

When the library is being compiled, it does not know what implements ICallback and any of its functions, but it does know how to call those.

So a function declaration such as

class Plugin {
public:
     bool Process( ICallback * pCallback );
};

Allows a function to be declared and implemented without knowing the implementation of the callback (ICallback). This does not create an unresolved symbol, nor does it require the plugin to know about the concrete type before the plugin is compiled. All it requires is that its caller ( m_pluginObject.Process( &myQueryImplementation ); ) has created a concrete type to pass in.
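
Put together, the interface approach might look like the sketch below. It is a single-file illustration under stated assumptions: the contents of MyData, the body of Process, the virtual destructor and the QueryImplementation class are all hypothetical, and the interface method is spelled Function1 here. In a real plugin setup, Plugin would live in the shared object and QueryImplementation in the application.

```cpp
// Assumption: a trivial payload type, just to have something to pass.
struct MyData {
    int value;
};

class ICallback {
public:
    virtual ~ICallback() = default;
    virtual void Function1(MyData *data) = 0;
};

// Plugin side: compiled knowing only the ICallback interface.
class Plugin {
public:
    bool Process(ICallback *pCallback) {
        if (pCallback == nullptr) {
            return false;
        }
        MyData data{42};
        pCallback->Function1(&data);   // dispatched through the vtable
        return true;
    }
};

// Application side: the concrete type the plugin never sees.
class QueryImplementation : public ICallback {
public:
    int lastValue = 0;
    void Function1(MyData *data) override { lastValue = data->value; }
};
```

Process contains no reference to QueryImplementation, so nothing about that class has to be linked into the plugin; the call finds its target through the object's vtable at runtime.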

Compilation

When a compiler compiles code, it creates an object file (.obj on Windows and .o on unix).

Within this file are all the code and data definitions required to link the file.

Notional object file

<dictionary>
    int SomeIntValue = Address1
    bool Class1::SomeFunction( char * value ) = Address2
</dictionary>
<Requires>
    std::ostream::operator<<( const char *);
    std::cout
</Requires>
<Data>
      Address1 : SomeIntValue = 12
</Data>
<Code>
    Address2 .MangledSomeFunctionCharStarBool
                  // some assembly
          call ostream::operator<<(char*)
</Code>

This object file should have sufficient information within it to satisfy a part of the build process. Whilst normally a file such as MyClass.cc may have all of the functions needed to implement MyClass, it does not need to have all of them.

When the compiler is reading a header file, or any class declarations, it is building a list of names which may become unresolved externals if they are used but not defined in the same file.

 class Class1 {
       int ClassData;
    public:
        bool SomeFunction( char * value);
        ....
 };

This describes a member function of Class1 which accepts a char * as a parameter and returns a bool. As the compiler continues through a C++ program, this unresolved function may be implemented when the compiler sees something such as

  bool Class1::SomeFunction( char * value )
  {
      bool success = false;
      std::cout << value;
      // some work
      return success;
  }

This implemented function is added to the dictionary of what is implemented, and the functions and data it needs are added to the requirements.
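
The split between declaration and definition can be seen in one file. This is a sketch under assumptions: CallIt is a hypothetical caller, the parameter is const char * (rather than char *) so a string literal can be passed, and the function body is invented for illustration.

```cpp
#include <cstring>

// Declaration only: enough for the compiler to type-check and emit calls.
class Class1 {
    int ClassData = 0;
public:
    bool SomeFunction(const char *value);
};

// Compiled with only the declaration in view. If the definition lived in
// another file, this call would be recorded as an unresolved external.
bool CallIt(Class1 &c) {
    return c.SomeFunction("hello");
}

// The definition adds Class1::SomeFunction to the implemented symbols
// and adds std::strlen to the requirements.
bool Class1::SomeFunction(const char *value) {
    ClassData = static_cast<int>(std::strlen(value));
    return ClassData > 0;
}
```

Moving the last function into a separate .cc file would not change CallIt at all; only the linker would notice where the definition comes from.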

Library files

A library file is slightly different on unix and Windows. Originally the unix library file was a container of .o files: these were simply concatenated into an archive (ar). Then, in order to find the correct items, the library was indexed (ranlib) to produce a working library. More recently I believe the archive format has changed, but the concepts remain.

link library

On Windows a link library is created when building a DLL; on unix, the link library is built into the shared object itself.

The link library is a list of the deliverables from the dynamically loaded object and the name of the .dll or .so which delivers them. This results in information being added to the binary such as :-

<SharedObjects>
     printf : glibc:4.xx
</SharedObjects>

This describes the shared objects which need to be loaded, and the functions that they provide (the subset needed by this program).

Linking

When the linker is producing a binary (.so, .dll, .exe or unix binary), the object files specified on the command line are bound into the binary. This creates a set of implemented functions (e.g. main), and a set of unresolved requirements.

Each library (.a, .lib) is then searched to see if it offers any of the functions required to make a complete program. If it does offer a function, then that function is treated as resolved. The single object file which implements the resolved function is added to the binary in its entirety.

These object files may also have requirements of their own, and these are either :-

  1. Resolved by what is already in the binary
  2. Added to the unresolved values.

Note here that the order of libraries is important, as only the parts of a library required at that point are added to the binary.

On windows if this process succeeds, then all the functions required have been added.

On unix, a .so is by default allowed to have some of its requirements satisfied later by the loading binary, which can result in an incomplete binary. To have the linker report these instead, you may need to pass -z defs (-Wl,-z,defs with g++); see the SO question on unresolved externals.

In summary

A binary has :-

  1. All the object files from the link command line.
  2. Any of the object files from static libraries required to satisfy unresolved externals
  3. A list of shared objects and their functions required to deliver the working program.

Separately, using interfaces and base classes allows new classes to be added after the original design has been completed.
mksteve answered Nov 14 '22 04:11
