
Dynamic Libraries, plugin frameworks, and function pointer casting in c++

I am trying to create a very open plugin framework in c++, and it seems to me that I have come up with a way to do so, but a nagging thought keeps telling me that there is something very, very wrong with what I am doing, and it either won't work or it will cause problems.

The design I have for my framework consists of a Kernel that calls each plugin's init function. The init function then turns around and uses the Kernel's registerPlugin and registerFunction to get a unique id and then register each function the plugin wants to be accessible using that id, respectively.

The function registerPlugin returns the unique id. The function registerFunction takes that id, the function name, and a generic function pointer, like so:

bool registerFunction(int plugin_id, string function_name, plugin_function func){}

where plugin_function is

typedef void (*plugin_function)();

The kernel then takes the function pointer and puts it in a map with the function_name and plugin_id. All plugins registering their functions must cast each function to type plugin_function.

In order to retrieve the function, a different plugin calls the Kernel's

plugin_function getFunction(string plugin_name, string function_name);

Then that plugin must cast the plugin_function to its original type so it can be used. It knows (in theory) what the correct type is by having access to a .h file outlining all the functions the plugin makes available. Plugins, by the by, are implemented as dynamic libraries.

Is this a smart way to accomplish the task of allowing different plugins to connect with each other? Or is this a crazy and really terrible programming technique? If it is, please point me in the direction of the correct way to accomplish this.

EDIT: If any clarification is needed, ask and it will be provided.

MirroredFate asked Feb 09 '13 00:02


2 Answers

Function pointers are strange creatures. They're not necessarily the same size as data pointers, and hence cannot be safely cast to void* and back. But, the C++ (and C) specifications allow any function pointer to be safely cast to another function pointer type (though you have to later cast it back to the earlier type before calling it if you want defined behaviour). This is akin to the ability to safely cast any data pointer to void* and back.
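To make the cast-and-cast-back rule concrete, here is a minimal sketch. The names add, add_function, g_functions and callAddThroughRegistry are made up for illustration, and the registry is only a stand-in for the kernel described in the question, not its actual implementation.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Generic function pointer type used only for storage, as in the question.
typedef void (*plugin_function)();

// Hypothetical plugin function with its real signature.
int add(int a, int b) { return a + b; }
typedef int (*add_function)(int, int);

// Hypothetical stand-in for the kernel's function table.
static std::map<std::string, plugin_function> g_functions;

bool registerFunction(const std::string& name, plugin_function func) {
    // insert() reports false in .second if the name was already taken.
    return g_functions.insert(std::make_pair(name, func)).second;
}

plugin_function getFunction(const std::string& name) {
    std::map<std::string, plugin_function>::const_iterator it = g_functions.find(name);
    return it == g_functions.end() ? nullptr : it->second;
}

// Stores add() under the generic type, then converts back before calling.
int callAddThroughRegistry(int a, int b) {
    registerFunction("add", reinterpret_cast<plugin_function>(&add));
    plugin_function generic = getFunction("add");
    assert(generic != nullptr);
    // Calling generic() directly would be undefined behaviour; the pointer
    // has to be converted back to its original type first.
    add_function original = reinterpret_cast<add_function>(generic);
    return original(a, b);
}
```

The round trip is only well defined because the caller casts back to exactly the type that was registered; nothing in the registry can check that for you.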

Pointers to methods are where it gets really hairy: a method pointer might be larger than a normal function pointer, depending on the compiler, whether the application is 32- or 64-bit, etc. But even more interesting is that, even on the same compiler/platform, not all method pointers are the same size: Method pointers to virtual functions may be bigger than normal method pointers; if multiple inheritance (with e.g. virtual inheritance in the diamond pattern) is involved, the method pointers can be even bigger. This varies with compiler and platform too. This is also the reason that it's difficult to create function objects (that wrap arbitrary methods as well as free functions) especially without allocating memory on the heap (it's just possible using template sorcery).

So, by using function pointers in your interface, it becomes impractical for the plugin authors to pass back method pointers to your framework, even if they're using the same compiler. This might be an acceptable constraint; more on this later.

Since there's no guarantee that function pointers will be the same size from one compiler to the next, by registering function pointers you're limiting the plugin authors to compilers that implement function pointers having the same size as your compiler does. This wouldn't necessarily be so bad in practice, since function pointer sizes tend to be stable across compiler versions (and may even be the same for multiple compilers).

The real problems start to arise when you want to call the functions pointed to by the function pointers; you can't safely call the function at all if you don't know its true signature (you will get poor results ranging from "not working" to segmentation faults). So, the plugin authors would be further limited to registering only void functions that take no parameters.

It gets worse: the way a function call actually works at the assembler level depends on more than just the signature and function pointer size. There's also the calling convention, the way exceptions are handled (the stack needs to be properly unwound when an exception is thrown), and the actual interpretation of the bytes of function pointer (if it's larger than a data pointer, what do the extra bytes signify? In what order?). At this point, the plugin author is pretty much limited to using the same compiler (and version!) that you are, and needs to be careful to match the calling convention and exception handling options (with the MSVC++ compiler, for example, exception handling is only explicitly enabled with the /EHsc option), as well as use only normal function pointers with the exact signature you define.

All the restrictions so far can be considered reasonable, if a bit limiting. But we're not done yet.

If you throw in std::string (or almost any part of the STL), things get even worse though, because even with the same compiler (and version), there are several different flags/macros that control the STL; these flags can affect the size and meaning of the bytes representing string objects. It is, in effect, like having two different struct declarations in separate files, each with the same name, and hoping they'll be interchangeable; obviously, this doesn't work. An example flag is _HAS_ITERATOR_DEBUGGING. Note that these options can even change between debug and release mode! These types of errors don't always manifest themselves immediately/consistently and can be very difficult to track down.

You also have to be very careful with dynamic memory management across modules, since new in one project may be defined differently from new in another project (e.g. it may be overloaded). When deleting, you might have a pointer to an interface with a virtual destructor, meaning the vtable is needed to properly delete the object, and different compilers all implement the vtable stuff differently. In general, you want the module that allocates an object to be the one to deallocate it; more specifically, you want the code that deallocates an object to have been compiled under the exact same conditions as the code that allocated it. This is one reason std::shared_ptr can take a "deleter" argument when it is constructed -- because even with the same compiler and flags (the only guaranteed safe way to share shared_ptrs between modules), new and delete may not be the same everywhere the shared_ptr can get destroyed. With the deleter, the code that creates the shared pointer controls how it is eventually destroyed too. (I just threw this paragraph in for good measure; you don't seem to be sharing objects across module boundaries.)
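As a rough sketch of the deleter idea (Widget, createWidget, destroyWidget and makeSharedWidget are hypothetical names, and everything lives in one file here purely so it can be compiled on its own), the module that allocates also exports the matching deallocation function, and the shared_ptr is told to use it:

```cpp
#include <cassert>
#include <memory>

// Hypothetical object type owned by a plugin module.
struct Widget {
    int value;
};

static int g_liveWidgets = 0;  // counts live objects, for demonstration only

// Hypothetical functions a plugin might export: allocation and deallocation
// live in the same module, so the same new/delete pair is always used.
extern "C" Widget* createWidget(int value) {
    Widget* widget = new Widget();
    widget->value = value;
    ++g_liveWidgets;
    return widget;
}

extern "C" void destroyWidget(Widget* widget) {
    assert(widget != nullptr);
    --g_liveWidgets;
    delete widget;
}

// Host-side helper: the deleter stored in the shared_ptr routes destruction
// back through destroyWidget instead of the host's own delete.
std::shared_ptr<Widget> makeSharedWidget(int value) {
    return std::shared_ptr<Widget>(createWidget(value), &destroyWidget);
}

int liveWidgetCount() { return g_liveWidgets; }
```

Whoever drops the last reference, the object still ends up being freed by the code that created it.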

All of this is a consequence of C++ having no standard binary interface (ABI); it's a free-for-all, where it is very easy to shoot yourself in the foot (sometimes without realising it).

So, is there any hope? You betcha! You can expose a C API to your plugins instead, and have your plugins also expose a C API. This is quite nice because a C API can be interoperated with from virtually any language. You don't have to worry about exceptions, apart from making sure they can't bubble up above the plugin functions (that's the authors' concern), and it's stable no matter the compiler/options (assuming you don't pass STL containers and the like). In practice each platform has one default C calling convention (cdecl on 32-bit x86, for example), and functions declared extern "C" use it unless you explicitly ask for another. void*, in practice, will be the same across all compilers on the same platform (e.g. 8 bytes on x64).

You (and the plugin authors) can still write your code in C++, as long as all the external communication between the two uses a C API (i.e. pretends to be a C module for the purposes of interop).
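For instance, a plugin entry point can be ordinary C++ on the inside while showing only C types and a status code to the outside. This is just a sketch: PluginStatus, doubleIfPositive and pluginCompute are invented names, not part of any real API.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical status codes shared through the C header.
enum PluginStatus { PLUGIN_OK = 0, PLUGIN_ERROR = 1 };

// Ordinary C++ implementation detail; free to throw.
static int doubleIfPositive(int input) {
    if (input <= 0) {
        throw std::invalid_argument("input must be positive");
    }
    return input * 2;
}

// Hypothetical exported entry point: plain C types in, status code out.
extern "C" int pluginCompute(int input, int* result) {
    if (result == nullptr) {
        return PLUGIN_ERROR;
    }
    try {
        *result = doubleIfPositive(input);
        return PLUGIN_OK;
    } catch (...) {
        // Catch everything: no exception may cross the module boundary.
        return PLUGIN_ERROR;
    }
}
```

The catch-all at the boundary is the part that matters; how errors are reported beyond a status code is up to the API designer.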

C function pointers are also likely compatible between compilers in practice, though if you'd rather not depend on this you could have the plugin register a function name (const char*) instead of address, and then you could extract the address yourself using, e.g., LoadLibrary with GetProcAddress for Windows (similarly, Linux and Mac OS X have dlopen and dlsym). This works because name-mangling is disabled for functions declared with extern "C".

Note that there's no direct way around restricting the registered functions to a single prototype (otherwise, as I've said, you can't call them properly). If you need to give a particular parameter to a plugin function (or get a value back), you'll need to register and call the functions with different prototypes separately (though you could collapse all the function pointers down to a common function pointer type internally, and only cast back at the last minute).

Finally, while you cannot directly support method pointers (which don't even exist in a C API, but are of variable size even with a C++ API and thus cannot be easily stored), you can allow the plugins to supply a "user-data" opaque pointer when registering their function, which is passed to the function whenever it's called; this gives the plugin authors an easy way to write function wrappers around methods and store the object to apply the method to in the user-data parameter. The user-data parameter can also be used for anything else the plugin author wants, which makes your plugin system much easier to interface with and extend. Another example use is to adapt between different function prototypes using a wrapper and extra arguments stored in the user-data.

These suggestions lead to code something like this (for Windows -- the code is very similar for other platforms):

// Shared header
extern "C" {
    typedef void (*plugin_function)(void*);

    bool registerFunction(int plugin_id, const char* function_name, void* user_data);
}

// Your plugin registration code
HMODULE hModule = LoadLibrary(pluginDLLPath);

// Your plugin function registration code
auto pluginFunc = (plugin_function)GetProcAddress(hModule, function_name);
// Store pluginFunc and user_data in a map keyed to function_name

// Calling a plugin function
pluginFunc(user_data);

// Declaring a plugin function
extern "C" void aPluginFunction(void*);
class Foo { public: void doSomething() { } };  // must be public to be callable from the wrapper

// Defining a plugin function
void aPluginFunction(void* user_data)
{
    static_cast<Foo*>(user_data)->doSomething();
}
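The user-data pointer also covers the prototype-adapting case mentioned above. Here is a small sketch that compiles without any DLL involved; Logger, LogCall, logTrampoline and invoke are hypothetical names, and invoke merely stands in for the host calling a registered function.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same callback shape as in the shared header above.
typedef void (*plugin_function)(void*);

// Hypothetical class whose method needs an argument.
class Logger {
public:
    void log(const std::string& message) { lines.push_back(message); }
    std::vector<std::string> lines;
};

// Hypothetical bundle: the object to call plus the extra argument.
struct LogCall {
    Logger* logger;
    const char* message;
};

// Wrapper with the single registered prototype; it unpacks the bundle
// and forwards to the real method.
extern "C" void logTrampoline(void* user_data) {
    LogCall* call = static_cast<LogCall*>(user_data);
    assert(call != nullptr && call->logger != nullptr);
    call->logger->log(call->message);
}

// Stand-in for the host: it only knows the pointer and the opaque data.
void invoke(plugin_function func, void* user_data) {
    func(user_data);
}
```

The plugin has to keep the LogCall object alive for as long as the function stays registered, since the host only stores the raw pointer.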

Sorry for the length of this reply; most of it can be summed up with "the C++ standard doesn't extend to interoperation; use C instead since it at least has de facto standards."


Note: Sometimes it's simplest just to design a normal C++ API (with function pointers or interfaces or whatever you like best) under the assumption that the plugins will be compiled under exactly the same circumstances; this is reasonable if you expect all the plugins to be developed by yourself (i.e. the DLLs are part of the project core). This could also work if your project is open-source, in which case everybody can independently choose a cohesive environment under which the project and the plugins are compiled -- but then this makes it hard to distribute plugins except as source code.


Update: As pointed out by ern0 in the comments, it's possible to abstract the details of the module interoperation (via a C API) so that both the main project and the plugins deal with a simpler C++ API. What follows is an outline of such an implementation:

// iplugin.h -- shared between the project and all the plugins
class IPlugin {
public:
    virtual ~IPlugin() { }

    // (`register` itself is a C++ keyword, hence the longer name)
    virtual void registerSelf() { }
    virtual void initialize() = 0;

    // Your application-specific functionality here:
    virtual void onCheeseBurgerEatenEvent() { }
};

// C API, implemented by each plugin module. On Windows these functions
// must also be exported from the DLL (e.g. with __declspec(dllexport)).
extern "C" {
    // Returns the number of plugins in this module
    int getPluginCount();

    // Called to register the nth plugin of this module.
    // A user-data pointer is expected in return (may be null).
    void* registerPlugin(int pluginIndex);

    // Called to initialize the nth plugin of this module
    void initializePlugin(int pluginIndex, void* userData);

    void onCheeseBurgerEatenEvent(int pluginIndex, void* userData);
}


// pluginimplementation.h -- plugin authors inherit from this abstract base class
#include "iplugin.h"
class PluginImplementation : public IPlugin {
public:
    PluginImplementation();
};


// pluginimplementation.cpp -- implements C API of plugin too
#include "pluginimplementation.h"
#include <vector>

struct LocalPluginRegistry {
    // Function-local static: constructed on first use, so it is ready even
    // if a plugin object in another source file is constructed first
    static std::vector<PluginImplementation*>& plugins() {
        static std::vector<PluginImplementation*> instances;
        return instances;
    }
};

PluginImplementation::PluginImplementation() {
    LocalPluginRegistry::plugins().push_back(this);
}

extern "C" {
    int getPluginCount() {
        return static_cast<int>(LocalPluginRegistry::plugins().size());
    }

    void* registerPlugin(int pluginIndex) {
        auto plugin = LocalPluginRegistry::plugins()[pluginIndex];
        plugin->registerSelf();
        return static_cast<void*>(plugin);
    }

    void initializePlugin(int pluginIndex, void* userData) {
        auto plugin = static_cast<PluginImplementation*>(userData);
        plugin->initialize();
    }

    void onCheeseBurgerEatenEvent(int pluginIndex, void* userData) {
        auto plugin = static_cast<PluginImplementation*>(userData);
        plugin->onCheeseBurgerEatenEvent();
    }
}


// To declare a plugin in the DLL, just make a static instance:
class SomePlugin : public PluginImplementation {
public:
    virtual void initialize() {  }
};
SomePlugin plugin;    // Will be created when the DLL is first loaded by a process


// plugin.h -- part of the main project source only
#include "iplugin.h"
#include <string>
#include <vector>
#include <windows.h>

class PluginRegistry;

class Plugin : public IPlugin {
public:
    Plugin(PluginRegistry* registry, int index, int moduleIndex)
        : registry(registry), index(index), moduleIndex(moduleIndex), userData(nullptr)
    {
    }

    virtual void registerSelf();
    virtual void initialize();

    virtual void onCheeseBurgerEatenEvent();

private:
    PluginRegistry* registry;
    int index;
    int moduleIndex;
    void* userData;
};

// Function pointer types for the C API. A linkage specification is only
// allowed at namespace scope, so these live outside the class.
extern "C" {
    typedef int (*getPluginCountFunc)();
    typedef void* (*registerPluginFunc)(int);
    typedef void (*initializePluginFunc)(int, void*);
    typedef void (*onCheeseBurgerEatenEventFunc)(int, void*);
}

class PluginRegistry {
public:
    void registerPluginsInModule(std::string const& modulePath);
    ~PluginRegistry();

public:
    std::vector<Plugin*> plugins;

private:
    struct Module {
        getPluginCountFunc getPluginCount;
        registerPluginFunc registerPlugin;
        initializePluginFunc initializePlugin;
        onCheeseBurgerEatenEventFunc onCheeseBurgerEatenEvent;

        HMODULE handle;
    };

    friend class Plugin;
    std::vector<Module> registeredModules;
};


// plugin.cpp
#include "plugin.h"

void Plugin::registerSelf() {
    auto func = registry->registeredModules[moduleIndex].registerPlugin;
    userData = func(index);
}

void Plugin::initialize() {
    auto func = registry->registeredModules[moduleIndex].initializePlugin;
    func(index, userData);
}

void Plugin::onCheeseBurgerEatenEvent() {
    auto func = registry->registeredModules[moduleIndex].onCheeseBurgerEatenEvent;
    func(index, userData);
}

void PluginRegistry::registerPluginsInModule(std::string const& modulePath) {
    // For Windows (LoadLibraryA takes a narrow string, matching std::string):
    HMODULE handle = LoadLibraryA(modulePath.c_str());
    if (handle == NULL) {
        return;    // Module could not be loaded; nothing to register
    }

    Module module;
    module.handle = handle;
    module.getPluginCount = (getPluginCountFunc)GetProcAddress(handle, "getPluginCount");
    module.registerPlugin = (registerPluginFunc)GetProcAddress(handle, "registerPlugin");
    module.initializePlugin = (initializePluginFunc)GetProcAddress(handle, "initializePlugin");
    module.onCheeseBurgerEatenEvent = (onCheeseBurgerEatenEventFunc)GetProcAddress(handle, "onCheeseBurgerEatenEvent");

    int moduleIndex = static_cast<int>(registeredModules.size());
    registeredModules.push_back(module);

    int pluginCount = module.getPluginCount();
    for (int i = 0; i < pluginCount; ++i) {
        auto plugin = new Plugin(this, i, moduleIndex);
        plugins.push_back(plugin);
    }
}

PluginRegistry::~PluginRegistry() {
    for (auto it = plugins.begin(); it != plugins.end(); ++it) {
        delete *it;
    }

    for (auto it = registeredModules.begin(); it != registeredModules.end(); ++it) {
        FreeLibrary(it->handle);
    }
}



// When discovering plugins (e.g. by loading all DLLs in a "plugins" folder):
PluginRegistry registry;
registry.registerPluginsInModule("plugins/cheeseburgerwatcher.dll");
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
    (*it)->registerSelf();
}
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
    (*it)->initialize();
}

// And then, when a cheeseburger is actually eaten:
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
    auto plugin = *it;
    plugin->onCheeseBurgerEatenEvent();
}

This has the benefit of using a C API for compatibility, but also offering a higher level of abstraction for plugins written in C++ (and for the main project code, which is C++). Note that it lets multiple plugins be defined in a single DLL. You could also eliminate some of the duplication of function names by using macros, but I chose not to for this simple example.


All of this, by the way, assumes plugins that have no interdependencies -- if plugin A affects (or is required by) plugin B, you need to devise a safe method for injecting/constructing dependencies as needed, since there's no way of guaranteeing what order the plugins will be loaded in (or initialized). A two-step process would work well in that case: Load and register all plugins; during registration of each plugin, let it register any services it provides. During initialization, construct requested services as needed by looking at the registered service table. This ensures that all services offered by all plugins are registered before any of them is used, no matter what order plugins get registered or initialized in.
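The two-step process could be sketched like this. ServiceTable, ServicePlugin, ClockPlugin, ReportPlugin and loadAll are all hypothetical names, and the sketch ignores the module boundary entirely in order to show only the ordering logic.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical service table filled during the registration phase.
class ServiceTable {
public:
    void provide(const std::string& name, void* service) {
        assert(service != nullptr);
        services[name] = service;
    }
    void* find(const std::string& name) const {
        std::map<std::string, void*>::const_iterator it = services.find(name);
        return it == services.end() ? nullptr : it->second;
    }
private:
    std::map<std::string, void*> services;
};

// Hypothetical plugin interface with the two phases.
class ServicePlugin {
public:
    virtual ~ServicePlugin() { }
    virtual void registerServices(ServiceTable& table) = 0;  // phase 1
    virtual bool initialize(const ServiceTable& table) = 0;  // phase 2
};

// Provides a "clock" service (just an int here, to keep the sketch short).
class ClockPlugin : public ServicePlugin {
public:
    ClockPlugin() : ticks(42) { }
    void registerServices(ServiceTable& table) { table.provide("clock", &ticks); }
    bool initialize(const ServiceTable&) { return true; }
private:
    int ticks;
};

// Depends on the "clock" service and looks it up during initialization.
class ReportPlugin : public ServicePlugin {
public:
    ReportPlugin() : lastTicks(-1) { }
    void registerServices(ServiceTable&) { }
    bool initialize(const ServiceTable& table) {
        int* clock = static_cast<int*>(table.find("clock"));
        if (clock == nullptr) {
            return false;  // required service is missing
        }
        lastTicks = *clock;
        return true;
    }
    int lastTicks;
};

// Runs phase 1 for every plugin before phase 2 for any of them.
bool loadAll(const std::vector<ServicePlugin*>& plugins, ServiceTable& table) {
    for (std::size_t i = 0; i < plugins.size(); ++i) {
        plugins[i]->registerServices(table);
    }
    bool ok = true;
    for (std::size_t i = 0; i < plugins.size(); ++i) {
        ok = plugins[i]->initialize(table) && ok;
    }
    return ok;
}
```

Because every registerServices call finishes before the first initialize call, the dependent plugin can appear anywhere in the list.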

Cameron answered Oct 30 '22 15:10


The approach you took is sane in general, but I see a few possible improvements.

  • Your kernel should export C functions with a conventional calling convention (cdecl, or maybe stdcall if you are on Windows) for the registration of plugins and functions. If you use a C++ function then you are forcing all plugin authors to use the same compiler and compiler version that you use, since many things like C++ function name mangling, STL implementation and calling conventions are compiler specific.

  • Plugins should only export C functions like the kernel.

  • From the definition of getFunction it seems each plugin has a name, which other plugins can use to obtain its functions. This is not a safe practice: two developers can create two different plugins with the same name, so when a plugin asks for some other plugin by name it may get a different plugin than the expected one. A better solution would be for plugins to have a public GUID. This GUID can appear in each plugin's header file, so that other plugins can refer to it.

  • You have not implemented versioning. Ideally you want your kernel to be versioned because invariably you will change it in the future. When a plugin registers with the kernel it passes the version of the kernel API it was compiled against. The kernel then can decide if the plugin can be loaded. For example, if kernel version 1 receives a registration request for a plugin that requires kernel version 2, you have a problem; the best way to address that is to not allow the plugin to load, since it may need kernel features that are not present in the older version. The reverse case is also possible: kernel v2 may or may not want to load plugins that were created for kernel v1, and if it does allow it, it may need to adapt itself to the older API.

  • I'm not sure I like the idea of a plugin being able to locate another plugin and call its functions directly, as this breaks encapsulation. It seems better to me if plugins advertise their capabilities to the kernel, so that other plugins can find services they need by capability instead of by addressing other plugins by name or GUID.

  • Be aware that any plugin that allocates memory needs to provide a deallocation function for that memory. Each plugin could be using a different run-time library, so memory allocated by a plugin may be unknown to other plugins or the kernel. Having allocation and deallocation in the same module avoids problems.
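Several of the points above (a C entry point, a GUID, a version check and a plugin-supplied deallocation function) can be combined in one registration call. The following is only a sketch under assumed names: PluginInfo, kernelRegisterPlugin, kKernelApiVersion and the result codes are invented for illustration, and the GUID string used with it would be a placeholder, not a real identifier.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical version of the kernel API implemented by this kernel.
static const int kKernelApiVersion = 2;

// Hypothetical plain-C descriptor a plugin passes when registering.
struct PluginInfo {
    int api_version;             // kernel API version the plugin was built against
    const char* guid;            // globally unique plugin identifier
    void (*free_buffer)(char*);  // frees memory that the plugin allocated
};

enum RegisterResult { REGISTER_OK = 0, REGISTER_TOO_NEW = 1, REGISTER_INVALID = 2 };

// Hypothetical kernel entry point exported with C linkage.
extern "C" int kernelRegisterPlugin(const PluginInfo* info) {
    if (info == nullptr || info->guid == nullptr ||
        std::strlen(info->guid) == 0 || info->free_buffer == nullptr) {
        return REGISTER_INVALID;
    }
    if (info->api_version > kKernelApiVersion) {
        // The plugin may rely on features this kernel does not have.
        return REGISTER_TOO_NEW;
    }
    // Older plugins are accepted here; a real kernel might instead switch
    // to a compatibility path or refuse them as well.
    return REGISTER_OK;
}

// Example plugin-side deallocation function matching free_buffer.
static void examplePluginFree(char* buffer) {
    delete[] buffer;
}
```

A plugin would fill in a PluginInfo with its own GUID, the API version from the kernel header it was compiled against, and its own deallocation function, then refuse to continue if the call returns anything other than REGISTER_OK.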

Miguel answered Oct 30 '22 14:10