
Solutions for dynamic dispatch on unrelated types

I'm investigating possible implementations of dynamic dispatch of unrelated types in modern C++ (C++11/C++14).

By "dynamic dispatch of types" I mean the case where, at runtime, we need to choose a type from a list by its integral index and do something with it (call a static method, use a type trait, and so on).

For example, consider a stream of serialized data: there are several kinds of data values, which are serialized and deserialized differently; there are several codecs that do the serialization and deserialization; and our code reads a type marker from the stream and then decides which codec to invoke to read the full value.

I'm interested in the case where there are many operations that could be invoked on the types (several static methods, type traits...), and where the mapping from logical types to C++ classes may vary and need not be 1:1 (in the serialization example, this means several data kinds could all be serialized by the same codec).

I also want to avoid manual code repetition and to make the code easier to maintain and less error-prone. Performance is also very important.

Currently I see the following possible implementations. Am I missing something? Can this be done better?

  1. Manually write one switch-case function for each operation that can be invoked on the types.

    size_t serialize(const Any & any, char * data)
    {
        switch (any.type) {
            case Any::Type::INTEGER:
                return IntegerCodec::serialize(any.value, data);
            ...
        }
    }
    Any deserialize(const char * data, size_t size)
    {
        Any::Type type = deserialize_type(data, size);
        switch (type) {
            case Any::Type::INTEGER:
                return IntegerCodec::deserialize(data, size);
            ...
        }
    }
    bool is_trivially_serializable(const Any & any)
    {
        switch (any.type) {
            case Any::Type::INTEGER:
                return traits::is_trivially_serializable<IntegerCodec>::value;
            ...
        }
    }
    

Pros: it's simple and understandable; the compiler can inline the dispatched methods.

Cons: it requires a lot of manual repetition (or code generation by an external tool).

  2. Create a dispatch table like this:

    class AnyDispatcher
    {
    public:
        virtual size_t serialize(const Any & any, char * data) const = 0;
        virtual Any deserialize(const char * data, size_t size) const = 0;
        virtual bool is_trivially_serializable() const = 0;
        ...
    };
    class AnyIntegerDispatcher: public AnyDispatcher
    {
    public:
        size_t serialize(const Any & any, char * data) const override
        {
            return IntegerCodec::serialize(any, data);
        }
        Any deserialize(const char * data, size_t size) const override
        {
            return IntegerCodec::deserialize(data, size);
        }
        bool is_trivially_serializable() const override
        {
            return traits::is_trivially_serializable<IntegerCodec>::value;
        }
        ...
    };
    ...
    
    // global constant
    std::array<AnyDispatcher *, N> dispatch_table = { new AnyIntegerDispatcher(), ... };
    
    size_t serialize(const Any & any, char * data)
    {
        return dispatch_table[any.type]->serialize(any, data);
    }
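    Any deserialize(const char * data, size_t size)
    {
        // No Any exists yet, so read the type marker from the stream first.
        Any::Type type = deserialize_type(data, size);
        return dispatch_table[type]->deserialize(data, size);
    }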
    bool is_trivially_serializable(const Any & any)
    {
        return dispatch_table[any.type]->is_trivially_serializable();
    }
    

Pros: it's a little more flexible: one has to write a dispatcher class for each dispatched type, but can then combine them into different dispatch tables.

Cons: it requires writing a lot of dispatching code, and there is some overhead because virtual dispatch prevents the compiler from inlining the codec's methods at the call site.

  3. Use a templated dispatch function

    template <typename F, typename... Args>
    auto dispatch(Any::Type type, F f, Args && ...args)
    {
        switch (type) {
            case Any::Type::INTEGER:
                return f(IntegerCodec(), std::forward<Args>(args)...);
            ...
        }
    }
    
    size_t serialize(const Any & any, char * data)
    {
        return dispatch(
                    any.type,
                    [] (const auto codec, const Any & any, char * data) {
                        return std::decay_t<decltype(codec)>::serialize(any, data);
                    },
                    any,
                    data
                );
    }
    bool is_trivially_serializable(const Any & any)
    {
        return dispatch(
                    any.type,
                    [] (const auto codec) {
                        return traits::is_trivially_serializable<std::decay_t<decltype(codec)>>::value;
                    }
                );
    }
    

Pros: it requires just one switch-case dispatching function and only a little hand-written code at each operation's call site. The compiler may also inline whatever it finds appropriate.

Cons: it's more complicated, requires C++14 (to be this clean and compact), and relies on the compiler's ability to optimize away the unused codec instance (which is used only to select the right overload for the codec).

  4. When one set of logical types may have several mappings to implementation classes (codecs in this example), it may be better to generalize solution #3 and write a completely generic dispatch function that receives a compile-time mapping between type values and the types to invoke. Something like this:

    template <typename Mapping, typename F, typename... Args>
    auto dispatch(Any::Type type, F f, Args && ...args)
    {
        switch (type) {
            case Any::Type::INTEGER:
                return f(mpl::map_find<Mapping, Any::Type::INTEGER>(), std::forward<Args>(args)...);
            ...
        }
    }
    

I'm leaning toward solution #3 (or #4). But I do wonder: is it possible to avoid writing the dispatch function, or rather its switch-case, by hand? The switch-case is derived entirely from the compile-time mapping between type values and types. Is there any way to have the compiler generate it?

Alexander Morozov asked Oct 07 '16 11:10


1 Answer

Tag dispatching, where you pass a type to pick an overload, is efficient. Standard library implementations typically use it for algorithms on iterators, so that different iterator categories get different implementations.
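
As an illustration of the tag dispatching just described, here is a minimal sketch. The names advance_by and advance_impl are hypothetical and chosen for this example only; the iterator category tags and std::iterator_traits come from the standard <iterator> header.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical helpers for illustration: advance_impl is overloaded on the
// iterator category tag, and advance_by picks the overload at compile time.

// Random-access iterators can jump in one step.
template <class It>
void advance_impl(It& it, std::size_t n, std::random_access_iterator_tag) {
    it += static_cast<typename std::iterator_traits<It>::difference_type>(n);
}

// Weaker categories (forward, bidirectional) convert to input_iterator_tag
// and land here, stepping one element at a time.
template <class It>
void advance_impl(It& it, std::size_t n, std::input_iterator_tag) {
    while (n--) ++it;
}

template <class It>
void advance_by(It& it, std::size_t n) {
    using category = typename std::iterator_traits<It>::iterator_category;
    advance_impl(it, n, category{});  // the tag object only selects the overload
}
```

Calling advance_by on a std::vector iterator selects the first overload, while a std::list iterator falls back to the second. No runtime branch is involved, since the choice is made by overload resolution.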

When I have a list of type ids, I ensure they are contiguous and write a jump table.

This is an array of pointers to functions that do the task at hand.
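
Written by hand, such a jump table might look like the sketch below. The type ids and handler functions are assumptions made up for this example, not something from the question.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type ids; they must be contiguous and start at 0 so that
// they can be used directly as array indexes.
enum TypeId : std::size_t { kInteger = 0, kReal = 1, kTypeCount = 2 };

// Hypothetical per-type handlers that all share one signature.
std::size_t size_of_integer() { return sizeof(int); }
std::size_t size_of_real()    { return sizeof(double); }

using Handler = std::size_t (*)();

// Slot i holds the handler for type id i.
static const Handler kSizeTable[kTypeCount] = { &size_of_integer, &size_of_real };

std::size_t serialized_size(TypeId id) {
    assert(id < kTypeCount && "type id out of range");
    return kSizeTable[id]();  // one indexed load plus one indirect call
}
```

The drawback is that the table must be kept in sync with the enum by hand, which is the part the magic switch below automates.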

You can automate writing this in C++11 or better; I call it the magic switch, as it acts like a runtime switch, and it calls a function with a compile time value based off the runtime one. I make the functions with lambdas, and expand a parameter pack inside them so their bodies differ. They then dispatch to the passed-in function object.

Write that, then you can move your serialization/deserialization code into "type safe" code. Use traits to map from compile-time indexes to type tags, and/or dispatch based on the index to an overloaded function.

Here is a C++14 magic switch:

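#include <cstddef>     // std::size_t
#include <memory>      // std::addressof
#include <type_traits> // std::integral_constant
#include <utility>     // std::index_sequence, std::forward

template<std::size_t I>using index=std::integral_constant<std::size_t, I>;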

template<class F, std::size_t...Is>
auto magic_switch( std::size_t I, F&& f, std::index_sequence<Is...> ) {
  auto* pf = std::addressof(f);
  using PF = decltype(pf);
  using R = decltype( (*pf)( index<0>{} ) );
  using table_entry = R(*)( PF );

  static const table_entry table[] = {
    [](PF pf)->R {
      return (*pf)( index<Is>{} );
    }...
  };

  return table[I](pf);
}    

template<std::size_t N, class F>
auto magic_switch( std::size_t I, F&& f ) {
  return magic_switch( I, std::forward<F>(f), std::make_index_sequence<N>{} );
}

Use looks like:

std::size_t r = magic_switch<100>( argc, [](auto I){
  return sizeof( char[I+1] ); // I is a compile-time size_t equal to argc
});
std::cout << r << "\n";


If you can register your enum-to-type map at compile time (via type traits or whatever), you can round trip through a magic switch to turn your runtime enum value into a compile-time type tag.

template<class T> struct tag_t {using type=T;};

Then you can write your serialize/deserialize like this:

template<class T>
void serialize( serialize_target t, void const* pdata, tag_t<T> ) {
  serialize( t, static_cast<T const*>(pdata) );
}
template<class T>
void deserialize( deserialize_source s, void* pdata, tag_t<T> ) {
  deserialize( s, static_cast<T*>(pdata) );
}

If we have an enum DataType, we write a traits class:

enum DataType {
  Integer,
  Real,
  VectorOfData,
  DataTypeCount, // last
};

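template<DataType> struct enum_to_type {};

// Explicit specializations map each enumerator to a type tag.
template<> struct enum_to_type<DataType::Integer>:tag_t<int> {};
// etc

void serialize( serialize_target t, Any const& any ) {
  magic_switch<DataType::DataTypeCount>(
    any.type_index,
    [&](auto type_index) {
      // type_index is an index<I>; convert its value back to the enum.
      constexpr auto e = static_cast<DataType>( decltype(type_index)::value );
      serialize( t, any.pdata, enum_to_type<e>{} );
    }
  );
}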

All the heavy lifting is now done by enum_to_type traits class specializations, the DataType enum, and overloads of the form:

void serialize( serialize_target t, int const* pdata );

which are type safe.

Note that your any is not actually an any, but rather a variant. It contains a bounded list of types, not anything.

A magic_switch like this can be used to reimplement the std::visit function, which also gives you type-safe access to the type stored within the variant.
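
For comparison, here is a minimal sketch of the same kind of dispatch using std::variant and std::visit directly. Both are C++17, so this is outside the C++11/C++14 range the question asks about. The Value alias and payload_size function are hypothetical names for this example.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical value type holding one of a bounded list of alternatives.
using Value = std::variant<int, double, std::string>;

// Illustrative operation: how many payload bytes a naive encoder would write.
std::size_t payload_size(const Value& v) {
    // std::visit turns the runtime alternative index into a call with the
    // statically typed alternative, much like the magic switch does.
    return std::visit([](const auto& x) -> std::size_t {
        using T = std::decay_t<decltype(x)>;
        if constexpr (std::is_same_v<T, std::string>) {
            return x.size();   // strings: one byte per character
        } else {
            return sizeof(T);  // arithmetic types: raw object size
        }
    }, v);
}
```

Here payload_size(Value{42}) takes the int branch and payload_size(Value{std::string("abc")}) takes the string branch, with no hand-written switch in either case.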

If you want it to contain anything, you have to determine what operations you want to support, write type-erasure code for them that runs when you store a value in the any, store the type-erased operations alongside the data, and Bob's your uncle.
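
A rough sketch of that type-erasure recipe, assuming a single supported operation. All names here (Ops, serialize_bytes, ops_for, Erased) are hypothetical, and the byte-copy serializer is only a placeholder that is valid for trivially copyable types.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical operation table: one function pointer per supported operation.
struct Ops {
    std::size_t (*serialize)(const void* obj, char* out);
};

// Placeholder serializer: copies the object's bytes into the output buffer.
template <class T>
std::size_t serialize_bytes(const void* obj, char* out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only handles trivially copyable types");
    std::memcpy(out, obj, sizeof(T));
    return sizeof(T);
}

// One static table per stored type, built when the template is instantiated.
template <class T>
const Ops* ops_for() {
    static const Ops ops = { &serialize_bytes<T> };
    return &ops;
}

class Erased {
public:
    // The concrete type T is known only here; its operations are captured now.
    template <class T>
    explicit Erased(T value)
        : data_(std::make_shared<T>(std::move(value))), ops_(ops_for<T>()) {}

    std::size_t serialize(char* out) const {
        return ops_->serialize(data_.get(), out);
    }

private:
    std::shared_ptr<const void> data_;  // shared_ptr keeps the correct deleter
    const Ops* ops_;                    // type-erased operations for the data
};
```

After construction, an Erased object no longer needs a type id: a call such as Erased(7).serialize(buf) goes straight through the stored function pointer, and buf must have room for sizeof(int) bytes.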

Yakk - Adam Nevraumont answered Oct 02 '22 16:10