Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I polymorphically store and access different types from the same inheritance hierarchy in contiguous memory?

For polymorphism, the usual approach is to use std::vector<base*>. However, I have to provide the addresses myself, that is to manage the memory myself whether I use std::unique_ptr<> or raw pointers.

I would like to have a polymorphic_storage<base> type that accepts any type that inherits from base. I also want the types to be stored in contiguous memory for faster traversal and for cache related concerns.

However, there's a pretty big issue: In the absence of type information at the storage level, the correct move/copy operations must be called on resize.

Feature request:

  • Any type that inherits from the base class can be added to the storage; no fixed inheritance hierarchies.
  • The inheriting types must be correctly aligned inside of the storage type.
  • The correct move and copy operations must be called since I am not dealing with POD types.

What mechanism can I use to achieve this?


While I provide an answer, I would welcome anyone to post their solution.

like image 897
user2296177 Avatar asked Sep 07 '16 17:09

user2296177


2 Answers

Now with alignment support.

Demo: http://coliru.stacked-crooked.com/a/c304d2b6a475d70c


This answer focuses on solving the three features requested in the question.

  • No static memory is used because it will lead to code breaking changes if a new type is added to the inheritance hierarchy and that new type exceeds the static limit.
  • All types inside the storage are properly aligned.
  • The correct move/copy constructors are called when reallocation occurs.

It compiles with fstrict-aliasing, so don't be too afraid of that reinterpret_cast<>() usage.


The handle_base type has a void* data member called src_, which points to some value. It has two member functions that act on src_.

void transfer( void* dst, std::size_t& out_size )

Uses placement-new to either move or copy construct the value pointed to by src_ at dst, then sets src_ to dst. It also adds the size, in bytes, taken by the type to the out_size reference argument; this is useful for properly aligning types.

void* src()

Returns the pointer src_.

handle_base.h

namespace gut
{
    template<class T> class handle;

    class handle_base
    {
    public:
        virtual ~handle_base() = default;

        handle_base() = default;
        handle_base( handle_base&& ) = default;
        handle_base( handle_base const& ) = default;
        handle_base& operator=( handle_base&& ) = default;
        handle_base& operator=( handle_base const& ) = default;

        void* src() const noexcept
        {
            return src_;
        }

        virtual void transfer( void* dst, std::size_t& out_size ) = 0;
        virtual void destroy() = 0;

    protected:
        handle_base( void* src ) noexcept
            : src_{ src }
        {}

        void* src_;
    };
}

Next, I create the handle<T> type which inherits from handle_base in order to provide the correct move/copy operations. Type information is available at this level; this enables everything from proper alignment to proper move/copy operations.

void transfer( void* dst, std::size_t& out_size )

The function will take care of selecting whether to use the move or copy constructor. The move constructor will always be chosen if it is available. It calculates any required padding for alignment, transfers the value at src_ to dst + padding and increments the out_size reference argument by its size and padding.

handle.h

namespace gut
{
    template<class T>
    static std::size_t calculate_padding( void* p ) noexcept
    {
        std::size_t r{ reinterpret_cast<std::uintptr_t>( p ) % alignof( T ) };
        return r == 0 ? 0 : alignof( T ) - r;
    }

    template <class T>
    class handle final : public handle_base
    {
    public:
        using byte = unsigned char;

        static_assert( sizeof( void* ) == sizeof( T* ),
            "incompatible pointer sizes" );

        static constexpr std::integral_constant
        <
            bool, std::is_move_constructible<T>::value
        > is_moveable{};

        handle( T* src ) noexcept
            : handle_base( src )
        {}

        handle( handle&& ) = default;
        handle( handle const& ) = default;
        handle& operator=( handle&& ) = default;
        handle& operator=( handle const& ) = default;

        void transfer( std::true_type, void* dst )
        noexcept( std::is_nothrow_move_assignable<T>::value )
        {
            src_ = ::new ( dst ) T{ std::move( *reinterpret_cast<T*>( src_ ) ) };
        }

        void transfer( std::false_type, void* dst )
        noexcept( std::is_nothrow_copy_assignable<T>::value )
        {
            src_ = ::new ( dst ) T{ *reinterpret_cast<T*>( src_ ) };
        }

        virtual void transfer( void* dst, std::size_t& out_size )
        noexcept( noexcept(
            std::declval<handle>().transfer( is_moveable, dst ) ) ) override
        {
            std::size_t padding{ gut::calculate_padding<T>( dst ) };
            transfer( is_moveable, reinterpret_cast<byte*>( dst ) + padding );
            out_size += sizeof( T ) + padding;
        }

        virtual void destroy()
        noexcept( std::is_nothrow_destructible<T>::value )
        {
            reinterpret_cast<T*>( src_ )->~T();
            src_ = nullptr;
        }
    };
}

Since I know for a fact that sizeof( handle_base ) == sizeof( handle<T> ) for any T, I create a polymorphic_handle type as an extra indirection for ease of use. This type can hold any handle<T> and overloads operator->() so that it can act as a generic handle to any handle.

polymorphic_handle.h

namespace gut
{
    class polymorphic_handle
    {
    public:
        using value_type = gut::handle_base;
        using pointer = value_type*;
        using const_pointer = value_type const*;

        template<class T>
        polymorphic_handle( gut::handle<T> h ) noexcept
        {
            ::new ( &h_ ) gut::handle<T>{ h };
        }

        pointer operator->()
        {
            return reinterpret_cast<pointer>( &h_ );
        }

        const_pointer operator->() const
        {
            return reinterpret_cast<const_pointer>( &h_ );
        }

    private:
        std::aligned_storage_t<sizeof( value_type ), alignof( value_type )> h_;
    };
}

Now that all the building blocks are present, the polymorphic_storage<T> type can be defined. It simply stores a std::vector<gut::polymorphic_handle>, a buffer and size information.

The storage type ensures that only classes that derive from its template argument type can be added. It can only be created with an initial instance or some initial capacity (in bytes).

template<class D> void ensure_capacity()

This function does almost all of the work. It ensures that there is enough capacity for the type specified as a template argument and transfers all data to a new buffer on reallocation. It also updates the size_ member function to the next construction location.

void emplace_back( D&& value )

This will emplace value into the polymorphic_storage<B> and create a handle to the newly emplaced value.

namespace gut
{
    template<class B>
    class polymorphic_storage
    {
    public:
        using byte = unsigned char;
        using size_type = std::size_t;

        ~polymorphic_storage() noexcept
        {
            for ( auto& h : handles_ )
            {
                h->destroy();
            }
            std::free( data_ );
        }

        explicit polymorphic_storage( size_type const initial_capacity )
        {
            byte* new_data
            {
                reinterpret_cast<byte*>( std::malloc( initial_capacity ) )
            };

            if ( new_data )
            {
                data_ = new_data;
                size_ = 0;
                capacity_ = initial_capacity;
            }
            else
            {
                throw std::bad_alloc{};
            }
        }

        template
        <
            class D,
            std::enable_if_t<std::is_base_of<B, std::decay_t<D>>::value, int> = 0
        >
        explicit polymorphic_storage( D&& value )
            : data_{ nullptr }
            , size_{ 0 }
            , capacity_{ 0 }
        {
            using der_t = std::decay_t<D>;
            
            byte* new_data{ reinterpret_cast<byte*>(
                std::malloc( sizeof( der_t ) + alignof( der_t ) ) ) };
            
            if ( new_data )
            {
                data_ = new_data;
                size_ = sizeof( der_t );
                capacity_ = sizeof( der_t ) + alignof( der_t );
                handles_.emplace_back( gut::handle<der_t>
                {
                    ::new ( data_ ) der_t{ std::forward<D>( value ) }
                } );
            }
            else
            {
                throw std::bad_alloc{};
            }
        }

        template
        <
            class D,
            std::enable_if_t<std::is_base_of<B, std::decay_t<D>>::value, int> = 0
        >
        void emplace_back( D&& value )
        {
            using der_t = std::decay_t<D>;

            ensure_capacity<der_t>();
            der_t* p{ ::new ( data_ + size_ ) der_t{ std::forward<D>( value ) } };
            size_ += sizeof( der_t );
            handles_.emplace_back( gut::handle<der_t>{ p } );
        }

        template
        <
            class D,
            std::enable_if_t<std::is_base_of<B, std::decay_t<D>>::value, int> = 0
        >
        void ensure_capacity()
        {
            using der_t = std::decay_t<D>;

            auto padding = gut::calculate_padding<der_t>( data_ + size_ );
            if ( capacity_ - size_ < sizeof( der_t ) + padding )
            {
                auto new_capacity =
                    ( sizeof( der_t ) + alignof( der_t ) + capacity_ ) * 2;

                auto new_data = reinterpret_cast<byte*>(
                    std::malloc( new_capacity ) );

                if ( new_data )
                {
                    size_ = 0;
                    capacity_ = new_capacity;
                    for ( auto& h : handles_ )
                    {
                        h->transfer( new_data + size_, size_ );
                    }
                    std::free( data_ );
                    data_ = new_data;
                }
                else
                {
                    throw std::bad_alloc{};
                }
            }
            else
            {
                size_ += padding;
            }
        }

    public:
        std::vector<gut::polymorphic_handle> handles_;
        byte* data_;
        size_type size_;
        size_type capacity_;
    };
}

Here is an example of the storage in use. Note that types der0, der1 and der2 inherit from base and have different sizes and alignments.

Demo: http://coliru.stacked-crooked.com/a/c304d2b6a475d70c

#include <iostream>
#include <string>

struct base
{
    virtual ~base() = default;
    virtual void print() const = 0;
};

struct der0 : public base
{
    der0( int&& i ) noexcept : i_{ i } {}
    void print() const override { std::cout << "der0_" << i_ << '\n'; }
    int i_;
};

struct der1 : public base
{
    der1( std::string const& s ) noexcept : s_{ s } {}
    void print() const override { std::cout << "der1_" << s_ << '\n'; }
    std::string s_;
};

struct der2 : public base
{
    der2( std::string&& s ) noexcept : s_{ std::move( s ) } {}
    void print() const override { std::cout << "der2_" << s_ << '\n'; }
    std::string s_;
    double d[ 22 ];
};

int main()
{
    gut::polymorphic_storage<base> ps{ 32 };
    ps.emplace_back( der1{ "aa" } );
    ps.emplace_back( der2{ "bb" } );
    ps.emplace_back( der1{ "cc" } );
    ps.emplace_back( der2{ "ee" } );
    ps.emplace_back( der0{ 13 } );
    ps.emplace_back( der2{ "ff" } );

    for ( auto handle : ps.handles_ )
        reinterpret_cast<base*>( handle->src() )->print();
}
like image 191
user2296177 Avatar answered Nov 09 '22 23:11

user2296177


This approach tried to avoid creating virtual objects in a buffer. Instead, it creates manual vtables, and extends them.

The first thing we start with is a value vtable that lets us deal with a value virtually:

struct value_vtable {
  void(* copy_ctor)(void* dest, void const* src) = nullptr;
  void(* move_ctor)(void* dest, void* src) = nullptr;
  void(* dtor)(void* delete_this) = nullptr;
};

Creating one of these for a type T looks like this:

template<class T>
value_vtable make_value_vtable() {
  return {
    [](void* dest, void const* src) { // copy
      new(dest) T( *(T const*)src );
    },
    [](void* dest, void * src) { // move
      new(dest) T( std::move(*(T*)src) );
    },
    [](void* delete_this) { // dtor
      ((T*)delete_this)->~T();
    },
  };
}

We can store these vtables inline, or we can create static storage for them:

template<class T>
value_vtable const* get_value_vtable() {
  static auto const table = make_value_vtable<T>();
  return &table;
}

At this point we haven't stored anything.

Here is a value_storage. It can store any value (can be copied and moved and destroyed):

template<std::size_t S, std::size_t A>
struct value_storage {
  value_vtable const* vtable;
  std::aligned_storage_t<S, A> data;
  template<class T,
    std::enable_if_t<!std::is_same< std::decay_t<T>, value_storage >{}, int> =0,
    std::enable_if_t< ( sizeof(T)<=S && alignof(T)<=A ), int > = 0
  >
  value_storage( T&& tin ) {
    new ((void*)&data) std::decay_t<T>( std::forward<T>(tin) );
    vtable = get_value_vtable<std::decay_t<T>>();
  }
  // to permit overriding the vtable:
protected:
  template<class T>
  value_storage( value_vtable const* vt, T&& t ):
    value_storage( std::forward<T>(t) )
  {
    vtable = vt;
  }
public:
  void move_from( value_storage&& rhs ) {
    clear();
    if (!rhs.vtable) return;
    rhs.vtable->move_ctor( &data, &rhs.data );
    vtable = rhs.vtable;
  }
  void copy_from( value_storage const& rhs ) {
    clear();
    if (!rhs.vtable) return;
    rhs.vtable->copy_ctor( &data, &rhs.data );
    vtable = rhs.vtable;
  }
  value_storage( value_storage const& rhs ) {
    copy_from(rhs);
  }
  value_storage( value_storage && rhs ) {
    move_from(std::move(rhs));
  }
  value_storage& operator=( value_storage const& rhs ) {
    copy_from(rhs);
    return *this;
  }
  value_storage& operator=( value_storage && rhs ) {
    move_from(std::move(rhs));
    return *this;
  }
  template<class T>
  T* get() { return (T*)&data; }
  template<class T>
  T const* get() const { return (T*)&data; }
  explicit operator bool() const { return vtable; }
  void clear() {
    if (!vtable) return;
    vtable->dtor( &data );
    vtable = nullptr;
  }
  value_storage() = default;
  ~value_storage() { clear(); }
};

This type stores something that acts like a value, maybe, up to size S and align A. It does not store what type it stores, that is someone else's job. It does store how to copy, move and destroy anything it stores, but it doesn't know what the thing is it is storing.

It assumes objects constructed in a block are constructed at the front of it. You can add a void* ptr field if you do not want to make that assumption.

Now we can augment this value_storage with operations.

In particular, we want cast-to-base.

template<class Base>
struct based_value_vtable:value_vtable {
  Base*(* to_base)(void* data) = nullptr;
};

template<class Base, class T>
based_value_vtable<Base> make_based_value_vtable() {
  based_value_vtable<Base> r;
  (value_vtable&)(r) = make_value_vtable<T>();
  r.to_base = [](void* data)->Base* {
    return (T*)data;
  };
  return r;
}

template<class Base, class T>
based_value_vtable<Base> const* get_based_value_vtable() {
  static const auto vtable = make_based_value_vtable<Base, T>();
  return &vtable;
}

Now we have extended the value_vtable to include a family of "vtable with base".

template<class Base, std::size_t S, std::size_t A>
struct based_value:value_storage<S, A> {
  template<class T,
    std::enable_if_t< !std::is_same< std::decay_t<T>, based_value >{}, int> = 0
  >
  based_value( T&& tin ):
    value_storage<S, A>(
      get_based_value_vtable<Base, std::decay_t<T> >(),
      std::forward<T>(tin)
    )
  {}

  template<class T>
  based_value(
    based_value_vtable<Base> const* vt,
    T&& tin
  ) : value_storage<S, A>( vt, std::forward<T>(tin) )
  {}

  based_value() = default;
  based_value( based_value const& ) = default;
  based_value( based_value && ) = default;
  based_value& operator=( based_value const& ) = default;
  based_value& operator=( based_value && ) = default;
  based_value_vtable<Base> const* get_vt() const {
    return static_cast< based_value_vtable<Base>* >(this->vtable);
  }
  Base* get() {
    if (!*this) return nullptr;
    return get_vt()->to_base( &this->data );
  }
  Base const* get() const {
    if (!*this) return nullptr;
    return get_vt()->to_base( (void*)&this->data );
  }
};

These are regular value types that store locally, are polymorphic, and can be anything descended from Base that match the size and alignment requirements.

Simply store a vector of these. And that solves your problem. These objects satisfy the axioms of value types that std::vector expects.

live example. Code not heavily tested (you can see the very small test), it probably still contains some typos. But the design is solid, I have done this before.

Augmenting with operator* and operator-> is an excercise left to the reader. If you want extreme performance at the cost of some size, you can store the vtable function pointers inline in the class instead of in shared memory.


If you find yourself doing this more than once or adding new capabilities to your based_value, a better extension than the bespoke based_value trick would be to automate the vtable extension procedure. I'd use something like type erasure with std::any, simply replacing the any with the value_storage<Size, Align> for guaranteed automatic storage, adding in better const support, and integrating the two vtables into one (like in based_value above).

In the end, we'd get:

template<class T>
auto to_base = [](auto&& self)->copy_const< decltype(self), T& > {
  return decltype(self)(self);
};

template<class Base, std::size_t S, std::size_t A>
using based_value = super_value_storage< S, A, decltype(to_base<Base>) >;

using my_type = based_value< Some_Base, 100, 32 >;

my_type bob = // some expression
if (bob)
  return (bob->*to_base<Some_Base>)()
else
  return nullptr;

or somesuch.

All of the C-style casts can be replaced with a combination of static and const casts, but I got lazy. I don't believe I ever do anything requiring a reinterpret cast.

But really, once you have this kind of magic value-based polymorphism, why bother with a base at all? Simply erase all of the operations you want on the value, and accept anything that supports the erased operations.

With careful use of ADL in your any_methods, you can now concept-map distinct objects over to your set of concepts that must be supported. If you know how to chicken a vector of dogs, you can directly store a vector of dogs in a super_value_storage< Size, Align, ..., decltype(do_the_chicken) >.

That might be going too far, however.

like image 42
Yakk - Adam Nevraumont Avatar answered Nov 09 '22 23:11

Yakk - Adam Nevraumont