Portable emulation of flexible array member in C++?

Question

I'm writing a skip list.

What I have:

template<typename T>
struct SkipListNode
{
    T data;
    SkipListNode* next[32];
};

The problem with this code is that it wastes space - it requires all nodes to contain 32 pointers. Especially considering that in typical list, half of the nodes will only need one pointer.

The C language has a neat feature called flexible array member that could solve that problem. Had it existed in C++ (even for trivial classes), I could write code like this:

template<typename T>
struct SkipListNode
{
    alignas(T) char buffer[sizeof(T)];
    SkipListNode* next[];
};

and then manually create nodes with a factory function and destroying them when deleting elements.

Which brings the question - how can I emulate such functionality portably, without undefined behaviour in C++?

I considered mallocing the buffer and then manipulating the offsets appropriately by hand - but it's too easy to violate the alignment requirements - if you malloc(sizeof(char) + sizeof(void*)*5), the pointers are unaligned. Also, I'm not even sure if such hand-created buffers are portable to C++.

Note that I don't require the exact syntax, or even ease of use - this is a node class, internal to the skip list class, which won't be a part of the interface at all.

milleniumbug · Accepted Answer

This is the implementation I wrote, based on R. Martinho Fernandes's idea - it constructs a buffer that happens to have a correct size and alignment in specific places (the AlignmentExtractor is used extract the offset of the pointer array, which ensures that the pointers in the buffer have correct alignment). Then, placement-new is used to construct the type in the buffer.

T isn't used directly in AlignmentExtractor because offsetof requires standard layout type.

#include <cstdlib>
#include <cstddef>
#include <utility>

template<typename T>
struct ErasedNodePointer
{
    void* ptr;
};

void* allocate(std::size_t size)
{
    return ::operator new(size);
}

void deallocate(void* ptr)
{
    return ::operator delete(ptr);
}

template<typename T>
struct AlignmentExtractor
{
    static_assert(alignof(T) <= alignof(std::max_align_t), "extended alignment types not supported");
    alignas(T) char data[sizeof(T)];
    ErasedNodePointer<T> next[1];
};

template<typename T>
T& get_data(ErasedNodePointer<T> node)
{
    return *reinterpret_cast<T*>(node.ptr);
}

template<typename T>
void destroy_node(ErasedNodePointer<T> node)
{
    get_data(node).~T();
    deallocate(node.ptr);
}

template<typename T>
ErasedNodePointer<T>& get_pointer(ErasedNodePointer<T> node, int pos)
{
    auto next = reinterpret_cast<ErasedNodePointer<T>*>(reinterpret_cast<char*>(node.ptr) + offsetof(AlignmentExtractor<T>, next));
    next += pos;
    return *next;
}

template<typename T, typename... Args>
ErasedNodePointer<T> create_node(std::size_t height, Args&& ...args)
{
    ErasedNodePointer<T> p = { nullptr };
    try
    {
        p.ptr = allocate(sizeof(AlignmentExtractor<T>) + sizeof(ErasedNodePointer<T>)*(height-1));
        ::new (p.ptr) T(std::forward<T>(args)...);
        for(std::size_t i = 0; i < height; ++i)
            get_pointer(p, i).ptr = nullptr;
        return p;
    }
    catch(...)
    {
        deallocate(p.ptr);
        throw;
    }
}

#include <iostream>
#include <string>

int main()
{
    auto p = create_node<std::string>(5, "Hello world");
    auto q = create_node<std::string>(2, "A");
    auto r = create_node<std::string>(2, "B");
    auto s = create_node<std::string>(1, "C");

    get_pointer(p, 0) = q;
    get_pointer(p, 1) = r;
    get_pointer(r, 0) = s;

    std::cout << get_data(p) << "
";
    std::cout << get_data(get_pointer(p, 0)) << "
";
    std::cout << get_data(get_pointer(p, 1)) << "
";
    std::cout << get_data(get_pointer(get_pointer(p, 1), 0)) << "
";

    destroy_node(s);
    destroy_node(r);
    destroy_node(q);
    destroy_node(p);
}

Output:

Hello world
A
B
C

Longer explanation:

The point of this code is to create a node dynamically, without using types directly (type erasure). This node stores an object, and N pointers, with N variable at runtime.

You can use any memory as if it had a specific type, provided that:

size is correct
alignment is correct
(only non-triviably constructible types) you manually call the constructor before using
(only non-triviably destructible types) you manually call the destructor after using

In fact, you rely on this every time you call malloc:

// 1. Allocating a block
int* p = (int*)malloc(5 * sizeof *p);
p[2] = 42;
free(p);

Here, we treat the chunk of memory returned by malloc as if it was an array of ints. This must work because of these guarantees:

malloc returns a pointer guaranteed to be properly aligned for any object type.
If your pointer p points to aligned memory, (int*)((char*)p + sizeof(int)) (or p + 1, which is equivalent) also does.

The dynamically created node must have enough size to contain N ErasedNodePointers (which are used as handles here) and one object of size T. This is satisfied by allocating enough memory in create_node function - it will allocate sizeof(T) + sizeof(ErasedNodePointer<T>)*N bytes or more, but not less.

That was the first step. The second is now we extract the required position relative to the beginning of a block. That's where AlignmentExtractor<T> comes in.

AlignmentExtractor<T> is a dummy struct I use to ensure correct alignment:

// 2. Finding position
AlignmentExtractor<T>* p = (AlignmentExtractor<T>*)malloc(sizeof *p);
p->next[0].ptr = nullptr;
// or
void* q = (char*)p + offsetof(AlignmentExtractor<T>, next);
(ErasedTypePointer<T>*)q->ptr = nullptr;

It doesn't matter how I got the position of the pointer, as long as I obey the rules of pointer arithmetic.

The assumptions here are:

I can cast any pointer to void* and back.
I can cast any pointer to char* and back.
I can operate on a struct as if it was a char array of size equal to the size of the struct.
I can use pointer arithmetic to point at any element of an array.

These all are guaranteed by C++ standard.

Now, after I have allocated the block of enough size, I calculate the offset with offsetof(AlignmentExtractor<T>, next) and add it to the pointer pointing to the block. We "pretend" (the same way the code "1. Allocating a block" pretends it has an array of ints) the result pointer points to beginning of the array. This pointer is aligned correctly, because otherwise the code "2. Finding position" couldn't access the next array due to misaligned access.

If you have a struct of standard layout type, the pointer to the struct has the same address as the first member of the struct. AlignmentExtractor<T> is standard layout.

That's not all though - requirements 1. and 2. are satisfied, but we need to satisfy requirements 3. and 4. - the data in the node doesn't have to be trivially constructible or destructible. That's why we use placement-new to construct the data - the create_node uses variadic templates and perfect forwarding to forward arguments to the constructor. And the data is destroyed in the destroy_node function by calling the destructor.

Portable emulation of flexible array member in C++?

Tags:

c++

c++11

milleniumbug

1 Answers

milleniumbug

Recent Activity

Donate For Us

Portable emulation of flexible array member in C++?

Tags:

c++

c++11

milleniumbug

1 Answers

milleniumbug

Related questions

Recent Activity

Donate For Us