
std::align and std::aligned_storage for aligned allocation of memory blocks

I'm trying to allocate a block of memory of size bytes that must be aligned to Alignment, where size may not be known at compile time. I know routines such as _aligned_malloc, posix_memalign, _mm_malloc, etc. exist, but I do not want to use them because they reduce code portability.
C++11 provides the function std::align and the class template std::aligned_storage, from which I can obtain a POD type to hold a single element aligned to my requirements. However, my goal is to create an allocator that allocates an aligned block of size bytes, not just a single element.
Is this possible using std::align? I ask because std::align moves the pointer, so the class using that pointer would hand the allocator the moved address for deallocation, which would be invalid. Is there a way to create an aligned_allocator this way?

intull asked Jun 29 '13 08:06




2 Answers

EDIT: after clarifications from the OP, it appears the original answer is off-topic; for reference's sake it is kept at the end of this answer.

Actually, the answer is rather simple: you simply need to keep a pointer both to the storage block and to the first item.

This does not actually require a stateful allocator (it would be possible even in C++03, albeit with a custom std::align routine). The trick is that the allocator is not required to ask the system for exactly enough memory to store the user data. It can perfectly well ask for a bit more for book-keeping purposes of its own.

So, here we go creating an aligned allocator; to keep it simple I'll focus on the allocation/deallocation routines.

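#include <algorithm>  // std::max
#include <cstddef>    // size_t, ptrdiff_t
#include <cstdlib>    // std::malloc, std::free
#include <memory>     // std::align
#include <new>        // std::bad_alloc

template <typename T>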
class aligned_allocator {
    // Allocates block of memory:
    // - (opt) padding
    // - offset: ptrdiff_t
    // - T * n: T
    // - (opt) padding
public:
    typedef T* pointer;
    typedef size_t size_type;

    pointer allocate(size_type n);
    void deallocate(pointer p, size_type n);

}; // class aligned_allocator

And now the allocation routine. Lots of memory fiddling, it's the heart of the allocator after all!

template <typename T>
auto aligned_allocator<T>::allocate(size_type n) -> pointer {
    size_type const alignment = std::max(alignof(ptrdiff_t), alignof(T));
    size_type const header_size = sizeof(ptrdiff_t);
    size_type const body_size = sizeof(T)*n;
    size_type const buffer_size = header_size + body_size + alignment;

    // block is correctly aligned for `ptrdiff_t` because `std::malloc` returns
    // memory correctly aligned for all built-in types.
    void* const block = std::malloc(buffer_size);

    if (block == nullptr) { throw std::bad_alloc{}; }

    // find the start of the body by suitably aligning memory,
    // note that we reserve sufficient space for the header beforehand
    void* storage = static_cast<char*>(block) + header_size;
    size_t space = buffer_size - header_size;

    // cannot fail: `space` exceeds `body_size` by a full `alignment`
    void* const body = std::align(alignment, body_size, storage, space);

    // reverse track to find where the offset field starts; it is suitably
    // aligned because `alignment` is a multiple of `alignof(ptrdiff_t)`
    char* const offset = static_cast<char*>(body) - header_size;

    // store the value of the offset (ie, the result of body - block);
    // `space` cannot be used here, it holds the bytes remaining after alignment
    *reinterpret_cast<ptrdiff_t*>(offset) =
        static_cast<char*>(body) - static_cast<char*>(block);

    // finally return the start of the body
    return static_cast<pointer>(body);
} // aligned_allocator<T>::allocate

Fortunately, the deallocation routine is much simpler: it just has to read the offset and apply it.

template <typename T>
void aligned_allocator<T>::deallocate(pointer p, size_type) {
    // find the offset field
    char const* header = reinterpret_cast<char const*>(p) - sizeof(ptrdiff_t);

    // read its value (the cast must keep the const qualifier)
    ptrdiff_t const offset = *reinterpret_cast<ptrdiff_t const*>(header);

    // apply it to find start of block
    void* const block = reinterpret_cast<char*>(p) - offset;

    // finally deallocate
    std::free(block);
} // aligned_allocator<T>::deallocate

The other routines need not be aware of the memory layout, so writing them is trivial.
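
To illustrate how little else is needed, here is a minimal, self-contained sketch of a complete C++11 allocator built on the same header-plus-body layout and plugged into std::vector. The names demo_aligned_allocator and demo_vector_is_aligned, and the extra Alignment template parameter, are assumptions made for this example and are not part of the answer above; the offset is copied with std::memcpy rather than written through a cast, which is a stylistic choice of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <new>
#include <vector>

// Hypothetical name and Alignment parameter; neither comes from the answer.
template <typename T, std::size_t Alignment = alignof(T)>
struct demo_aligned_allocator {
    static_assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0,
                  "Alignment must be a power of two");
    typedef T value_type;

    // Needed explicitly because of the non-type template parameter.
    template <typename U>
    struct rebind { typedef demo_aligned_allocator<U, Alignment> other; };

    demo_aligned_allocator() noexcept {}
    template <typename U>
    demo_aligned_allocator(demo_aligned_allocator<U, Alignment> const&) noexcept {}

    T* allocate(std::size_t n) {
        std::size_t const header = sizeof(std::ptrdiff_t);
        std::size_t const alignment =
            std::max({Alignment, alignof(T), alignof(std::ptrdiff_t)});
        // Guard against overflow in the size computation below.
        if (n > (static_cast<std::size_t>(-1) - header - alignment) / sizeof(T)) {
            throw std::bad_alloc();
        }
        std::size_t const body_size = sizeof(T) * n;

        void* const block = std::malloc(header + body_size + alignment);
        if (block == nullptr) { throw std::bad_alloc(); }

        // Skip the header, then align; this cannot fail because `space`
        // exceeds `body_size` by a full `alignment`.
        void* body = static_cast<char*>(block) + header;
        std::size_t space = body_size + alignment;
        std::align(alignment, body_size, body, space);

        // Record body - block in the bytes just before the body.
        std::ptrdiff_t const offset =
            static_cast<char*>(body) - static_cast<char*>(block);
        std::memcpy(static_cast<char*>(body) - header, &offset, sizeof offset);
        return static_cast<T*>(body);
    }

    void deallocate(T* p, std::size_t) noexcept {
        char* const body = reinterpret_cast<char*>(p);
        std::ptrdiff_t offset;
        std::memcpy(&offset, body - sizeof offset, sizeof offset);
        std::free(body - offset);  // body - offset is the malloc'ed block
    }
};

// The allocator is stateless, so any two instances compare equal.
template <typename T, typename U, std::size_t A>
bool operator==(demo_aligned_allocator<T, A> const&,
                demo_aligned_allocator<U, A> const&) noexcept { return true; }
template <typename T, typename U, std::size_t A>
bool operator!=(demo_aligned_allocator<T, A> const&,
                demo_aligned_allocator<U, A> const&) noexcept { return false; }

// Usage: a vector whose buffer starts on a 64-byte boundary.
inline bool demo_vector_is_aligned() {
    std::vector<float, demo_aligned_allocator<float, 64>> v(1000, 1.0f);
    v.push_back(2.0f);  // typically grows the vector, so allocate runs again
    auto const address = reinterpret_cast<std::uintptr_t>(v.data());
    return address % 64 == 0 && v.size() == 1001 && v.back() == 2.0f;
}
```

Everything else (pointer types, construct, destroy, max_size) is filled in by std::allocator_traits, which is why only allocate and deallocate have to know about the header.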


Original answer:

template <typename T>
class Block {
public:
    Block(Block const&) = delete;
    Block& operator=(Block const&) = delete;

    explicit Block(size_t n);
    ~Block();

private:
    void* _storage;
    T* _begin;
    T* _end;
}; // class Block

template <typename T>
Block<T>::Block(size_t n) {
    size_t const object_size = n * sizeof(T);
    size_t const buffer_size = object_size + alignof(T);

    _storage = std::malloc(buffer_size);
    if (_storage == nullptr) { throw std::bad_alloc{}; }

    void* stock = _storage;
    size_t shift = buffer_size;
    std::align(alignof(T), object_size, stock, shift);

    _begin = _end = reinterpret_cast<T*>(stock);
} // Block<T>::Block

template <typename T>
Block<T>::~Block() {
    for (; _end != _begin; --_end) {
        (_end - 1)->~T();
    }

    std::free(_storage);
} // Block<T>::~Block
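
Both versions depend on how std::align updates its arguments, which is easy to get wrong: on success it moves the pointer forward to the aligned address and decreases the space argument by the number of bytes skipped (not by the requested size); on failure it returns a null pointer and leaves both untouched. The following small sketch makes that visible; demo_align and demo_align_result are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <memory>

struct demo_align_result {
    bool fits;               // whether std::align returned non-null
    std::size_t adjustment;  // bytes skipped to reach the aligned address
    std::size_t space_left;  // value of `space` after the call
};

// Aligns a pointer located one byte into a 64-byte-aligned buffer.
inline demo_align_result demo_align(std::size_t alignment, std::size_t size) {
    alignas(64) static char buffer[128];
    void* ptr = buffer + 1;                  // deliberately misaligned start
    std::size_t space = sizeof(buffer) - 1;  // bytes available from ptr
    std::size_t const before = space;

    void* const aligned = std::align(alignment, size, ptr, space);

    // On success ptr == aligned and space shrank by the adjustment only;
    // on failure ptr and space keep their original values.
    return demo_align_result{aligned != nullptr, before - space, space};
}
```

With a 16-byte alignment the start has to move from offset 1 to offset 16, so the adjustment is 15 bytes and 112 of the original 127 bytes remain. The offset to remember for deallocation is therefore the difference between the old and new space values (or between the two pointers), not the remaining space itself.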
Matthieu M. answered Oct 05 '22 01:10


If it HAS TO BE a C++11 solution, then ignore this answer.

If not... I don't know if you already know this, but here is one option:

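#include <stdint.h>  // uintptr_t
#include <stdlib.h>  // malloc, free

// alignment must be a power of two no greater than 128: the offset
// (1..alignment) is stored in the single byte just before the result
void * aligned_malloc( size_t size, size_t alignment )
{
    void * p = malloc( size + alignment );
    if ( p == NULL ) return NULL;

    // round up to the next aligned address strictly above p, so there is
    // always at least one byte in front of p1 to hold the offset
    void * p1 = (void*)( ( (uintptr_t)p + alignment ) & ~(uintptr_t)( alignment - 1 ) );

    ((unsigned char*)p1)[ -1 ] = (unsigned char)( (char*)p1 - (char*)p );

    return p1;
}

void aligned_free( void * pMem )
{
    if ( pMem == NULL ) return;
    char * pDelete = (char*)pMem - ((unsigned char*)pMem)[ -1 ];
    free( pDelete );
}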

malloc and free themselves are standard; the pointer-to-integer arithmetic is the part that is perhaps not 100% portable, but it's easy to handle such cases with preprocessor directives.
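
Because the offset is kept in a single byte, this approach only works for small alignments. A common variant, sketched here as an assumption rather than as this answer's method, stores the original pointer returned by malloc in front of the aligned address instead, so any power-of-two alignment (a page, for example) works. The names demo_aligned_malloc and demo_aligned_free are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Returns nullptr on failure; `alignment` must be a power of two.
inline void* demo_aligned_malloc(std::size_t size, std::size_t alignment) {
    if (alignment == 0 || (alignment & (alignment - 1)) != 0) { return nullptr; }

    // Worst case: room for the stored pointer plus alignment - 1 padding bytes.
    std::size_t const extra = sizeof(void*) + alignment - 1;
    if (size > static_cast<std::size_t>(-1) - extra) { return nullptr; }

    void* const raw = std::malloc(size + extra);
    if (raw == nullptr) { return nullptr; }

    // Leave room for the stored pointer, then round up to the alignment.
    std::uintptr_t const start = reinterpret_cast<std::uintptr_t>(raw);
    std::uintptr_t const aligned =
        (start + sizeof(void*) + alignment - 1) &
        ~static_cast<std::uintptr_t>(alignment - 1);

    // Remember the address malloc returned in the bytes just before the
    // aligned address handed to the caller.
    char* const user = static_cast<char*>(raw) + (aligned - start);
    std::memcpy(user - sizeof(void*), &raw, sizeof(void*));
    return user;
}

inline void demo_aligned_free(void* p) {
    if (p == nullptr) { return; }
    void* raw;
    std::memcpy(&raw, static_cast<char*>(p) - sizeof(void*), sizeof(void*));
    std::free(raw);
}
```

The cost is sizeof(void*) bytes of overhead per block instead of one, in exchange for lifting the limit on the alignment.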

user1764961 answered Oct 05 '22 00:10