 

Branchless memory manager?

Has anyone thought about how to write a memory manager (in C++) that is completely branch-free? I've written a pool, a stack, a queue, and a linked list (allocating from the pool), but I am wondering how plausible it is to write a branch-free general memory manager.

This is all to help make a really reusable framework for solid development that is concurrent, friendly to in-order CPUs, and cache-friendly.

Edit: by branchless I mean without doing direct or indirect function calls, and without using ifs. I've been thinking that I can probably implement something that first changes the requested size to zero for false calls, but haven't really got much more than that. I feel that it's not impossible, but the other aspect of this exercise is then profiling it on said "unfriendly" processors to see if it's worth trying as hard as this to avoid branching.
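One way to read the "change the requested size to zero" idea is as a mask rather than an if. Here is a minimal sketch, assuming a hypothetical helper `mask_size` that is not part of the original post. Whether the compiler really emits branch-free machine code depends on the target and optimization level, so the disassembly is worth checking.

```cpp
#include <cstddef>

// Hypothetical helper (an assumption, not from the post):
// returns `size` when `valid` is true and 0 otherwise, without an if.
inline std::size_t mask_size(std::size_t size, bool valid) {
    // `valid` converts to 0 or 1. Subtracting 1 gives all-ones or 0 in
    // unsigned arithmetic; inverting that yields 0 for false and
    // all-ones for true.
    const std::size_t mask = ~(static_cast<std::size_t>(valid) - 1);
    return size & mask;
}
```

A request that should not allocate then becomes a zero-byte request, which the allocator still has to treat as harmless.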

Richard Fabian asked Mar 22 '10 10:03



1 Answer

While I don't think this is a good idea, one solution would be to have pre-allocated buckets of blocks in power-of-two sizes, indexed by the log2 of the block size. Some crude pseudocode:

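#include <cstddef>
#include <vector>

int log2(int value); // Branchless log2, see EDIT2

class Allocator {
public:
    void* malloc(size_t size) {
        // Smallest power-of-two bucket that fits the block plus its int header
        int bucket = log2(size + sizeof(int));
        int* pointer = static_cast<int*>(m_buckets[bucket].back());
        m_buckets[bucket].pop_back();
        *pointer = bucket; // Store which bucket this was allocated from
        return pointer + 1; // Don't overwrite the header
    }

    void free(void* pointer) {
        // Step back to the header to find the owning bucket
        int* temp = static_cast<int*>(pointer) - 1;
        m_buckets[*temp].push_back(temp);
    }

private:
    // m_buckets[i] holds the free blocks of size 2^i; must be filled up front
    std::vector< std::vector<void*> > m_buckets;
};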

(You would of course also replace the std::vector with a simple array + counter).
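The array + counter replacement could be sketched as follows. The name `Bucket` and the `Capacity` parameter are assumptions made for illustration, not part of the answer. Like the vector version, it does no bounds checking, so popping an empty bucket or pushing to a full one is undefined behaviour.

```cpp
#include <cstddef>

// Hypothetical fixed-capacity free list: a plain array plus a counter.
// Deliberately unchecked, so neither push nor pop contains a branch.
template <std::size_t Capacity>
struct Bucket {
    void* slots[Capacity];   // storage for free block pointers
    std::size_t count = 0;   // number of valid entries in slots

    void push(void* p) { slots[count++] = p; }   // store, then advance
    void* pop() { return slots[--count]; }       // step back, then load
};
```

The allocator would then hold one `Bucket` per size class in place of the inner `std::vector<void*>`.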

EDIT: In order to make this robust (i.e. handle the situation where the bucket is empty) you would have to add some form of branching.

EDIT2: Here's a small branchless log2 function:

// Returns the smallest x such that value <= (1 << x); requires value >= 2
int log2(int value) {
    union Foo {
        int x;
        float y;
    } foo;
    foo.y = value - 1; // As a float, the exponent field holds floor(log2(value - 1))
    return ((foo.x & (0xFF << 23)) >> 23) - 126; // Extract the biased exponent (bias 127) and add 1
}

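This gives the correct result for values from 2 up to, but not including, 33554432 (2^25). At 33554432 itself, value - 1 rounds up to 2^25 when converted to float under the default rounding mode, so the result is off by one. If you need larger allocations you'll have to switch to doubles.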
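A double-based variant might look like the sketch below. The name `ceil_log2` is an assumption, chosen to avoid clashing with the standard library's `log2`. The sketch also assumes IEEE 754 64-bit doubles, and it uses `std::memcpy` instead of a union because reading an inactive union member is not well-defined in C++.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical double-precision variant of the exponent trick.
// Returns the smallest x such that value <= (1ull << x), for value >= 2.
// Assumes IEEE 754 binary64 doubles.
inline int ceil_log2(std::uint64_t value) {
    const double d = static_cast<double>(value - 1);
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits); // copy out the raw bit pattern
    // binary64 layout: 11 exponent bits starting at bit 52, bias 1023.
    // Subtracting 1022 removes the bias and adds 1.
    return static_cast<int>((bits >> 52) & 0x7FF) - 1022;
}
```

Because a double has a 53-bit significand, value - 1 converts exactly for every value up to 2^53, well beyond the float limit.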

Here's a link to how floating point numbers are represented in memory.

Andreas Brinck answered Oct 07 '22 14:10
