 

Memory heap allocator library that keeps its bookkeeping structures separate?

Here's my problem: I need to manage memory in a remote contiguous buffer that my program can't read or write. It needs malloc()/free() semantics, support for setting a minimum alignment, and fragmentation avoidance (whenever possible). Since I can't read or write this buffer directly, I need to use local structures to manage all the allocations.

I am already using boost, so if something inside boost can be massaged to do this, that would be great. However, I'm not averse to using a C library or anything like that.

As an example, I need a non-IPC version of:

boost::interprocess::basic_managed_external_buffer<
                     char,
                     boost::interprocess::rbtree_best_fit<
                                          boost::interprocess::mutex_family,
                                          boost::interprocess::offset_ptr<void>,
                                          SOME_ALIGNMENT>,
                     boost::interprocess::iset_index>

preferably with malloc/free semantics instead of new/delete but without it ever actually reading or writing to the underlying buffer (and keeping all the allocation information/data structures in a separate buffer)

Any ideas?

P.S. I don't want the boost::interprocess example to be misleading, I am just familiar with the interface so using it as an example. The application is not really interprocess, and the allocator would only be used from my application.

Specifically I would like to be able to manage a 16GB external buffer with allocation sizes from 128 bytes all the way to 512MB. This is strictly 64-bit code, but even then I would prefer the pointer type to be a template parameter so I can use uint64_t explicitly.

Yuriy Romanenko asked May 19 '15


1 Answer

I don't know, off the top of my head, of any canned implementation that can be used. However, this doesn't appear to be particularly difficult to implement yourself, using just the garden-variety containers in the C++ standard library.

I would recommend a simple approach that uses two std::maps and one std::multimap. Let's say that memblockaddr_t is an opaque integer that represents an address in the external buffer. Since we're talking about a 16 GB buffer, it has to be a 64-bit address:

typedef uint64_t memblockaddr_t;

Ditto for the size of an allocated block.

typedef uint64_t memblocksize_t;

You could, I suppose, use something else for memblockaddr_t, as long as the opaque data type has strict weak ordering.
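
For illustration, a hypothetical opaque type could look like the following; the struct and its name are made up, and for the coalescing step described later it would also need operator+ or similar address arithmetic:

#include <cstdint>

// Hypothetical opaque address type: usable as memblockaddr_t because
// operator< gives the containers a strict weak ordering.
struct ExtAddr {
    std::uint64_t off;  // offset into the external buffer

    bool operator<(const ExtAddr &o) const { return off < o.off; }
};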

The first part is easy. Keeping track of all allocated blocks:

std::map<memblockaddr_t, memblocksize_t> allocated;

So, when you successfully allocate a chunk of memory in the external buffer, you insert it here. When you wish to deallocate a chunk of memory, you look up the size of the allocated block here, and remove the map entry. Simple enough.
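
In code, that bookkeeping amounts to something like this (a sketch; addr and size stand in for whatever your allocation logic produced):

// After carving out `size` bytes at `addr` in the external buffer:
allocated.insert(std::make_pair(addr, size));

// To deallocate: recover the block's size, then drop the entry.
auto p = allocated.find(addr);
memblocksize_t blocksize = p->second; // needed to return the chunk to the pool
allocated.erase(p);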

But that, of course, is not the whole story. Now, we need to keep track of available, unallocated memory blocks. Let's do it this way:

typedef std::multimap<memblocksize_t, memblockaddr_t> unallocated_t;

unallocated_t unallocated;

std::map<memblockaddr_t, unallocated_t::iterator> unallocated_lookup;

unallocated is a collection of all unallocated chunks in your external buffer, keyed by chunk size. So, when you need to allocate a chunk of memory of a particular size, you can simply use the lower_bound() method (or upper_bound(), if you prefer) to immediately find the first chunk whose size is at least as large as what you want to allocate.

And since you can, of course, have many chunks of the same size, unallocated has to be a std::multimap.

Also, unallocated_lookup is a map keyed by the address of each unallocated chunk, that gives you the iterator for this chunk's entry in unallocated. Why you need that will become clear in a moment.

So:

Initialize a new, completely unallocated buffer with a single entry:

memblockaddr_t beginning=0; // Or, whatever represents the start of the buffer.

// Note: std::multimap::insert() returns the new entry's iterator directly
// (unlike std::map::insert(), which returns a pair), so there is no .first.
auto p=unallocated.insert(std::make_pair(BUFFER_SIZE, beginning));
unallocated_lookup.insert(std::make_pair(beginning, p));

Then:

  1. To allocate a block, use the lower_bound()/upper_bound() approach as I mentioned above to find the first unallocated chunk that's at least as big as the request, and remove its entry from unallocated and unallocated_lookup. If the chunk is larger than what you need, return the excess to the pool as if the extra amount were being deallocated (step 3 below). Finally, insert the block into the allocated map, so you remember how large the allocated chunk is. (A sketch of steps 1-4 follows after this list.)

  2. To deallocate a block, look it up in the allocated map to get its size, remove it from the allocated map, then:

  3. Insert it into unallocated and unallocated_lookup, similar to how the initial unallocated chunk was inserted (see above).

  4. But you're not done yet. You must then use unallocated_lookup to trivially look up the preceding unallocated chunk and the following unallocated chunk in the memory buffer. If either or both of them are immediately adjacent to the newly-freed chunk, you must coalesce them together. That should be a very obvious process. You can simply go through the motions of officially removing the adjacent chunks individually from unallocated and unallocated_lookup, then inserting a single, coalesced chunk in their place.

That's the real purpose of unallocated_lookup, to be able to easily coalesce contiguous unallocated chunks.
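
Putting steps 1 through 4 together, here is a minimal sketch of how the three containers could interact (C++17, for std::optional). The ExternalAllocator class and its method names are my own invention rather than any library's API, and locking, alignment, and error handling are left out:

#include <cstdint>
#include <map>
#include <optional>
#include <stdexcept>
#include <utility>

typedef std::uint64_t memblockaddr_t;
typedef std::uint64_t memblocksize_t;

class ExternalAllocator {
    typedef std::multimap<memblocksize_t, memblockaddr_t> unallocated_t;

    std::map<memblockaddr_t, memblocksize_t> allocated;
    unallocated_t unallocated;
    std::map<memblockaddr_t, unallocated_t::iterator> unallocated_lookup;

    // Insert a free chunk into both free-space structures.
    void add_free(memblockaddr_t addr, memblocksize_t size)
    {
        auto p = unallocated.insert(std::make_pair(size, addr));
        unallocated_lookup.insert(std::make_pair(addr, p));
    }

public:
    explicit ExternalAllocator(memblocksize_t buffer_size)
    {
        add_free(0, buffer_size);  // one chunk spanning the whole buffer
    }

    // Step 1: find the first free chunk at least as big as the request.
    std::optional<memblockaddr_t> alloc(memblocksize_t size)
    {
        auto p = unallocated.lower_bound(size);
        if (p == unallocated.end())
            return std::nullopt;               // no chunk is big enough

        memblockaddr_t addr = p->second;
        memblocksize_t chunk_size = p->first;

        unallocated_lookup.erase(addr);
        unallocated.erase(p);

        if (chunk_size > size)                 // give the excess back
            add_free(addr + size, chunk_size - size);

        allocated.insert(std::make_pair(addr, size));
        return addr;
    }

    // Steps 2-4: look up the block's size, then coalesce with neighbors.
    void dealloc(memblockaddr_t addr)
    {
        auto a = allocated.find(addr);
        if (a == allocated.end())
            throw std::invalid_argument("dealloc: unknown block");

        memblocksize_t size = a->second;
        allocated.erase(a);

        // Following chunk: a free chunk that starts exactly at our end.
        auto next = unallocated_lookup.find(addr + size);
        if (next != unallocated_lookup.end()) {
            size += next->second->first;
            unallocated.erase(next->second);
            unallocated_lookup.erase(next);
        }

        // Preceding chunk: the free chunk with the largest address below
        // ours; coalesce if it ends exactly where our block begins.
        auto prev = unallocated_lookup.lower_bound(addr);
        if (prev != unallocated_lookup.begin()) {
            --prev;
            if (prev->first + prev->second->first == addr) {
                addr = prev->first;
                size += prev->second->first;
                unallocated.erase(prev->second);
                unallocated_lookup.erase(prev);
            }
        }

        add_free(addr, size);  // step 3: one coalesced chunk back in the pool
    }
};

Note that a failed alloc() returns std::nullopt rather than a null address, since offset 0 is a perfectly valid address within the external buffer.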

As far as I can tell, all of the above operations carry logarithmic complexity. They're entirely based on std::map's and std::multimap's methods that have logarithmic complexity, and nothing else.

Finally:

Depending on your application's behavior, you can easily tweak the implementation to internally round the size of an allocated chunk up to whatever multiple you wish. Or adjust the allocation strategy -- allocate from the smallest chunk that's big enough to meet the allocation request, or just allocate from the largest unallocated chunk (simple: it's the last entry in the size-keyed multimap), etc...
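
For example (a sketch; min_align is assumed to be a power of two):

// Round a requested size up to a multiple of min_align.
memblocksize_t round_up(memblocksize_t size, memblocksize_t min_align)
{
    return (size + min_align - 1) & ~(min_align - 1);
}

// "Worst fit" variant: the largest free chunk is the last entry of
// the size-keyed multimap.
auto largest = std::prev(unallocated.end());  // requires !unallocated.empty()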

That's one advantage of rolling your own implementation -- you'll always have much greater flexibility to tweak it than you would typically have with some canned external library.

Sam Varshavchik answered Sep 18 '22