Here's my problem: I need to manage memory in a remote contiguous buffer that my program can't read or write to. It needs to have malloc()/free() semantics, and support setting minimum alignment, and fragmentation avoidance (whenever possible). Since I can't read or write to this buffer directly, I need to use local structures to manage all the allocations.
I am already using boost, so if something inside boost can be massaged to do this, that would be great. However, I'm not averse to using a C library or anything like that.
As an example, I need a non-IPC version of:
boost::interprocess::basic_managed_external_buffer<
char,
boost::interprocess::rbtree_best_fit<
boost::interprocess::mutex_family,
boost::interprocess::offset_ptr<void>,
SOME_ALIGNMENT>,
boost::interprocess::iset_index>
preferably with malloc/free semantics instead of new/delete but without it ever actually reading or writing to the underlying buffer (and keeping all the allocation information/data structures in a separate buffer)
Any ideas?
P.S. I don't want the boost::interprocess example to be misleading, I am just familiar with the interface so using it as an example. The application is not really interprocess, and the allocator would only be used from my application.
Specifically I would like to be able to manage a 16GB external buffer with allocation sizes from 128 bytes all the way to 512MB. This is strictly 64-bit code, but even then I would prefer the pointer type to be a template parameter so I can use uint64_t explicitly.
Heap allocation is the most flexible allocation scheme. Allocation and deallocation of memory can be done at any time and any place depending upon the user's requirement. Heap allocation is used to allocate memory to the variables dynamically and when the variables are no more used then claim it back.
Heaps memory is allocated during the execution of programmers' instructions. It is crucial to highlight that the name heap has nothing to do with the heap data structure. It is termed a heap because it is a collection of memory space that programmers can allocate and deallocate.
Heap space is used for the dynamic memory allocation of Java objects and JRE classes at runtime. New objects are always created in heap space, and the references to these objects are stored in stack memory.
The advantages of heap memory are: Lifetime. Because the programmer now controls exactly when memory is allocated, it is possible to build a data structure in memory, and return that data structure to the caller. This was never possible with local memory, which was automatically deallocated when the function exited.
I don't know, off the top of my hat, of any canned implementation that can be used. However, this doesn't appear to be particularly difficult to implement yourself, using just the garden-variety containers in the C++ standard library.
I would recommend a simple approach that uses two std::map
s and one std::multimap
. Let's say that bufaddr_t
is an opaque integer that represents an address in the external buffer. Since we're talking about 16 gig buffer, it has to be a 64 bit address:
typedef uint64_t memblockaddr_t;
Ditto for the size of an allocated block.
typedef uint64_t memblocksize_t;
You could, I suppose, use something else for memblockaddr_t
, as long as the opaque data type has strict weak ordering.
The first part is easy. Keeping track of all allocated blocks:
std::map<memblockaddr_t, memblocksize_t> allocated;
So, when you successfully allocate a chunk of memory in the external buffer, you insert it here. When you wish to deallocate a chunk of memory, you look up the size of the allocated block here, and remove the map entry. Simple enough.
But that, of course, is not the whole story. Now, we need to keep track of available, unallocated memory blocks. Let's do it this way:
typedef std::multimap<memblocksize_t, memblockaddr_t> unallocated_t;
unallocated_t unallocated;
std::map<memblockaddr_t, unallocated_t::iterator> unallocated_lookup;
unallocated
is a collection of all unallocated chunks in your external buffer, keyed by chunk size. The key is the chunk size. So, when you need to allocate a chunk of memory of a particular size, you can simply use the lower_bound()
method (or upper_bound()
, if you prefer) to immediately find the first chunk whose size is as least as much as you want to allocate.
And since you can, of course, have many chunks of the same size, unallocated
has to be a std::multimap
.
Also, unallocated_lookup
is a map keyed by the address of each unallocated chunk, that gives you the iterator for this chunk's entry in unallocated
. Why you need that will become clear in a moment.
So:
Initialize a new, completely unallocated buffer with a single entry:
memblockaddr_t beginning=0; // Or, whatever represents the start of the buffer.
auto p=unallocated.insert(std::make_pair(BUFFER_SIZE, beginning)).first;
unallocated_lookup.insert(std::make_pair(beginning, p));
Then:
To allocate a block, use the lower_bound()/upper_bound() approach as I mentioned above to find the first unallocated chunk that's at least as big and remove its entry from unallocated
and unallocated_lookup
. If it's more than what you need, return the excess back to the pool as if the extra amount that you don't need is being deallocated (step 3 below). Finally, insert it into the allocated
array, so you remember how large the allocated chunk is.
To deallocate a block, look it up in the allocated
array, to get its size, remove it from the allocated
array, then:
Insert it into the unallocated
and unallocated_lookup
, similar to how the initial unallocated chunk was inserted, see above.
But you're not done yet. You must then use unallocated_lookup
to trivially look up the preceding unallocated chunk and the following unallocated chunk in the memory buffer. If either or both of them are immediately adjacent to the newly-freed chunk, you must coalesce them together. That should be a very obvious process. You can simply go through the motions of officially removing the adjacent chunks individually, from unallocated
and unallocated_lookup
, then freeing a single, coalesced chunk.
That's the real purpose of unallocated_lookup
, to be able to easily coalesce contiguous unallocated chunks.
As far as I can tell, all of the above operations carry logarithmic complexity. They're entirely based on std::map
's and std::multimap
's methods that have logarithmic complexity, and nothing else.
Finally:
Depending your application's behavior, you can easily tweak the implementation to internally round up the size of an allocated chunk to whatever multiple you wish. Or adjust the allocation strategy -- allocate from the smallest chunk that's big enough to meet the allocation request, or just allocate from the large unallocated chunk (simple, use end()
to find it), etc...
That's one advantage to rolling your own implementation -- you'll always have much greater flexibility to tweak your own implementation, then you would typically have with some canned external library.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With