Suggestions for improving an allocator algorithm implementation

I have a Visual Studio 2008 C++ application where I'm using a custom allocator for standard containers such that their memory comes from a Memory Mapped File rather than the heap. This allocator is used for 4 different use cases:

  1. 104-byte fixed-size structures: std::vector< SomeType, MyAllocator< SomeType > > foo;
  2. 200-byte fixed-size structures
  3. 304-byte fixed-size structures
  4. n-byte strings: std::basic_string< char, std::char_traits< char >, MyAllocator< char > > strn;

I need to be able to allocate roughly 32MB total for each of these.

The allocator tracks memory usage with a std::map from pointer to allocation size: typedef std::map< void*, size_t > SuperBlock; Each SuperBlock represents 4MB of memory.

There is a std::vector< SuperBlock > of these in case one SuperBlock isn't enough space.

The algorithm used by the allocator goes like this (a rough sketch of the search follows the list):

  1. For each SuperBlock: is there space at the end of the SuperBlock? Put the allocation there. (fast)
  2. If not, search within each SuperBlock for an empty space of sufficient size and put the allocation there. (slow)
  3. Still nothing? Allocate another SuperBlock and put the allocation at the start of the new SuperBlock.
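
For reference, the slow path in step 2 amounts to a linear walk over each SuperBlock's allocation map looking for a gap between consecutive allocations. A rough sketch of what that looks like (find_gap, kSuperBlockSize, and the base parameter are illustrative names, not from my code):

    #include <cstddef>
    #include <map>

    typedef std::map<void*, size_t> SuperBlock;      // pointer -> allocation size

    const size_t kSuperBlockSize = 4 * 1024 * 1024;  // 4MB per SuperBlock

    // Illustrative version of step 2: scan one SuperBlock's map for a hole of
    // at least `size` bytes between consecutive allocations. O(n) in the
    // number of live allocations, every time the fast path fails.
    void* find_gap(const SuperBlock& allocs, char* base, size_t size) {
        char* cursor = base;
        for (SuperBlock::const_iterator it = allocs.begin();
             it != allocs.end(); ++it) {
            char* start = static_cast<char*>(it->first);
            if (static_cast<size_t>(start - cursor) >= size)
                return cursor;                // hole is big enough
            cursor = start + it->second;      // skip past this allocation
        }
        if (static_cast<size_t>(base + kSuperBlockSize - cursor) >= size)
            return cursor;                    // room left at the end (step 1)
        return 0;                             // nothing fits in this SuperBlock
    }

Every allocation that misses the fast path pays this per-SuperBlock cost, which is why fragmentation hurts so much.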

Unfortunately, step 2 can become VERY slow after a while. As copies of objects are made and temporary variables destroyed, I get a lot of fragmentation. This causes a lot of deep searching within the memory structure. Fragmentation is an issue, as I have a limited amount of memory to work with (see the note below).

Can anybody suggest improvements to this algorithm that would speed up the process? Do I need two separate algorithms (one for the fixed-size allocations and one for the string allocator)?

Note: for those who need a reason: I'm using this algorithm on Windows Mobile, where there's a 32MB process slot limit on the heap. So the usual std::allocator won't cut it. I need to put the allocations in the 1GB Large Memory Area to have enough space, and that's what this does.

PaulH asked May 17 '11

3 Answers

Can you have a separate memory allocation pool for each fixed-size type you are allocating? That way there won't be any fragmentation, because every freed slot is exactly the right size for the next allocation of that type. That doesn't help for the variable-length strings, of course.

There's an example of a small-object allocator in Alexandrescu's Modern C++ Design that illustrates this principle and may give you some ideas.

Tim Martin


For the fixed-size objects, you can create a fixed-size allocator. Basically you allocate blocks, partition them into subblocks of the appropriate size, and create a linked list with the result. Allocating from such a block is O(1) if there is memory available (just remove the first element from the list and return a pointer to it), as is deallocation (add the block to the free list). During allocation, if the list is empty, grab a new superblock, partition it, and add all the blocks to the list.
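
A minimal sketch of such a free list, assuming slots of at least sizeof(void*) bytes (FixedPool and get_raw_memory are illustrative names; the real version would carve its superblocks out of the memory-mapped file rather than the heap):

    #include <cstddef>
    #include <new>

    // Illustrative fixed-size free-list pool; assumes block_size >= sizeof(void*).
    class FixedPool {
    public:
        explicit FixedPool(size_t block_size)
            : block_size_(block_size), free_list_(0) {}

        void* allocate() {
            if (!free_list_)
                refill();                          // carve up a new superblock
            void* p = free_list_;
            free_list_ = *static_cast<void**>(p);  // pop the head: O(1)
            return p;
        }

        void deallocate(void* p) {
            *static_cast<void**>(p) = free_list_;  // push onto the head: O(1)
            free_list_ = p;
        }

    private:
        void refill() {
            const size_t kSuperBlockSize = 4 * 1024 * 1024;
            char* raw = static_cast<char*>(get_raw_memory(kSuperBlockSize));
            // Thread every slot of the new superblock onto the free list.
            for (size_t off = 0; off + block_size_ <= kSuperBlockSize; off += block_size_)
                deallocate(raw + off);
        }

        static void* get_raw_memory(size_t bytes) {
            // Stand-in only: the real allocator would map `bytes` from the
            // memory-mapped file here instead of touching the heap.
            return ::operator new(bytes);
        }

        size_t block_size_;
        void*  free_list_;  // each free slot holds a pointer to the next free slot
    };

MyAllocator<SomeType> would then own one FixedPool per fixed size (104, 200, and 304 bytes) and forward its allocate/deallocate calls to it.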

For the variable-sized strings, you can reduce the problem to the fixed-size case by allocating only blocks of known sizes: 32 bytes, 64 bytes, 128 bytes, 512 bytes. You will have to analyze the memory usage to come up with the different buckets so that you don't waste too much memory. For large objects you can fall back to a dynamic-size allocation pattern; that will be slow, but hopefully the number of large objects is limited.
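
A sketch of the bucketing step, using the size classes from the paragraph above (real bucket sizes should come from profiling your strings):

    #include <cstddef>

    // Illustrative bucketing: round a string allocation up to the nearest size
    // class, then serve it from the fixed pool for that class. The class list
    // (32/64/128/512) mirrors the answer; choose real buckets from profiling.
    size_t bucket_for(size_t n) {
        static const size_t kClasses[] = { 32, 64, 128, 512 };
        const size_t kCount = sizeof(kClasses) / sizeof(kClasses[0]);
        for (size_t i = 0; i < kCount; ++i)
            if (n <= kClasses[i])
                return kClasses[i];
        return 0;   // larger than all classes: fall back to the slow path
    }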

David Rodríguez - dribeas


Building on Tim's answer, I would personally use something akin to BiBOP (a "Big Bag Of Pages").

The basic idea is simple: use fixed-size pools.

There are some refinements to that.

First, the size of the pools is generally fixed. It depends on your allocation routine: if you know the OS you're working on maps at least 4KB at a time when you use malloc, then you use that value. For a memory-mapped file, you might be able to increase this.

The advantage of fixed-size pools is that it nicely fights off fragmentation. All pages being of the same size, you can easily recycle an empty 256-byte page into a 128-byte page.

There is still some fragmentation for large objects, which are typically allocated outside of this system. But it's low, especially if you fit large objects into a multiple of the page size; that way the memory will be easy to recycle.

Second, how do you handle the pools? With linked lists.

The pages are typically untyped (by themselves), so you keep a free list of pages from which to take fresh pages and into which to push "recycled" pages.

For each size category you then have a list of "occupied" pages, in which memory has been allocated. For each page you keep:

  • the allocation size (for this page)
  • the number of allocated objects (to check for emptiness)
  • a pointer to the first free cell
  • a pointer to the previous and next pages (might point to the "head" of the list)

Each free cell is itself a pointer (or an index, depending on the sizes you have) to the next free cell. A sketch of this page header is shown below.
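
A minimal sketch of such a page header, assuming 4KB pages allocated on page-aligned boundaries (the field names are illustrative):

    #include <cstddef>

    const size_t kPageSize = 4096;   // assumed page size

    struct Page {
        size_t cell_size;    // the allocation size served by this page
        size_t used_count;   // live objects; 0 means the page can be recycled
        void*  free_cell;    // head of the intrusive free-cell list
        Page*  prev;         // links into the per-size "occupied" list
        Page*  next;         //   (may point back at the list head)
    };

    // Each free cell stores, in its first bytes, a pointer to the next free
    // cell, so the free list costs nothing beyond the page itself.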

The list of "occupied" pages of a given size is simply managed (a code sketch follows the list):

  • on deletion: if you empty the page, remove it from the list and push it into the recycled pages; otherwise, update the free-cell list of this page (note: finding the beginning of the current page is usually a simple modulo operation on the address)
  • on insertion: search starting from the head; as soon as you find a non-full page, move it to the front of the list (if it isn't there already) and insert your item
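
Continuing the Page sketch above, the two operations might look like this; page_of implements the address-masking/modulo trick from the note, and the list bookkeeping for emptied pages is elided:

    // Recover the owning page from a cell address; assumes every page starts
    // on a kPageSize-aligned boundary with its header at the front.
    Page* page_of(void* p) {
        return reinterpret_cast<Page*>(
            reinterpret_cast<size_t>(p) & ~(kPageSize - 1));
    }

    // Deletion: O(1) push onto the page's free-cell list.
    void deallocate(void* p) {
        Page* page = page_of(p);
        *static_cast<void**>(p) = page->free_cell;
        page->free_cell = p;
        if (--page->used_count == 0) {
            // Page is empty: unlink it from the "occupied" list and move
            // it to the recycled-pages list (bookkeeping elided).
        }
    }

    // Insertion into a page known to have a free cell: O(1) pop.
    void* allocate_from(Page* page) {
        void* cell = page->free_cell;
        page->free_cell = *static_cast<void**>(cell);
        ++page->used_count;
        return cell;
    }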

This scheme is very economical memory-wise, with only a single page reserved for indexing.

For multi-threaded / multi-process applications, you'll need to add synchronization (typically a mutex per page). In that case you can take inspiration from Google's tcmalloc: try to find another page instead of blocking, and use a thread-local cache to remember which page you last used.

Having said that, have you tried Boost.Interprocess? It provides allocators.
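
For example, boost::interprocess::managed_mapped_file can back standard-style containers directly with a file mapping. A minimal sketch, assuming Boost.Interprocess builds for your target (the file name and sizes are placeholders):

    #include <boost/interprocess/managed_mapped_file.hpp>
    #include <boost/interprocess/allocators/allocator.hpp>
    #include <boost/interprocess/containers/vector.hpp>

    namespace bip = boost::interprocess;

    typedef bip::allocator<int, bip::managed_mapped_file::segment_manager> IntAlloc;
    typedef bip::vector<int, IntAlloc> MappedVector;

    int main() {
        // Create (or reopen) a 32MB file-backed segment; "pool.bin" is a placeholder.
        bip::managed_mapped_file seg(bip::open_or_create, "pool.bin", 32 * 1024 * 1024);

        // Construct a vector whose storage lives entirely inside the mapping.
        MappedVector* v = seg.construct<MappedVector>("MyVector")(
            IntAlloc(seg.get_segment_manager()));
        v->push_back(42);
        return 0;
    }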

Matthieu M.