Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

I want to make my own Malloc

I want to make my own malloc/free so I can make a precise copying allocator.

Any gurus have any tips and suggestions?

I have a few questions for now:

  1. Should I just malloc large chunks of memory, and then distribute from that so I don't have to call the system calls?
  2. How are copying collectors usually done? I would imagine that this part is a bit complicated to do efficiently. My naive implementation would just malloc a block the size of the remaining objects which would require 2x the space.
like image 930
Unknown Avatar asked Apr 09 '09 02:04

Unknown


1 Answers

There's rather a lot of good literature on implementing malloc and similar things. but I notice that you include C++ here -- are you aware that you can write your own implementation of new and delete in C++? That might be useful as a way to do it easily.

In any case, the characteristics you want are going to depend pretty heavily on your workload, that is, on the pattern of usage over time. If you have only mallocs and new frees, it's easy, obviously. If you have only mallocs of one, or a few different, block sizes, that's also simple.

In other languages, you get some leverage by having the ability to chain memory together, but C isn't that smart.

The basic implementation of malloc simply allocates a header that contains the data length, an "in use flag", and the malloced memory. Malloc then constructs a new header at the end of its space, allocates the memory, and returns a pointer. When you free, it just resets the in use flag.

The trick is that when you do a lot of mallooc and free, you can quickly get a lot of small blobs that aren't in use, but are hard to allocate. So you need some kind of bumpo gc to merge blocks of memory.

You could do a more complicated gc, but remember that takes time; you don't want a free to take up a lot of time.

There's a nice paper on Solaris malloc implementations you might find interesting. Here's another on building an alternative malloc, again in Solaris, but the basics are the same. And you should read the Wikipedia article on garbage collection, and follow it to some of the more formal papers.

Update

You know, you really should have a look at generational garbage collectors. The basic idea is that the longer something remains allocated, the more likely is it to stay allocated. This is an extension of the "copying" GC you mention. Basically, you allocate new stuff in one part of your memory pool, call it g0. When you reach a high water mark on that, you look through the allocated blocks and copy the ones that are still in use to another section of memory, call it g1, Then you can just clear the g0 space and start allocating there. Eventually g1 gets to its high water mark and you fix that by clearing g0, and clean up g1 moving stuff to g0, and when you're done, you rename the old g1 as g0 and vice versa and continue.

The trick is that in C especially, the handles you hand out to malloc'ed memory are straight raw pointers; you can't really move things around without some heap big medicine.

Second update

In comments, @unknown asks "Wouldn't moving stuff around just be a memcpy()". And indeed it would. but consider this timeline:

warning: this is not complete, and untested, just for illustration, for entertainment only, no warranty express or implied

/* basic environment for illustration*/
void * myMemoryHdl ;
unsigned char lotsOfMemory[LOTS]; /* this will be your memory pool*/ 

You mallocate some memory

/* if we get past this, it succeded */
if((myMemoryHdl = newMalloc(SIZE)) == NULL)
    exit(-1);

In your implementation of malloc, you create the memory and return a pointer to the buffer.

unsigned char * nextUnusued = &lotsOfMemory[0];
int partitionSize = (int)(LOTS/2);
int hwm = (int) (partition/2);
/* So g0 will be the bottom half and g1 the top half to start */
unsigned char * g0 = &lotsOfMemory[0];
unsigned char * g1 = &lotsOfMemory[partitionSize];


void * newMalloc(size_t size){
   void * rtn ;
   if( /* memory COMPLETELY exhausted */)
      return NULL;
   /* otherwise */
   /* add header at nextUnused */
   newHeader(nextUnused);     /* includes some pointers for chaining
                               * and a field with values USED or FREE, 
                               * set to USED */
   nextUnused += HEADERLEN ;  /* this could be niftier */
   rtn = nextUnused ;
   nextUnused += size ;
}

Some of the things are freed

  newFree(void * aHandle){
     *(aHandle-offset) = FREE ; /* set the flag in the header, 
                                 * using an offset. */
  }

So now you do all the stuff and you get to your high water mark.

 for( /* each block in your memory pool */ )
    if( /* block header is still marked USED */ ) {
        memcpy(/* block into other partition */);
    }
 /* clear the partition */
 bzero(g0, partitionSize);

Now, go back to the original handle you saved in myMemHdl. What does it point to? (Answer, you just set it to 0x00 with bzero(3).)

That's where the magic comes in. In C at least, the pointer you returned from your malloc is no longer under your control -- you can't move it around after the fact. In C++, with user-defined pointer-like types, you can fix that.

like image 192
Charlie Martin Avatar answered Sep 28 '22 06:09

Charlie Martin