Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Aligned malloc in C++

Tags:

c++

c

malloc

I have a question on problem 13.9 in the book, "cracking the coding interview". The question is to write an aligned alloc and free function that supports allocating memory, and in the answer the code is given below:

void *aligned_malloc(size_t required_bytes, size_t alignment) {
  void *p1;
  void **p2;
  int offset=alignment-1+sizeof(void*);
  if((p1=(void*)malloc(required_bytes+offset))==NULL)
  return NULL;
  p2=(void**)(((size_t)(p1)+offset)&~(alignment-1));  //line 5
  p2[-1]=p1; //line 6
  return p2;
}

I am so confused with the line 5 and line 6. Why do you have to do an "and" since you have already add offset to p1? and what does [-1] mean? Thanks for the help in advance.

like image 710
Ming Avatar asked Sep 20 '12 00:09

Ming


2 Answers

Your sample code is not complete. It allocates nothing. It is pretty obvious you are missing a malloc statement, which sets the p1 pointer. I don't have the book, but I think the complete code should goes along these lines:

void *aligned_malloc(size_t required_bytes, size_t alignment) {
    void *p1;
    void **p2;
    int offset=alignment-1+sizeof(void*);
    p1 = malloc(required_bytes + offset);               // the line you are missing
    p2=(void**)(((size_t)(p1)+offset)&~(alignment-1));  //line 5
    p2[-1]=p1; //line 6
    return p2;
}

So ... what does the code do?

  • The strategy is to malloc more space than what we need (into p1), and return a p2 pointer somewhere after the beginning of the buffer.
  • Since alignment is a power of two, in binary it has to form of 1 followed by zeros. e.g. if alignment is 32, it will be 00100000 in binary
  • (alignment-1) in binary format will turn the 1 into 0, and all the 0's after the 1 into 1. For example: (32-1) is 00011111
  • the ~ will reverse all the bits. That is: 11100000
  • now, p1 is a pointer to the buffer (remember, the buffer is larger by offset than what we need). we add offset to p1: p1+offset.
  • So now, (p1+offset) is greater that what we want to return. We'll nil all the insignificant bits by doing a bitwise and: (p1+offset) & ~(offset-1)
  • This is p2, the pointer that we want to return. Note that because its last 5 digits are zero it is 32 aligned, as requested.
  • p2 is what we'll return. However, we must be able to reach p1 when the user calls aligned_free. For this, note that we reserved location for one extra pointer when we calculated the offset (that's the sizeof(void*) in line 4.
  • so, we want to put p1 immediately before p2. This is p2[-1]. This is a little bit tricky calculation. Remember that p2 is defined as void*. One way to look at it is as array of void. C array calculation say that p2[0] is exactly p2. p2[1] is p2 + sizeof(void*). In general p2[n] = p2 + n*sizeof(void*). The compiler also supports negative numbers, so p2[-1] is one void* (typically 4 bytes) before p2.

I'm going to guess that aligned_free is something like:

void aligned_free( void* p ) {
    void* p1 = ((void**)p)[-1];         // get the pointer to the buffer we allocated
    free( p1 );
}
like image 136
Uri Avatar answered Sep 28 '22 05:09

Uri


p1 is the actual allocation. p2 is the pointer being returned, which references memory past the point of allocation and leaves enough space for both allignment AND storing the actual allocated pointer in the first place. when aligned_free() is called, p1 will be retrieved to do the "real" free().

Regarding the bit math, that gets a little more cumbersome, but it works.

p2=(void**)(((size_t)(p1)+offset)&~(alignment-1));  //line 5

Remember, p1 is the actual allocation reference. For kicks, lets assume the following, with 32bit pointers:

alignment = 64 bytes, 0x40
offset = 0x40-1+4 = 0x43
p1 = 0x20000110, a value returned from the stock malloc()

What is important is the original malloc() that allocates an additional 0x43 bytes of space above and beyond the original request. This is to ensure both the alignment math and the space for the 32bit pointer can be accounted for:

p2=(void**)(((size_t)(p1)+offset)&~(alignment-1));  //line 5
p2 = (0x20000110 + 0x43) &~ (0x0000003F)
p2 = 0x20000153 &~ 0x0000003F
p2 = 0x20000153 & 0xFFFFFFC0
p2 = 0x20000140

p2 is aligned on a 0x40 boundary (i.e. all bits in 0x3F are 0) and enough space is left behind to store the 4-byte pointer for the original allocation, referenced by p1.

It has been forever since i did alignment math, so if i f'ed up the bits, please someone correct this.

like image 43
WhozCraig Avatar answered Sep 28 '22 06:09

WhozCraig