
What alignment issues limit the use of a block of memory created by malloc?

I am writing a library for various mathematical computations in C. Several of these need some "scratch" space -- memory that is used for intermediate calculations. The space required depends on the size of the inputs, so it cannot be statically allocated. The library will typically be used to perform many iterations of the same type of calculation with the same size inputs, so I'd prefer not to malloc and free inside the library for each call; it would be much more efficient to allocate a large enough block once, re-use it for all the calculations, then free it.

My intended strategy is to request a void pointer to a single block of memory, perhaps with an accompanying allocation function. Say, something like this:

void *allocateScratch(size_t rows, size_t columns);
void doCalculation(size_t rows, size_t columns, double *data, void *scratch);

The idea is that if the user intends to do several calculations of the same size, he may use the allocate function to grab a block that is large enough, then use that same block of memory to perform the calculation for each of the inputs. The allocate function is not strictly necessary, but it simplifies the interface and makes it easier to change the storage requirements in the future, without each user of the library needing to know exactly how much space is required.

In many cases, the block of memory I need is just a large array of type double, no problems there. But in some cases I need mixed data types -- say a block of doubles AND a block of integers. My code needs to be portable and should conform to the ANSI standard. I know that it is OK to cast a void pointer to any other pointer type, but I'm concerned about alignment issues if I try to use the same block for two types.

So, specific example. Say I need a block of 3 doubles and 5 ints. Can I implement my functions like this:

void *allocateScratch(...) {
    return malloc(3 * sizeof(double) + 5 * sizeof(int));
}

void doCalculation(..., void *scratch) {
    double *dblArray = scratch;
    int *intArray = (int *)((unsigned char *)scratch + 3 * sizeof(double));
}

Is this legal? The alignment probably works out OK in this example, but what if I switch it around and put the int block first and the double block second? That would shift the alignment of the doubles (assuming 64-bit doubles and 32-bit ints). Is there a better way to do this? Or a more standard approach I should consider?

My biggest goals are as follows:

  • I'd like to use a single block if possible so the user doesn't have to deal with multiple blocks or a changing number of blocks required.
  • I'd like the block to be a valid block obtained by malloc so the user can call free when finished. This means I don't want to do something like creating a small struct that has pointers to each block and then allocating each block separately, which would require a special destroy function. I'm willing to do that if it's the "only" way.
  • The algorithms and memory requirements may change, so I'm trying to use the allocate function so that future versions can get different amounts of memory for potentially different types of data without breaking backward compatibility.

Maybe this issue is addressed in the C standard, but I haven't been able to find it.

Jeremy West asked Jan 15 '14 06:01




4 Answers

The memory of a single malloc can be partitioned for use in multiple arrays as shown below.

Suppose we want arrays of types A, B, and C with NA, NB, and NC elements. We do this:

size_t Offset = 0;

ptrdiff_t OffsetA = Offset;           // Put array at current offset.
Offset += NA * sizeof(A);             // Move offset to end of array.

Offset = RoundUp(Offset, sizeof(B));  // Align sufficiently for type.
ptrdiff_t OffsetB = Offset;           // Put array at current offset.
Offset += NB * sizeof(B);             // Move offset to end of array.

Offset = RoundUp(Offset, sizeof(C));  // Align sufficiently for type.
ptrdiff_t OffsetC = Offset;           // Put array at current offset.
Offset += NC * sizeof(C);             // Move offset to end of array.

unsigned char *Memory = malloc(Offset);  // Allocate memory.

// Set pointers for arrays.
A *pA = (A *)(Memory + OffsetA);
B *pB = (B *)(Memory + OffsetB);
C *pC = (C *)(Memory + OffsetC);

where RoundUp is:

// Return Offset rounded up to a multiple of Size.
size_t RoundUp(size_t Offset, size_t Size)
{
    size_t x = Offset + Size - 1;
    return x - x % Size;
}

This uses the fact, as noted by R.., that the size of a type must be a multiple of the alignment requirement for that type. In C 2011, sizeof in the RoundUp calls can be changed to _Alignof, and this may save a small amount of space when the alignment requirement of a type is less than its size.
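As a rough illustration of the _Alignof variant mentioned above, here is a minimal sketch for the asker's case of ints followed by doubles. The names ScratchLayout, round_up and scratch_layout are hypothetical and not part of the answer; the sketch assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout descriptor: byte offsets of each array in the block. */
typedef struct {
    size_t int_offset;    /* where the int array starts */
    size_t double_offset; /* where the double array starts */
    size_t total;         /* total bytes to request from malloc */
} ScratchLayout;

/* Round offset up to a multiple of align (align must be nonzero). */
static size_t round_up(size_t offset, size_t align)
{
    assert(align != 0);
    size_t x = offset + align - 1;
    return x - x % align;
}

/* Compute a layout holding n_ints ints followed by n_doubles doubles. */
static ScratchLayout scratch_layout(size_t n_ints, size_t n_doubles)
{
    ScratchLayout l;
    size_t offset = 0;

    l.int_offset = offset;                        /* ints start the block */
    offset += n_ints * sizeof(int);

    offset = round_up(offset, _Alignof(double));  /* align for double (C11) */
    l.double_offset = offset;
    offset += n_doubles * sizeof(double);

    l.total = offset;
    return l;
}
```

A caller would malloc l.total bytes, form the two array pointers by adding the offsets to an unsigned char pointer to the block and casting, and later pass the original pointer to free.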

Eric Postpischil answered Oct 17 '22 09:10


If the user is calling your library's allocation function, then they should call your library's freeing function. This is very typical (and good) interface design.

So I would say just go with the struct of pointers to different pools for your different types. That's clean, simple, and portable, and anybody who reads your code will see exactly what you are up to.
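A struct-of-pointers interface along these lines might look like the sketch below. Scratch, scratch_create and scratch_destroy are hypothetical names chosen for illustration, and the sketch assumes both element counts are nonzero.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical scratch handle: one separately allocated pool per type. */
typedef struct {
    double *doubles;
    int    *ints;
} Scratch;

/* Allocate both pools; returns NULL if any allocation fails. */
static Scratch *scratch_create(size_t n_doubles, size_t n_ints)
{
    assert(n_doubles > 0 && n_ints > 0);  /* assumption: nonzero counts */

    Scratch *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;

    s->doubles = malloc(n_doubles * sizeof *s->doubles);
    s->ints    = malloc(n_ints * sizeof *s->ints);
    if (s->doubles == NULL || s->ints == NULL) {
        free(s->doubles);  /* free(NULL) is a no-op */
        free(s->ints);
        free(s);
        return NULL;
    }
    return s;
}

/* Release everything scratch_create allocated; accepts NULL. */
static void scratch_destroy(Scratch *s)
{
    if (s == NULL)
        return;
    free(s->doubles);
    free(s->ints);
    free(s);
}
```

Each pool comes straight from malloc, so alignment is never in question. The cost is that the user must call the library's destroy function rather than free.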

If you do not mind wasting memory and insist on a single block, you could create a union with all of your types and then allocate an array of those...
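The union idea could be sketched as follows. ScratchSlot and slots_alloc are hypothetical names, and the sketch assumes the only types needed are double and int.

```c
#include <assert.h>
#include <stdlib.h>

/* Each slot is large enough and aligned enough for either member. */
typedef union {
    double d;
    int    i;
} ScratchSlot;

/* Allocate n_doubles + n_ints slots in one block the caller can free(). */
static ScratchSlot *slots_alloc(size_t n_doubles, size_t n_ints)
{
    assert(n_doubles + n_ints > 0);  /* assumption: at least one slot */
    return malloc((n_doubles + n_ints) * sizeof(ScratchSlot));
}
```

With 3 doubles and 5 ints, slots 0 to 2 would be accessed through .d and slots 3 to 7 through .i. Every int then occupies a full sizeof(ScratchSlot), which is the wasted memory mentioned above, and the ints are not contiguous, so they cannot be passed to code that expects a plain int array.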

Trying to find appropriately aligned memory in a massive block is just a mess. I am not even sure you can do it portably. What's the plan? Cast pointers to intptr_t, do some rounding, then cast back to a pointer?

Nemo answered Oct 17 '22 08:10


The latest C11 standard has the max_align_t type (and _Alignas specifier and _Alignof operator and <stdalign.h> header).

The GCC compiler has a __BIGGEST_ALIGNMENT__ macro (giving the largest alignment used for any data type on the target). It also provides some extensions related to alignment.

Often, using 2*sizeof(void*) (as the biggest relevant alignment) is in practice quite safe, at least on most of the systems I have heard about these days; but one could imagine unusual processors and systems where it is not the case, perhaps some DSPs. To be sure, study the details of the ABI and calling conventions of your particular implementation, e.g. x86-64 ABI and x86 calling conventions...

And the system malloc is guaranteed to return a pointer that is suitably aligned for any object type.
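One way to observe this on a given implementation is to check a malloc result against _Alignof(max_align_t). The helper is_aligned below is a hypothetical name; the sketch assumes C11 and that uintptr_t is available (it is an optional type, and the pointer-to-integer mapping is implementation-defined).

```c
#include <assert.h>
#include <stddef.h>   /* max_align_t (C11) */
#include <stdint.h>   /* uintptr_t (optional type) */
#include <stdlib.h>

/* Returns 1 if p, converted to an integer, is a multiple of align. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0);
    return (uintptr_t)p % align == 0;
}
```

For a block p returned by malloc, is_aligned(p, _Alignof(max_align_t)) is expected to hold on a conforming implementation. This is a diagnostic aid only; partitioning a block by offsets from its start needs no pointer-to-integer conversion at all.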

On some systems, targets, and processors, a larger alignment might give a performance benefit (notably when asking the compiler to optimize). You may have to (or want to) tell the compiler about that, e.g. on GCC using variable attributes...

Don't forget that according to Fulton

there is no such thing as portable software, only software that has been ported.

but intptr_t and max_align_t are here to help you....

Basile Starynkevitch answered Oct 17 '22 09:10


Note that the required alignment for any type must evenly divide the size of the type; this is a consequence of the representation of array types. Thus, in the absence of C11 features to determine the required alignment for a type, you can just estimate conservatively and use the type's size. In other words, if you want to carve up part of an allocation from malloc for use storing doubles, make sure it starts at an offset that's a multiple of sizeof(double).
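Applied to the question's reversed case (ints first, doubles second), the conservative sizeof-based rule might be sketched like this without any C11 features. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Byte offset at which a double array may start after n_ints ints:
   the size of the int array rounded up to a multiple of sizeof(double). */
static size_t double_offset_after_ints(size_t n_ints)
{
    size_t bytes = n_ints * sizeof(int);
    size_t rem = bytes % sizeof(double);
    return rem == 0 ? bytes : bytes + (sizeof(double) - rem);
}

/* Allocate one block for n_ints ints followed by n_doubles doubles.
   The caller releases it with a plain free(). */
static void *alloc_ints_then_doubles(size_t n_ints, size_t n_doubles)
{
    assert(n_ints + n_doubles > 0);  /* assumption: nonempty request */
    return malloc(double_offset_after_ints(n_ints)
                  + n_doubles * sizeof(double));
}
```

The int array starts at the beginning of the block, and the double array starts at double_offset_after_ints(n_ints) bytes past it. At most sizeof(double) - 1 bytes of padding are lost.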

R.. GitHub STOP HELPING ICE answered Oct 17 '22 08:10