What alignment issues limit the use of a block of memory created by malloc?

Tags:

c

dynamic-memory-allocation

memory

memory-alignment

I am writing a library for various mathematical computations in C. Several of these need some "scratch" space -- memory that is used for intermediate calculations. The space required depends on the size of the inputs, so it cannot be statically allocated. The library will typically be used to perform many iterations of the same type of calculation with the same size inputs, so I'd prefer not to malloc and free inside the library for each call; it would be much more efficient to allocate a large enough block once, re-use it for all the calculations, then free it.

My intended strategy is to request a void pointer to a single block of memory, perhaps with an accompanying allocation function. Say, something like this:

void *allocateScratch(size_t rows, size_t columns);
void doCalculation(size_t rows, size_t columns, double *data, void *scratch);

The idea is that if the user intends to do several calculations of the same size, he may use the allocate function to grab a block that is large enough, then use that same block of memory to perform the calculation for each of the inputs. The allocate function is not strictly necessary, but it simplifies the interface and makes it easier to change the storage requirements in the future, without each user of the library needing to know exactly how much space is required.

In many cases, the block of memory I need is just a large array of type double, no problems there. But in some cases I need mixed data types -- say a block of doubles AND a block of integers. My code needs to be portable and should conform to the ANSI standard. I know that it is OK to cast a void pointer to any other pointer type, but I'm concerned about alignment issues if I try to use the same block for two types.

So, specific example. Say I need a block of 3 doubles and 5 ints. Can I implement my functions like this:

void *allocateScratch(...) {
    return malloc(3 * sizeof(double) + 5 * sizeof(int));
}

void doCalculation(..., void *scratch) {
    double *dblArray = scratch;
    int *intArray = (int *)((unsigned char *)scratch + 3 * sizeof(double));
}

Is this legal? The alignment probably works out OK in this example, but what if I switch it around and put the int block first and the double block second? That would shift the alignment of the doubles (assuming 64-bit doubles and 32-bit ints). Is there a better way to do this? Or a more standard approach I should consider?

My biggest goals are as follows:

I'd like to use a single block if possible so the user doesn't have to deal with multiple blocks or a changing number of blocks required.
I'd like the block to be a valid block obtained by malloc so the user can call free when finished. This means I don't want to do something like creating a small struct that has pointers to each block and then allocating each block separately, which would require a special destroy function; I'm willing to do that if that's the "only" way.
The algorithms and memory requirements may change, so I'm trying to use the allocate function so that future versions can get different amounts of memory for potentially different types of data without breaking backward compatibility.

Maybe this issue is addressed in the C standard, but I haven't been able to find it.


asked Jan 15 '14 06:01

Jeremy West

4 Answers

The memory of a single malloc can be partitioned for use in multiple arrays as shown below.

Suppose we want arrays of types A, B, and C with NA, NB, and NC elements. We do this:

size_t Offset = 0;

ptrdiff_t OffsetA = Offset;           // Put array at current offset.
Offset += NA * sizeof(A);             // Move offset to end of array.

Offset = RoundUp(Offset, sizeof(B));  // Align sufficiently for type.
ptrdiff_t OffsetB = Offset;           // Put array at current offset.
Offset += NB * sizeof(B);             // Move offset to end of array.

Offset = RoundUp(Offset, sizeof(C));  // Align sufficiently for type.
ptrdiff_t OffsetC = Offset;           // Put array at current offset.
Offset += NC * sizeof(C);             // Move offset to end of array.

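unsigned char *Memory = malloc(Offset);  // Allocate memory (check for NULL before use).

// Set pointers for arrays. The casts are required: C does not implicitly
// convert unsigned char * to other object pointer types.
A *pA = (A *)(Memory + OffsetA);
B *pB = (B *)(Memory + OffsetB);
C *pC = (C *)(Memory + OffsetC);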

where RoundUp is:

// Return Offset rounded up to a multiple of Size.
size_t RoundUp(size_t Offset, size_t Size)
{
    size_t x = Offset + Size - 1;
    return x - x % Size;
}

This uses the fact, as noted by R.., that the size of a type must be a multiple of the alignment requirement for that type. In C 2011, sizeof in the RoundUp calls can be changed to _Alignof, and this may save a small amount of space when the alignment requirement of a type is less than its size.
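Applied to the question's example with the int block placed first, the same idea might look like the sketch below. It assumes a C11 compiler (for _Alignof); the names ScratchLayout, scratch_layout, allocate_scratch, and split_scratch are hypothetical and not part of any existing library, and overflow of the size arithmetic is not checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: round offset up to a multiple of align (align > 0). */
static size_t round_up(size_t offset, size_t align)
{
    assert(align > 0);
    size_t x = offset + align - 1;
    return x - x % align;
}

/* Hypothetical layout descriptor for one scratch block. */
typedef struct {
    size_t int_offset;     /* byte offset of the int array */
    size_t double_offset;  /* byte offset of the double array */
    size_t total_size;     /* bytes to request from malloc */
} ScratchLayout;

/* Compute offsets for n_ints ints followed by n_doubles doubles. */
static ScratchLayout scratch_layout(size_t n_ints, size_t n_doubles)
{
    ScratchLayout layout;
    size_t offset = 0;

    layout.int_offset = offset;            /* ints start the block */
    offset += n_ints * sizeof(int);

    offset = round_up(offset, _Alignof(double));  /* pad so doubles are aligned */
    layout.double_offset = offset;
    offset += n_doubles * sizeof(double);

    layout.total_size = offset;
    return layout;
}

/* Allocate one block; the caller releases it with plain free(). */
static void *allocate_scratch(size_t n_ints, size_t n_doubles)
{
    return malloc(scratch_layout(n_ints, n_doubles).total_size);
}

/* Recover the typed pointers from the block inside a calculation. */
static void split_scratch(void *scratch, size_t n_ints, size_t n_doubles,
                          int **ints, double **doubles)
{
    ScratchLayout layout = scratch_layout(n_ints, n_doubles);
    unsigned char *base = scratch;
    *ints = (int *)(base + layout.int_offset);
    *doubles = (double *)(base + layout.double_offset);
}
```

Because the layout is recomputed from the sizes on every call, the block itself stays a plain malloc result with no header, which matches the goal of letting the user call free directly.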

answered Oct 17 '22 09:10

Eric Postpischil

If the user is calling your library's allocation function, then they should call your library's freeing function. This is very typical (and good) interface design.

So I would say just go with the struct of pointers to different pools for your different types. That's clean, simple, and portable, and anybody who reads your code will see exactly what you are up to.
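A minimal sketch of that interface, assuming one pool of doubles and one of ints. The names Scratch, scratch_create, and scratch_destroy are hypothetical, and the sketch assumes nonzero element counts, since malloc(0) may legitimately return NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical scratch handle: one separately allocated pool per type. */
typedef struct {
    double *doubles;
    int *ints;
} Scratch;

/* Allocate the handle and both pools; returns NULL if any allocation fails. */
Scratch *scratch_create(size_t n_doubles, size_t n_ints)
{
    assert(n_doubles > 0 && n_ints > 0);  /* assumption of this sketch */

    Scratch *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;

    s->doubles = malloc(n_doubles * sizeof *s->doubles);
    s->ints = malloc(n_ints * sizeof *s->ints);
    if (s->doubles == NULL || s->ints == NULL) {
        free(s->doubles);  /* free(NULL) is a no-op */
        free(s->ints);
        free(s);
        return NULL;
    }
    return s;
}

/* Release everything scratch_create allocated; NULL is accepted. */
void scratch_destroy(Scratch *s)
{
    if (s == NULL)
        return;
    free(s->doubles);
    free(s->ints);
    free(s);
}
```

Each pool comes straight from malloc, so alignment is never in question, and future versions can add pools to the struct without changing how callers create or destroy it.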

If you do not mind wasting memory and insist on a single block, you could create a union with all of your types and then allocate an array of those...
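As a rough illustration of the union idea (ScratchCell and allocate_cells are hypothetical names, and the sketch assumes at least one element is requested):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cell type: sized and aligned for its largest member. */
typedef union {
    double d;
    int i;
} ScratchCell;

/* One cell per element, whatever its type; released with plain free(). */
ScratchCell *allocate_cells(size_t n_doubles, size_t n_ints)
{
    assert(n_doubles + n_ints > 0);  /* assumption of this sketch */
    return malloc((n_doubles + n_ints) * sizeof(ScratchCell));
}
```

With 3 doubles and 5 ints, cells 0 to 2 would be read through .d and cells 3 to 7 through .i. The cost is that every int occupies a full cell, so the ints are not packed as a contiguous int array and cannot be handed to code that expects an int pointer.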

Trying to find appropriately aligned memory in a massive block is just a mess. I am not even sure you can do it portably. What's the plan? Cast pointers to intptr_t, do some rounding, then cast back to a pointer?

answered Oct 17 '22 08:10

Nemo

The C11 standard has the max_align_t type (along with the _Alignas specifier, the _Alignof operator, and the <stdalign.h> header).

The GCC compiler has a __BIGGEST_ALIGNMENT__ macro (giving the largest alignment used for any data type on the target). It also provides some extensions related to alignment.

Often, using 2*sizeof(void*) (as the biggest relevant alignment) is in practice quite safe (at least on most of the systems I have heard about these days; but one could imagine weird processors and systems where it is not the case, perhaps some DSPs). To be sure, study the details of the ABI and calling conventions of your particular implementation, e.g. x86-64 ABI and x86 calling conventions...

And the system malloc is guaranteed to return a sufficiently aligned pointer (for all purposes).
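Since the block returned by malloc starts at an address suitable for any fundamental type, any offset into it that is a multiple of _Alignof(max_align_t) should be suitable as well. A small sketch, assuming C11 and using the hypothetical name align_for_any_type:

```c
#include <assert.h>
#include <stddef.h>   /* size_t, max_align_t (C11) */

/* Round offset up to a multiple of the strictest fundamental alignment. */
size_t align_for_any_type(size_t offset)
{
    const size_t align = _Alignof(max_align_t);
    assert(offset <= (size_t)-1 - (align - 1));  /* guard against wrap-around */
    return (offset + align - 1) / align * align;
}
```

Aligning every sub-block this way wastes a few bytes per block but means the library never needs to know which types a future version will store there.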

On some systems, targets, and processors, giving a larger alignment might give a performance benefit (notably when asking the compiler to optimize). You may have to (or want to) tell the compiler about that, e.g. on GCC using variable attributes...

Don't forget that according to Fulton

there is no such thing as portable software, only software that has been ported.

but intptr_t and max_align_t are here to help you.

answered Oct 17 '22 09:10

Basile Starynkevitch

Note that the required alignment for any type must evenly divide the size of the type; this is a consequence of the representation of array types. Thus, in the absence of C11 features to determine the required alignment for a type, you can just estimate conservatively and use the type's size. In other words, if you want to carve up part of an allocation from malloc for use storing doubles, make sure it starts at an offset that's a multiple of sizeof(double).
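The divisibility claim can be checked at compile time where C11 is available, and the conservative placement rule needs nothing beyond sizeof. In this sketch, next_double_offset is a hypothetical helper name; the static_assert lines are only a check and can be dropped for pre-C11 compilers.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* The size of a type is a multiple of its alignment requirement. */
static_assert(sizeof(double) % _Alignof(double) == 0,
              "sizeof(double) is a multiple of its alignment");
static_assert(sizeof(int) % _Alignof(int) == 0,
              "sizeof(int) is a multiple of its alignment");

/* Conservative placement: the first multiple of sizeof(double)
   at or after offset, counted from the start of the malloc block. */
size_t next_double_offset(size_t offset)
{
    size_t rem = offset % sizeof(double);
    return rem == 0 ? offset : offset + (sizeof(double) - rem);
}
```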

answered Oct 17 '22 08:10

R.. GitHub STOP HELPING ICE
