Why use _mm_malloc? (as opposed to _aligned_malloc, aligned_alloc, or posix_memalign)

There are a few options for acquiring an aligned block of memory but they're very similar and the issue mostly boils down to what language standard and platforms you're targeting.

C11

void * aligned_alloc (size_t alignment, size_t size)

POSIX

int posix_memalign (void **memptr, size_t alignment, size_t size)

Windows

void * _aligned_malloc(size_t size, size_t alignment);
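A minimal sketch of how the C11 and POSIX calls differ in use, assuming a hosted C11 toolchain on a POSIX system. The helper names alloc_c11 and alloc_posix are made up for illustration, and both helpers assume the alignment is a nonzero power of two.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for posix_memalign even in strict ISO mode */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: C11 aligned_alloc takes the alignment first.
   C11 asks for the size to be a multiple of the alignment, so round up. */
static void *alloc_c11(size_t alignment, size_t size)
{
    size_t rounded = (size + alignment - 1) / alignment * alignment;
    return aligned_alloc(alignment, rounded);
}

/* Hypothetical helper: posix_memalign returns an error code and hands the
   block back through an out-parameter. The alignment must also be a
   multiple of sizeof(void *). */
static void *alloc_posix(size_t alignment, size_t size)
{
    void *p = NULL;
    if (posix_memalign(&p, alignment, size) != 0)
        return NULL;
    return p;
}
```

Blocks from either helper are released with plain free(). The Windows function instead pairs with _aligned_free and takes its arguments in the opposite order (size first).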

And of course it's also always an option to align by hand.

Intel offers another option.

Intel

void* _mm_malloc (int size, int align)

void _mm_free (void *p)

Based on source code released by Intel, this seems to be the method of allocating aligned memory their engineers prefer but I can't find any documentation comparing it to other methods. The closest I found simply acknowledges that other aligned memory allocation routines exist.

https://software.intel.com/en-us/articles/memory-management-for-optimal-performance-on-intel-xeon-phi-coprocessor-alignment-and

To dynamically allocate a piece of aligned memory, use posix_memalign, which is supported by GCC as well as the Intel Compiler. The benefit of using it is that you don’t have to change the memory disposal API. You can use free() as you always do. But pay attention to the parameter profile:

int posix_memalign (void **memptr, size_t align, size_t size);

The Intel Compiler also provides another set of memory allocation APIs. C/C++ programmers can use _mm_malloc and _mm_free to allocate and free aligned blocks of memory. For example, the following statement requests a 64-byte aligned memory block for 8 floating point elements.

farray = (float *)_mm_malloc(8*sizeof(float), 64);

Memory that is allocated using _mm_malloc must be freed using _mm_free. Calling free on memory allocated with _mm_malloc or calling _mm_free on memory allocated with malloc will result in unpredictable behavior.
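The pairing rule can be sketched as follows, assuming an x86 build with GCC or Clang, where <xmmintrin.h> declares _mm_malloc and _mm_free (other compilers may use a different header). The helper names alloc_floats_64 and is_aligned are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* assumption: x86 target; declares _mm_malloc/_mm_free */

/* Hypothetical helper: n floats on a 64-byte boundary.
   _mm_malloc takes the size first and the alignment second. */
static float *alloc_floats_64(size_t n)
{
    return (float *)_mm_malloc(n * sizeof(float), 64);
}

/* Hypothetical helper: nonzero when p sits on an align-byte boundary. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}
```

A caller would write float *farray = alloc_floats_64(8); and later release it with _mm_free(farray), never with free(farray).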

The clear differences from a user perspective are that _mm_malloc requires direct CPU and compiler support, and that memory allocated with _mm_malloc must be freed with _mm_free. Given these drawbacks, what is the reason for ever using _mm_malloc? Can it have a slight performance advantage? Is it a historical accident?

asked Sep 16 '15 15:09

Praxeolitic

2 Answers

Intel compilers support POSIX (Linux) and non-POSIX (Windows) operating systems, hence cannot rely upon either the POSIX or the Windows function. Thus, a compiler-specific but OS-agnostic solution was chosen.

C11 is a great solution but Microsoft doesn't even support C99 yet, so who knows if they will ever support C11.

Update: Unlike the C11/POSIX/Windows allocation functions, the ICC intrinsics include a deallocation function. This allows this API to use a separate heap manager from the default one. I don't know if/when it actually does that, but it can be useful to support this model.

Disclaimer: I work for Intel but have no special knowledge of these decisions, which happened long before I joined the company.

answered Sep 20 '22 12:09

Jeff Hammond

It's possible to take an existing C compiler which does not presently happen to use the identifiers _mm_malloc and _mm_free and define functions with those names which will behave as required. This could be done in two ways. One is to have _mm_malloc act as a wrapper around malloc(): it asks for a slightly oversized allocation, constructs a pointer to the first suitably aligned address within it that is at least one byte from the beginning, and stores the number of bytes skipped immediately before that address. The other is to have _mm_malloc request large chunks of memory from malloc() and then dispense them piecemeal. In either case, the pointers returned by _mm_malloc() would not be pointers that free() would generally know how to do anything with. With the wrapper approach, _mm_free would use the byte immediately preceding the allocation to find the real start of the block received from malloc, and then pass that to free.
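A minimal sketch of the malloc-wrapper approach, not Intel's actual implementation. The names wrap_aligned_malloc and wrap_aligned_free are hypothetical, and the sketch assumes the alignment is a power of two no larger than 128 so that the skip count fits in the single byte stored before the returned address.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: over-allocate, then return the first aligned
   address that lies at least one byte past the start of the raw block. */
static void *wrap_aligned_malloc(size_t size, size_t align)
{
    /* Reject anything that is not a power of two in the range 1..128. */
    if (align == 0 || (align & (align - 1)) != 0 || align > 128)
        return NULL;
    if (size > SIZE_MAX - align)
        return NULL;

    unsigned char *raw = malloc(size + align);
    if (raw == NULL)
        return NULL;

    /* Round (raw + align) down to a multiple of align: the result is
       between 1 and align bytes past raw. */
    uintptr_t addr = ((uintptr_t)raw + align) & ~(uintptr_t)(align - 1);
    unsigned char *user = raw + (addr - (uintptr_t)raw);

    /* Record how many bytes were skipped, just before the user block. */
    user[-1] = (unsigned char)(user - raw);
    return user;
}

/* Hypothetical counterpart: read the skip count back and free the
   pointer that malloc originally returned. */
static void wrap_aligned_free(void *p)
{
    if (p == NULL)
        return;
    unsigned char *user = p;
    free(user - user[-1]);
}
```

Passing a pointer from wrap_aligned_malloc straight to free() would hand the allocator an address it never returned, which is why the two functions have to be used as a pair.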

If an aligned-allocate function is allowed to use the internals of the malloc and free functions, however, that may eliminate the need for the extra layer of wrapping. It's possible to write _mm_malloc()/_mm_free() functions which wrap malloc/free without knowing anything about their internals, but that requires _mm_malloc() to keep bookkeeping information separate from that used by malloc/free.

If the author of an aligned-allocate function knows how malloc and free are implemented, it will often be possible to coordinate the design of all the allocation/free functions so that free can distinguish all kinds of allocations and handle them appropriately. No single aligned-allocate implementation would be usable on all malloc/free implementations, however.

I would suggest that the most portable way to write code would probably be to select a couple of symbols that are not used anywhere else for your own allocate and free functions, so that you could then say, e.g.

#define a_alloc(align,sz) _mm_malloc((sz),(align))
#define a_free(ptr)  _mm_free((ptr))

on compilers that support that, or

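#include <stdlib.h>

static inline void *aa_alloc(size_t align, size_t size)
{
  void *ret = 0;
  /* posix_memalign returns 0 on success; align must be a power of two
     and a multiple of sizeof(void *) */
  if (posix_memalign(&ret, align, size) != 0)
    return 0;
  return ret;
}
#define a_alloc(align,sz) aa_alloc((align),(sz))
#define a_free(ptr)  free((ptr))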

on POSIX systems, etc. For every system it should be possible to define macros or functions that will yield the necessary behavior. [I think it's probably better to use macros consistently than to sometimes use macros and sometimes functions, so that #if defined macroname can test whether things are defined yet.]
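Put together, the selection might look like the sketch below. A_USE_MM_MALLOC is a hypothetical opt-in switch invented for this example, and the platform tests are assumptions rather than a complete list. Note that _mm_malloc and _aligned_malloc take the size first and the alignment second, while posix_memalign takes the alignment before the size.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign in strict ISO mode */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(A_USE_MM_MALLOC)            /* hypothetical opt-in switch */
#  include <xmmintrin.h>
#  define a_alloc(align, sz) _mm_malloc((sz), (align))
#  define a_free(ptr)        _mm_free((ptr))
#elif defined(_WIN32)
#  include <malloc.h>
#  define a_alloc(align, sz) _aligned_malloc((sz), (align))
#  define a_free(ptr)        _aligned_free((ptr))
#elif defined(__unix__) || defined(__APPLE__)
/* POSIX fallback: returns NULL on failure, like the other two options. */
static inline void *aa_alloc(size_t align, size_t size)
{
    void *ret = NULL;
    return posix_memalign(&ret, align, size) == 0 ? ret : NULL;
}
#  define a_alloc(align, sz) aa_alloc((align), (sz))
#  define a_free(ptr)        free((ptr))
#endif

/* Because every branch defines macros, one test covers all of them. */
#if !defined(a_alloc) || !defined(a_free)
#  error "no aligned allocator selected for this platform"
#endif
```

Client code then calls only a_alloc(32, n * sizeof(double)) and a_free(p), so the matching deallocator is always chosen together with the allocator.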

answered Sep 19 '22 12:09

supercat

Tags:

c

memory-management

dynamic-memory-allocation

intel
