
How to determine if memory is aligned?

I am new to optimizing code with SSE/SSE2 instructions and until now I have not gotten very far. To my knowledge a common SSE-optimized function would look like this:

void sse_func(const float* const ptr, int len){
    if( ptr is aligned )
    {
        for( ... ){
            // unroll loop by 4 or 2 elements
        }
        for( ... ){
            // handle the rest
            // (non-optimized code)
        }
    } else {
        for( ... ){
            // regular C code to handle non-aligned memory
        }
    }
}

However, how do I correctly determine whether the memory that ptr points to is aligned to, e.g., 16 bytes? I think I have to include the regular C code path for non-aligned memory, as I cannot be sure that every block of memory passed to this function will be aligned. And using the intrinsics to load data from unaligned memory into the SSE registers seems to be horribly slow (even slower than regular C code).

Thank you in advance...

user229898 asked Dec 13 '09 23:12





2 Answers

#include <stdint.h> /* uintptr_t */

#define is_aligned(POINTER, BYTE_COUNT) \
    (((uintptr_t)(const void *)(POINTER)) % (BYTE_COUNT) == 0)

The cast to void * (or, equivalently, char *) is necessary because the standard only guarantees an invertible conversion to uintptr_t for void *.

If you want type safety, consider using an inline function:

static inline _Bool is_aligned(const void *restrict pointer, size_t byte_count)
{
    return (uintptr_t)pointer % byte_count == 0;
}

and rely on the compiler to reduce the modulo to a bit mask when byte_count is a compile-time constant power of two.
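To connect this back to the question, here is a minimal sketch of how the check could select between an SSE path and a plain C path. The routine name sum_floats is hypothetical and not part of either answer; is_aligned is the inline function above, written without restrict. The sketch assumes a compiler that defines __SSE__ and ships <xmmintrin.h> when SSE is available; on other targets only the scalar loop is compiled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE__)
#include <xmmintrin.h> /* _mm_load_ps, _mm_add_ps, ... */
#endif

/* Same check as the inline function from the answer. */
static inline _Bool is_aligned(const void *pointer, size_t byte_count)
{
    return (uintptr_t)pointer % byte_count == 0;
}

/* Hypothetical example routine: returns the sum of len floats. */
static float sum_floats(const float *ptr, size_t len)
{
    assert(ptr != NULL || len == 0);

    float total = 0.0f;
    size_t i = 0;

#if defined(__SSE__)
    if (is_aligned(ptr, 16)) {
        /* Aligned path: _mm_load_ps requires a 16-byte-aligned address. */
        __m128 acc = _mm_setzero_ps();
        for (; i + 4 <= len; i += 4)
            acc = _mm_add_ps(acc, _mm_load_ps(ptr + i));

        /* Add the four lanes of the accumulator together. */
        float lanes[4];
        _mm_storeu_ps(lanes, acc);
        total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    }
#endif

    /* Scalar path: handles the leftover elements after the SSE loop,
       or the whole array when ptr is not 16-byte aligned. */
    for (; i < len; ++i)
        total += ptr[i];

    return total;
}
```

Because i is shared between the two loops, the scalar loop serves both as the "handle the rest" loop and as the fallback for unaligned input, so the function needs no separate else branch.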

Why do we need to convert to void * ?

The C language allows different representations for different pointer types; for example, you could have a 64-bit void * type (the whole address space) and a 32-bit foo * type (a segment).

The conversion foo * -> void * might involve an actual computation, e.g., adding an offset. The standard also leaves it up to the implementation what happens when converting (arbitrary) pointers to integers, but I suspect that it is often implemented as a no-op.

For such an implementation, foo * -> uintptr_t -> foo * would work, but foo * -> uintptr_t -> void * and void * -> uintptr_t -> foo * wouldn't. The alignment computation would also not work reliably because you only check alignment relative to the segment offset, which might or might not be what you want.

In conclusion: Always use void * to get implementation-independent behaviour.

Christoph answered Oct 16 '22 01:10



EDIT: casting to long is a cheap way to protect oneself against the most likely possibility of int and pointers being different sizes nowadays.

As pointed out in the comments below, there are better solutions if you are willing to include a header...

A pointer p is aligned on a 16-byte boundary iff ((unsigned long)p & 15) == 0.
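The "better solution" with a header is presumably uintptr_t from <stdint.h>, which is defined to hold any void * value, whereas unsigned long is only 32 bits wide on some 64-bit platforms (64-bit Windows, for instance), so the cast may truncate. A minimal sketch of the same mask test under that assumption follows; the helper name is_aligned_pow2 is hypothetical, and the mask trick is only valid when the alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: alignment must be a nonzero power of two,
   because (alignment - 1) is used as a mask for the low address bits. */
static inline int is_aligned_pow2(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}
```

With alignment equal to 16 the mask is 15, so the test is the same expression as above with the cast type swapped.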

Pascal Cuoq answered Oct 16 '22 01:10
