This question is not about the definition of unaligned data accesses, but about why memcpy
silences the UBSan findings whereas type casting does not, despite both generating the same assembly code.
I have some example code to parse a protocol that sends a byte array segmented into groups of six bytes.
void f(u8 *ba) {
// I know this array's length is a multiple of 6
u8 *p = ba;
u32 a = *(u32 *)p;
printf("a = %d\n", a);
p += 4;
u16 b = *(u16 *)p;
printf("b = %d\n", b);
p += 2;
a = *(u32 *)p;
printf("a = %d\n", a);
p += 4;
b = *(u16 *)p;
printf("b = %d\n", b);
}
After incrementing my pointer by 6 and doing another 32-bit read, UBSan reports an error about a misaligned load. I can suppress this error by using memcpy instead of type-punning, but I don't have a good understanding of why. To be clear, here is the same routine without UBSan errors:
void f(u8 *ba) {
// I know this array's length is a multiple of 6
u8 *p = ba;
u32 a;
u16 b; // was missing: b must be declared before memcpy writes into it
memcpy(&a, p, 4);
printf("a = %d\n", a);
p += 4;
memcpy(&b, p, 2);
printf("b = %d\n", b);
p += 2;
memcpy(&a, p, 4);
printf("a = %d\n", a);
p += 4;
memcpy(&b, p, 2);
printf("b = %d\n", b);
}
Both routines compile to identical assembly code (using movl for the 32-bit read and movzwl for the 16-bit read), so why is one undefined behaviour when the other is not? Does memcpy have some special properties that guarantee something?
I don't want to use memcpy here because I can't rely on compilers doing a good enough job optimising it.
UB sanitizer is used to detect that the code is not strictly-conforming and depends, in fact, on undefined behaviour that is not guaranteed.
Actually the C standard says that the behaviour is undefined as soon as you cast a pointer to a type for which the address is not suitably aligned. C11 (draft, n1570) 6.3.2.3p7:
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.
I.e.
u8 *p = ba;
u32 *a = (u32 *)p; // undefined behaviour if misaligned. No dereference required
The presence of this cast allows a compiler to presume that ba was aligned to a 4-byte boundary (on a platform where u32 is required to be thus aligned, which many compilers will do on x86), after which it can generate code that assumes the alignment.
Even on the x86 platform, there are instructions that fail spectacularly on misaligned operands (the aligned SSE loads and stores, for example): innocent-looking code can be compiled into machine code that will cause an abort at runtime. UBSan is supposed to catch this in code that would otherwise look sane and behave "as expected" when you run it, but then fail if compiled with another set of options or a different optimization level.
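To make the alignment requirement concrete, here is a minimal sketch of a check you could run before a cast. The helper name is_aligned_u32 is made up for illustration, and it assumes u32 corresponds to uint32_t and that converting a pointer to uintptr_t preserves the low address bits (implementation-defined, but true on common flat-address platforms).

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical helper, not from the question: returns nonzero when p is
 * suitably aligned for a uint32_t. The pointer-to-integer conversion is
 * implementation-defined; this assumes a flat address space where the
 * low bits of the integer reflect the address alignment. */
static int is_aligned_u32(const void *p) {
    return (uintptr_t)p % alignof(uint32_t) == 0;
}
```

With a buffer that starts on a 4-byte boundary and a 4-byte alignment requirement for uint32_t, offset 0 passes this check and offset 6 does not, which matches the second 32-bit read in the question being the one UBSan flags.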
The compiler can generate the correct code for memcpy, and often will, but only because it knows that the unaligned access works and performs well enough on the target platform.
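As a rough sketch of how the memcpy approach can be packaged, the helpers below copy the bytes into a properly aligned local object, so no misaligned pointer is ever formed. The names load_u32, load_u16 and read_group are invented for this example, and I'm assuming u8, u16 and u32 correspond to uint8_t, uint16_t and uint32_t.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers (names are illustrative, not from the question).
 * Each copies the bytes into an aligned local, so no misaligned pointer
 * is created or dereferenced. Values are read in the host's native byte
 * order, the same as the original casts would have done. */
static inline uint32_t load_u32(const uint8_t *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static inline uint16_t load_u16(const uint8_t *p) {
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Reads one six-byte group: a 32-bit field followed by a 16-bit field. */
static void read_group(const uint8_t *p, uint32_t *a, uint16_t *b) {
    *a = load_u32(p);
    *b = load_u16(p + 4);
}
```

Calling read_group(ba + 6, &a, &b) then reads the second group without any cast. Optimising compilers commonly turn a fixed-size memcpy like this into a single load on targets where that is safe, which is consistent with the identical assembly observed in the question.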
Lastly:
I don't want to use memcpy here because I can't rely on compilers doing a good enough job optimising it.
What you're saying here is: "I want my code to work reliably only whenever compiled by garbage or two-decades-old compilers that generate slow code. Definitely not when compiled with the ones that could optimize it to run fast."
The original type of your object would best be u32, an array of u32 ... Otherwise, you're handling this sensibly by using memcpy. This isn't likely to be a significant bottleneck on modern systems; I wouldn't worry about that.
On some platforms, an integer can't exist at every possible address. Consider the maximum address for your system; we could postulate 0xFFFFFFFFFFFFFFFF. A four-byte integer couldn't possibly exist here, right?
Sometimes optimisations are made in hardware to align the bus (the series of wires leading from the CPU to various peripherals, memory and what-not) based on this, and one of those is to assume that addresses for various types only occur in multiples of their sizes, for example. A misaligned access on such a platform is likely to cause a trap (segfault).
Hence, UBSan is correctly warning you about this non-portable and difficult to debug problem.
Not only does this issue cause some systems to fail entirely, but even on a system that permits misaligned access, you'll find that it may require a second fetch across the bus to retrieve the remaining portion of the integer anyway.
There are a few other problems in this code.
printf("a = %d\n", a);
If you wish to print an int, you should use %d. However, your argument is a u32. Don't mismatch your arguments like this; that's also undefined behaviour. I don't know for certain how u32 is defined for you, but I guess the closest standard-compliant feature is probably uint32_t (from <stdint.h>). You should use "%"PRIu32 as your format string in any place you want to print a uint32_t. The PRIu32 symbol (from <inttypes.h>) provides an implementation-defined sequence of characters that will be recognised by the implementation's printf function.
Note that this problem is repeated elsewhere, where you're using the u16 type instead:
printf("b = %d\n", b);
"%"PRIu16 will probably suffice there.
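Putting the two format macros together, a small sketch might look like the following. The function name format_fields is made up for the example, and it assumes u32 and u16 really are uint32_t and uint16_t; it writes into a buffer with snprintf so the result is easy to inspect, but the same format string works with printf.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats both fields using the <inttypes.h> macros.
 * Assumes the question's u32 and u16 are uint32_t and uint16_t. */
static int format_fields(char *out, size_t n, uint32_t a, uint16_t b) {
    /* Adjacent string literals are concatenated, so the macros slot
     * directly into the format string. */
    return snprintf(out, n, "a = %" PRIu32 ", b = %" PRIu16, a, b);
}
```

This matters most for values above INT_MAX: a u32 such as 4000000000 is printed as written with PRIu32, whereas passing it to %d is a type mismatch.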