Valid use cases for reinterpret_cast for unaligned memory access vs memcpy?

In the internals of snappy, there is a conditionally compiled section that selects dereferencing a reinterpret_cast'ed pointer as the best implementation for reads and writes of potentially unaligned 16-, 32-, and 64-bit integers on architectures that are known to support such operations (like x86). The fallback for other architectures is a memcpy-based implementation.
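To make the structure being asked about concrete, here is a minimal sketch of such a conditionally compiled pair for the 32-bit case. It is not snappy's actual code: the macro `SKETCH_ALLOW_UNALIGNED_CAST` and the function names `sketch_load32` and `sketch_store32` are hypothetical, invented for illustration only.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical opt-in macro for this sketch; it is NOT snappy's real macro.
#if defined(SKETCH_ALLOW_UNALIGNED_CAST)

// Cast-based path: compact, but formally undefined behavior when p is not
// suitably aligned for std::uint32_t (it also runs afoul of the aliasing rules).
inline std::uint32_t sketch_load32(const void* p) {
  return *reinterpret_cast<const std::uint32_t*>(p);
}
inline void sketch_store32(void* p, std::uint32_t v) {
  *reinterpret_cast<std::uint32_t*>(p) = v;
}

#else

// memcpy-based fallback: well defined for any alignment of p.
inline std::uint32_t sketch_load32(const void* p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);  // size is a compile-time constant
  return v;
}
inline void sketch_store32(void* p, std::uint32_t v) {
  std::memcpy(p, &v, sizeof v);
}

#endif
```

With the macro left undefined, only the memcpy-based pair is compiled, so a call such as `sketch_load32(buf + 1)` on a byte buffer stays within defined behavior.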

My understanding is that the reinterpret_cast implementation exhibits undefined behavior, and clang's undefined behavior sanitizer does flag it.

What puzzles me, though, is this: why not just use the memcpy-based implementation everywhere? I would expect all but the most broken of compilers to use intrinsics to implement these memcpy calls, since the size is known at compile time. In fact, I would expect identical codegen from both implementations on any modern toolchain.
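For reference, a memcpy-based helper of the kind discussed here might look like the sketch below. The names `sketch_load_unaligned` and `sketch_store_unaligned` are hypothetical. The point is only that `sizeof(T)` is a compile-time constant at every call site; whether a given compiler actually lowers the call to a single load or store remains a quality-of-implementation matter.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: reads a T from an address with no alignment requirement.
template <typename T>
inline T sketch_load_unaligned(const void* p) {
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy is only valid for trivially copyable types");
  T v;
  std::memcpy(&v, p, sizeof(T));  // fixed size, known at compile time
  return v;
}

// Hypothetical helper: writes a T to an address with no alignment requirement.
template <typename T>
inline void sketch_store_unaligned(void* p, T v) {
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy is only valid for trivially copyable types");
  std::memcpy(p, &v, sizeof(T));
}
```

One way to check the codegen expectation on a particular toolchain is to compile a call such as `sketch_load_unaligned<std::uint64_t>(ptr)` with optimizations enabled and inspect the generated assembly.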

However, I also recognize that snappy was written by folks who know what they are about. So this leaves me wondering whether there is still some advantage to using the reinterpret_cast mechanism that outweighs its being undefined behavior. Not wanting performance to depend on compiler quality of implementation? Something else I haven't considered?

asked Oct 20 '22 14:10 by acm


1 Answer

Without knowing the programmer(s) who wrote that code in the first place, I doubt you can get a truly authoritative answer.

Here's my best guess: the authors didn't want to rely on a possible memcpy optimization (which is in no way guaranteed by the language standard, even if many compilers implement it). On the flip side, writing a reinterpret_cast is very, very likely to produce exactly the unaligned access instruction that the authors were expecting, on practically any compiler.

While smart, modern compilers will optimize the memcpy, older ones may not. Consistent performance can be quite critical to this library, so the authors seem to have sacrificed some correctness (since the reinterpret_cast is potentially undefined behavior) in favour of more consistent results across a wider set of compilers.

answered Oct 23 '22 03:10 by nneonneo