In my project we have a piece of code like this:
    // raw data consists of 4 ints
    unsigned char data[16];
    int i1, i2, i3, i4;
    i1 = *((int*)data);
    i2 = *((int*)(data + 4));
    i3 = *((int*)(data + 8));
    i4 = *((int*)(data + 12));
I told my tech lead that this code may not be portable, since it casts an unsigned char* to an int*, which usually has a stricter alignment requirement. But the tech lead says that's all right: most compilers keep the same pointer value after the cast, so I can just write the code like this.
To be frank, I'm not really convinced. After some research, I found that some people advise against pointer casts like the one above, e.g., here and here.
So here are my questions:
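1. Is it REALLY safe to dereference the pointer after casting in a real project?
2. Is there any difference between C-style casting and reinterpret_cast?
3. Is there any difference between C and C++?
A pointer is an arrow that points to an address in memory, with a label indicating the type of the value. The address indicates where to look and the type indicates what to take. Casting the pointer changes the label on the arrow but not where the arrow points.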
An aligned pointer is one whose address is a multiple of the alignment requirement of the type it points to (alignof(T) in C++, often but not always the size of the type), and an unaligned pointer is one whose address is not. On most architectures, reading or writing through an unaligned pointer incurs some sort of penalty, ranging from slower access to a hardware fault.
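The two ideas above (a cast relabels the arrow, and alignment is a property of the address) can be sketched in a few lines. The helper names `is_aligned_for` and `cast_keeps_address` are made up for this illustration, and the address comparison is only done on a buffer that is already aligned for int, where the result of the cast is well specified.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if p is suitably aligned for an object of type T.
// Relies on the usual linear mapping from pointers to std::uintptr_t.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical demo: casting changes the pointer's type, not its address.
// The buffer is explicitly aligned for int, and the int* is never dereferenced.
bool cast_keeps_address() {
    alignas(int) unsigned char buf[sizeof(int)] = {};
    unsigned char* bytes = buf;
    int* as_int = reinterpret_cast<int*>(bytes);
    assert(is_aligned_for<int>(bytes));
    return static_cast<void*>(as_int) == static_cast<void*>(bytes);
}
```

On a typical platform where alignof(int) is 4, `is_aligned_for<int>(buf + 1)` would return false for the same buffer, which is exactly the situation the question's cast can run into.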
1. Is it REALLY safe to dereference the pointer after casting in a real project?
If the pointer happens not to be aligned properly, it really can cause problems. I've personally seen and fixed bus errors in real production code caused by casting a char* to a more strictly aligned type. Even if you don't get an obvious error, you can have less obvious issues like slower performance. Strictly following the standard to avoid UB is a good idea even if you don't immediately see any problems. (And one rule the code is breaking is the strict aliasing rule, § 3.10/10*)
A better alternative is to use std::memcpy(), or std::memmove() if the buffers overlap (or, better yet, std::bit_cast<>() where C++20 is available):
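    #include <cstring>  // std::memcpy

    unsigned char data[16];
    int i1, i2, i3, i4;
    // Copy the bytes out; no alignment or aliasing requirements on data.
    std::memcpy(&i1, data     , sizeof(int));
    std::memcpy(&i2, data +  4, sizeof(int));
    std::memcpy(&i3, data +  8, sizeof(int));
    std::memcpy(&i4, data + 12, sizeof(int));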
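The memcpy approach generalizes to a small helper. The following is a minimal sketch, not a library API: `load_unaligned` and `unpack4` are hypothetical names, and the offsets use sizeof(int) rather than the hard-coded 4 from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: read a T from raw bytes at any alignment.
template <typename T>
T load_unaligned(const unsigned char* src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    T value;
    std::memcpy(&value, src, sizeof(T));  // byte-wise copy, no aliasing violation
    return value;
}

// Hypothetical usage: unpack four consecutive ints from a byte buffer.
void unpack4(const unsigned char* data, int out[4]) {
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = load_unaligned<int>(data + i * sizeof(int));
    }
}
```

Mainstream optimizing compilers usually turn a fixed-size memcpy like this into a plain load on platforms that allow unaligned access, so the safe version rarely costs anything.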
That said, some compilers work harder than others to align char arrays more strictly than necessary, because programmers so often get this wrong. The following program shows how the results differ:
    #include <cstdint>
    #include <typeinfo>
    #include <iostream>

    template<typename T>
    void check_aligned(void *p) {
        std::cout << p << " is "
                  << (0 == (reinterpret_cast<std::intptr_t>(p) % alignof(T)) ? "" : "NOT ")
                  << "aligned for the type " << typeid(T).name() << '\n';
    }

    void foo1() {
        char a;
        char b[sizeof (int)];
        check_aligned<int>(b); // unaligned in clang
    }

    struct S {
        char a;
        char b[sizeof(int)];
    };

    void foo2() {
        S s;
        check_aligned<int>(s.b); // unaligned in clang and msvc
    }

    S s;

    void foo3() {
        check_aligned<int>(s.b); // unaligned in clang, msvc, and gcc
    }

    int main() {
        foo1();
        foo2();
        foo3();
    }
http://ideone.com/FFWCjf
2. Is there any difference between C-style casting and reinterpret_cast?
It depends. C-style casts do different things depending on the types involved. A C-style cast between unrelated pointer types does the same thing as a reinterpret_cast; see § 5.4 Explicit type conversion (cast notation) and § 5.2.9-11.
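To illustrate how a C-style cast picks its meaning from the types involved, here is a small sketch. The types A, B, and D and the function names are hypothetical, chosen only for this example. The last function reports a layout detail that holds on typical ABIs but is not promised by the standard.

```cpp
#include <cassert>

struct A { int a; };
struct B { int b; };
struct D : A, B {};  // hypothetical types, for illustration only

// Between related class pointers, a C-style cast acts like static_cast:
// it may adjust the address to reach the B subobject inside D.
bool c_style_matches_static_cast(D* d) {
    return (B*)d == static_cast<B*>(d);
}

// Between unrelated pointer types, a C-style cast acts like reinterpret_cast.
bool c_style_matches_reinterpret_cast(unsigned char* p) {
    return (int*)p == reinterpret_cast<int*>(p);
}

// reinterpret_cast never adjusts the address, so on typical ABIs (where B
// sits after A inside D) it yields a different address than static_cast.
bool reinterpret_skips_adjustment(D* d) {
    return static_cast<void*>(static_cast<B*>(d)) !=
           static_cast<void*>(reinterpret_cast<B*>(d));
}
```

For the unsigned char* to int* cast in the question, both spellings therefore mean the same thing; reinterpret_cast just makes the intent easier to spot and to search for.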
3. Is there any difference between C and C++?
There shouldn't be as long as you're dealing with types that are legal in C.
* Another issue is that C++ does not specify the result of casting from one pointer type to a type with stricter alignment requirements. This is to support platforms where unaligned pointers cannot even be represented. However, typical platforms today can represent unaligned pointers, and compilers specify the result of such a cast to be what you would expect. As such, this issue is secondary to the aliasing violation. See [expr.reinterpret.cast]/7.
It's not alright, really. The alignment may be wrong, and the code may violate strict aliasing. You should unpack it explicitly.
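    i1 = (int)((unsigned)data[0] | (unsigned)data[1] << 8 | (unsigned)data[2] << 16 | (unsigned)data[3] << 24);
etc. The bytes are widened to unsigned before shifting so that data[3] << 24 cannot overflow a signed int when the top bit is set. Written this way, the unpacking involves no undefined behaviour, and as a bonus, it's also endianness-independent, unlike your pointer cast.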
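The shift-based unpacking can be wrapped in a helper. This is a sketch under a couple of assumptions: the names `load_le32` and `load_be32` are invented here, and the shifts are done in std::uint32_t so that a top byte of 0x80 or more cannot overflow a signed int. The caller decides whether and how to convert the result to int.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: assemble a 32-bit value from 4 bytes stored
// least significant byte first (little-endian), on any host.
std::uint32_t load_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: the same bytes interpreted most significant byte first.
std::uint32_t load_be32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[2]) << 8)
         |  static_cast<std::uint32_t>(p[3]);
}
```

Because the byte order is spelled out in the code, the result depends only on the data format, not on the machine the code runs on.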