
static_cast wchar_t* to int* or short* - why is it illegal?

In both Microsoft VC2005 and g++ compilers, the following results in an error:

On win32 VC2005: sizeof(wchar_t) is 2

wchar_t *foo = 0;
static_cast<unsigned short *>(foo);

Results in

error C2440: 'static_cast' : cannot convert from 'wchar_t *' to 'unsigned short *' ...

On Mac OS X or Linux g++: sizeof(wchar_t) is 4

wchar_t *foo = 0;
static_cast<unsigned int *>(foo);

Results in

error: invalid static_cast from type 'wchar_t*' to type 'unsigned int*'

Of course, I can always use reinterpret_cast. However, I would like to understand why it is deemed illegal by the compiler to static_cast to the appropriate integer type. I'm sure there is a good reason...

VoidPointer asked Dec 14 '22 03:12


2 Answers

You cannot static_cast between unrelated pointer types. The size of the type pointed to is irrelevant. Consider the case where the two types have different alignment requirements: allowing a cast like this could generate illegal code on some processors. It is also possible for pointers to different types to have different sizes, so the pointer you obtain could be invalid or point at an entirely different location. reinterpret_cast is one of the escape hatches you have if you know that, for your program, compiler, architecture and OS, you can get away with it.
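One way to see which conversions the language accepts is to ask the compiler directly. The following is a minimal sketch assuming a C++11 compiler; can_static_cast is a hypothetical helper written for this illustration, not a standard trait.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper (not part of the standard library): true when
// static_cast<To>(value of type From) is a well-formed expression.
template <typename From, typename To, typename = void>
struct can_static_cast : std::false_type {};

template <typename From, typename To>
struct can_static_cast<From, To,
    decltype(void(static_cast<To>(std::declval<From>())))>
    : std::true_type {};

// Pointers to unrelated object types are rejected, whatever the sizes are.
static_assert(!can_static_cast<wchar_t*, unsigned short*>::value, "");
static_assert(!can_static_cast<wchar_t*, unsigned int*>::value, "");

// Any object pointer converts to void*, and void* converts to any object pointer.
static_assert(can_static_cast<wchar_t*, void*>::value, "");
static_assert(can_static_cast<void*, unsigned short*>::value, "");

// Converting the value itself is an ordinary integral conversion.
static_assert(can_static_cast<wchar_t, unsigned short>::value, "");
```

The checks run at compile time, so the file compiling at all is the result. Note that the pointer cases fail regardless of whether sizeof(wchar_t) happens to match the target type.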

Logan Capaldo answered Dec 27 '22 08:12


As with char, the signedness of wchar_t is not defined by the standard. Put this together with the possibility of non-2's complement integers, and for a wchar_t value c,

*reinterpret_cast<unsigned short *>(&c)

may not equal:

static_cast<unsigned short>(c)

In the second case, on implementations where wchar_t is a sign+magnitude or 1's complement type, any negative value of c is converted to unsigned using modulo 2^N, which changes the bits. In the former case the bit pattern is picked up and used as-is (if it works at all).
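The difference cannot be observed on a 2's complement machine, so the sketch below only simulates it. It assumes a 16-bit wchar_t for concreteness, and the helper names (sign_magnitude_bits, ones_complement_bits, modular_value) are made up for this illustration.

```cpp
#include <cstdint>

// Bit pattern a 16-bit sign+magnitude machine would store for v
// (simulated; valid for v in [-32767, 32767]).
constexpr std::uint16_t sign_magnitude_bits(int v) {
    return v < 0 ? static_cast<std::uint16_t>(0x8000u | static_cast<unsigned>(-v))
                 : static_cast<std::uint16_t>(v);
}

// Bit pattern a 16-bit 1's complement machine would store for v
// (simulated; same range as above).
constexpr std::uint16_t ones_complement_bits(int v) {
    return v < 0 ? static_cast<std::uint16_t>(~static_cast<unsigned>(-v) & 0xFFFFu)
                 : static_cast<std::uint16_t>(v);
}

// What a value conversion to a 16-bit unsigned type yields on every
// implementation: v modulo 2^16.
constexpr std::uint16_t modular_value(int v) {
    return static_cast<std::uint16_t>(v);
}

// For -1 the stored bits and the converted value disagree.
static_assert(sign_magnitude_bits(-1) == 0x8001, "");
static_assert(ones_complement_bits(-1) == 0xFFFE, "");
static_assert(modular_value(-1) == 0xFFFF, "");

// For non-negative values they agree.
static_assert(sign_magnitude_bits(5) == modular_value(5), "");
```

Reading the object through a reinterpreted pointer would pick up the first or second pattern, while the static_cast of the value must produce the third.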

Now, if the results are different, then there's no realistic way for the implementation to provide a static_cast between the pointer types. What could it do, set a flag on the unsigned short* pointer, saying "by the way, when you load from this, you have to also do a sign conversion", and then check this flag on all unsigned short loads?

That's why it's not, in general, safe to cast between pointers to distinct integer types, and I believe this unsafety is why there is no conversion via static_cast between them.

If the type you're casting to happens to be the so-called "underlying type" of wchar_t, then the resulting code would almost certainly be OK for the implementation, but would not be portable. So the standard doesn't offer a special case allowing you a static_cast just for that type, presumably because it would conceal errors in portable code. If you know reinterpret_cast is safe, then you can just use it. Admittedly, it would be nice to have a straightforward way of asserting at compile time that it is safe, but as far as the standard is concerned you should design around it, since the implementation is not required even to dereference a reinterpret_casted pointer without crashing.
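A partial approximation of that compile-time assertion can be sketched with static_assert, assuming C++11. The checks below cover only size and alignment, which is not everything the standard would need to make the reinterpret_cast safe, so the sketch copies the bytes with memcpy instead of dereferencing a cast pointer. The names wchar_bits_t, bits_of and check_bits_of are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical alias for this sketch: an unsigned type meant to match
// sizeof(wchar_t) on common platforms (2 bytes on win32, 4 with g++ on Linux).
using wchar_bits_t = std::conditional<
    sizeof(wchar_t) == sizeof(unsigned short), unsigned short, unsigned int>::type;

// Fail the build, rather than misbehave at run time, if the assumption is wrong.
static_assert(sizeof(wchar_bits_t) == sizeof(wchar_t),
              "no unsigned type of matching size was selected");
static_assert(alignof(wchar_bits_t) <= alignof(wchar_t),
              "a wchar_t* may not be suitably aligned for wchar_bits_t");

// Copies the object representation of c into an unsigned integer, so no
// pointer of a different type is ever dereferenced.
inline wchar_bits_t bits_of(wchar_t c) {
    wchar_bits_t out;
    std::memcpy(&out, &c, sizeof out);
    return out;
}

// Small usage check: on common platforms a non-negative value has the
// same bits in both types.
inline void check_bits_of() {
    assert(bits_of(static_cast<wchar_t>(65)) == 65u);
}
```

On a platform where neither unsigned short nor unsigned int matches wchar_t, the first static_assert stops compilation, which is the behaviour the paragraph above asks for.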

Steve Jessop answered Dec 27 '22 10:12