
No compliant way to convert signed/unsigned of same size

I fear I may be missing something trivial, but it appears there is no actual safe way to convert to/from a signed type if you wish to retain the original unsigned value.

On reinterpret_cast, 5.2.10 does not list an integer-to-integer conversion, so it is not defined (and static_cast defines no additional conversion). On integral conversions, 4.7/3 basically says that converting an unsigned value too large for the signed type is implementation-defined (and thus not portable).

This seems limiting since we know, for example, that a uint64_t should, on any hardware, be safely convertible to an int64_t and back without change in value. Plus, the rules on standard-layout types actually guarantee safe conversion if we were to memcpy between the two types instead of assigning.

Am I correct? Is there a legitimate reason why one cannot reinterpret_cast between integral types of sufficient size?


Clarification: The signed version of the unsigned value is definitely not guaranteed to have any particular value; it is only the round trip that I am considering (unsigned => signed => unsigned).


UPDATE: Looking closely at the answers and cross-checking the standard, I believe the memcpy is not actually guaranteed to work, as nowhere does it state that the two types are layout-compatible, and neither of them is a char type. Further update: digging into the C standard, this memcpy should work, as the size of the target is large enough and it copies the bytes.


ANSWER: There appears to be no technical reason why reinterpret_cast was not allowed to perform this conversion. For these fixed-size integer types a memcpy is guaranteed to work, and indeed any intermediate type can be used so long as it can represent all bit patterns (floats can be dangerous, as there may be trap patterns). In general you can't memcpy between arbitrary standard-layout types; they must be layout-compatible or a char type. Here the ints are special since they have additional guarantees.

edA-qa mort-ora-y asked Feb 27 '12 15:02


3 Answers

We know that you can't cast an arbitrary bit sequence to floating-point, because it might be a trap representation.

Is there any rule that says there can't be trap representations in the signed integral types? (Unsigned types can't have them: because of the way the range is defined, all representations are needed for valid values.)

Signed representations can also include equivalence classes (such as +0 == -0) and may coerce values in such a class to a canonical representation, thus breaking the roundtrip.

Here are the relevant rules from the Standard (section 4.7, [conv.integral]):

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]

If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
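To make the first rule concrete, here is a minimal sketch of the signed-to-unsigned direction, which is fully defined as arithmetic modulo 2^64 for a 64-bit type. The helper names to_unsigned and modular_conversion_holds are made up for this illustration and do not come from the Standard.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Signed -> unsigned is always well defined: the result is the source
// value reduced modulo 2^64. (to_unsigned is a hypothetical helper name.)
inline std::uint64_t to_unsigned(std::int64_t s) {
    return static_cast<std::uint64_t>(s);
}

// Checks a few values against the "congruent modulo 2^n" wording.
inline bool modular_conversion_holds() {
    const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
    return to_unsigned(-1) == max        // -1 is congruent to 2^64 - 1
        && to_unsigned(-2) == max - 1    // -2 is congruent to 2^64 - 2
        && to_unsigned(42) == 42u;       // representable values are unchanged
}
```

The opposite direction (a large unsigned value into the signed type) is the one the second rule leaves implementation-defined, so the sketch deliberately does not check it.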

If you mean using reinterpret_cast on a pointer or reference, rather than the value, you have to deal with the strict-aliasing rule. And what you find is that this case is expressly allowed.
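A minimal sketch of that reference form, assuming the fixed-width types (int64_t is required to be two's complement with no padding bits). The function names view_as_signed and reference_round_trip are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Reads an unsigned object through a reference to the corresponding signed
// type. The aliasing rule permits access through the signed or unsigned
// type corresponding to the object's own type.
inline std::int64_t view_as_signed(std::uint64_t& u) {
    return reinterpret_cast<std::int64_t&>(u);
}

// Round trip: view as signed, then convert back with the ordinary
// (modulo 2^64) integral conversion.
inline bool reference_round_trip(std::uint64_t original) {
    std::uint64_t copy = original;
    std::int64_t s = view_as_signed(copy);
    return static_cast<std::uint64_t>(s) == original;
}
```

Note that this reinterprets the stored bytes rather than converting the value, which is why it sidesteps the implementation-defined conversion quoted above.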

Ben Voigt answered Oct 15 '22 01:10


As you point out, memcpy is safe:

uint64_t a = 1ull<<63;
int64_t b;
memcpy(&b,&a,sizeof a);

The value of b is still implementation-defined since C++ does not require a two's complement representation, but converting it back will give you the original value.

As Bo Persson points out, int64_t will be two's complement. Therefore the memcpy should result in a signed value for which the simple integral conversion back to the unsigned type is well defined to be the original unsigned value.

uint64_t c = b;
assert( a == c );

Also, you can implement your own 'signed_cast' to make conversions easy (I don't take advantage of the two's complement thing since these aren't limited to the intN_t types):

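#include <cstring>      // std::memcpy
#include <type_traits>  // std::enable_if, std::make_signed, std::make_unsigned

// unsigned -> signed: selected when T is a signed integral type.
template<typename T>
typename std::enable_if<std::is_integral<T>::value && std::is_signed<T>::value,T>::type
signed_cast(typename std::make_unsigned<T>::type v) {
    T s;
    std::memcpy(&s,&v,sizeof v);  // copy the object representation unchanged
    return s;
}

// signed -> unsigned: selected when T is an unsigned integral type.
template<typename T>
typename std::enable_if<std::is_integral<T>::value && std::is_unsigned<T>::value,T>::type
signed_cast(typename std::make_signed<T>::type v) {
    T s;
    std::memcpy(&s,&v,sizeof v);  // copy the object representation unchanged
    return s;
}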
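
A usage sketch, since the target type cannot be deduced and has to be spelled out at the call site. The two overloads are repeated so the snippet compiles on its own; round_trips is a hypothetical helper added only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// The two overloads from above, repeated so this snippet is self-contained.
template<typename T>
typename std::enable_if<std::is_integral<T>::value && std::is_signed<T>::value, T>::type
signed_cast(typename std::make_unsigned<T>::type v) {
    T s;
    std::memcpy(&s, &v, sizeof v);
    return s;
}

template<typename T>
typename std::enable_if<std::is_integral<T>::value && std::is_unsigned<T>::value, T>::type
signed_cast(typename std::make_signed<T>::type v) {
    T u;
    std::memcpy(&u, &v, sizeof v);
    return u;
}

// Hypothetical helper: unsigned -> signed -> unsigned via the overloads above.
inline bool round_trips(std::uint64_t a) {
    std::int64_t b = signed_cast<std::int64_t>(a);    // picks the first overload
    std::uint64_t c = signed_cast<std::uint64_t>(b);  // picks the second overload
    return a == c;
}
```

Because both directions copy the same bytes, the round trip holds regardless of what value the intermediate signed object happens to represent.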

bames53 answered Oct 15 '22 01:10


Presumably it's not allowed because on machines with sign-magnitude representations it would violate the principle of least surprise: signed 0 would map to unsigned 0, while signed -0 would map to some other (probably very large) number.

Given that the memcpy solution exists, I assume the standards body decided not to support such an unintuitive mapping, probably because unsigned->signed->unsigned isn't as useful a sequence as pointer->integer->pointer.

Mark B answered Oct 15 '22 02:10