
Portable reinterpretation of uint8_t as int8_t and forcing two's complement

Tags: c, casting, stdint

I am trying to reinterpret a uint8_t as an int8_t (and back again) in a way that is portable. I am receiving data over a serial channel and storing it in a buffer of uint8_t, but once I know what kind of packet it is, I need to interpret some of the bytes as two's complement and others as unsigned.

I know that this will work on many compilers:

int8_t i8;
uint8_t u8 = 0x94;

i8 = (int8_t)u8;

But it is not guaranteed to work when u8>127 because casting a value greater than INT8_MAX to int8_t is undefined (I think).

The best I have been able to come up with is this

int8_t i8;
uint8_t u8;

i8 = (u8 > INT8_MAX) ? (int8_t)(-(256-u8)):(int8_t)u8;

This should always work because the subtraction causes an automatic promotion to int, and it in no way relies on the underlying representation. It implicitly forces a two's complement interpretation of values greater than INT8_MAX.

Is there a better way (or a standard macro) to do this?

Chris asked Feb 07 '19 21:02


2 Answers

If int8_t is defined (by <stdint.h>), it is guaranteed to be two’s complement (by C 2018 7.20.1.1).

The value in uint8_t u8 can be reinterpreted as a two’s complement value by copying it to int8_t i8 with memcpy(&i8, &u8, sizeof i8);. (Good compilers will optimize this to simply use the u8 as a two’s complement value, with no call to memcpy.)
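As a minimal sketch of the memcpy approach, the two helpers below copy the single byte in each direction. The function names are hypothetical and chosen only for illustration; the code assumes the implementation provides int8_t and uint8_t.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the bits of a uint8_t as an int8_t.
 * Relies on int8_t existing, which implies two's complement, no padding. */
static int8_t u8_to_i8_memcpy(uint8_t u8)
{
    int8_t i8;
    memcpy(&i8, &u8, sizeof i8);   /* copies the one-byte object representation */
    return i8;
}

/* Hypothetical helper for the reverse direction ("and back again"). */
static uint8_t i8_to_u8_memcpy(int8_t i8)
{
    uint8_t u8;
    memcpy(&u8, &i8, sizeof u8);
    return u8;
}
```

For the value from the question, u8_to_i8_memcpy(0x94) yields -108, and passing -108 to i8_to_u8_memcpy gives back 0x94.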

Eric Postpischil answered Nov 05 '22 17:11


In eight-bit two's complement, the sign bit can be interpreted as having the place value -2^7, which is of course -128. This is in fact precisely how the C standard characterizes it. Therefore, given an 8-bit value stored in a uint8_t that you want to reinterpret as a two's complement integer, this is an arithmetic way to do so:

uint8_t u8 = /* ... */;
int8_t  i8 = (u8 & 0x7f) - (u8 > 0x7f) * 0x80;

Note that all of the arithmetic is performed by first promoting the operands to (signed) int, so there is neither overflow (because the range of int is large enough for this) nor unsigned arithmetic wrap-around. The arithmetic result is guaranteed to be in the range of int8_t, so there is no risk of overflow in the conversion of the result to that type, either.

You will note similarities between this computation and yours, but this one avoids the ternary operator by using the result of the relational expression u8 > 0x7f (either 0 or 1) directly in the arithmetic, thus avoiding any branching, and it dispenses with needless casts. (Yours doesn't need the casts, either.)
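One way to gain confidence in the branch-free form is to compare it against the ternary version from the question for all 256 inputs. The sketch below does that, treating the sign bit as having place value -128 (0x80); the function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: low seven bits, minus 128 if the sign bit is set.
 * All arithmetic happens in int after the usual integer promotions. */
static int8_t u8_to_i8_arith(uint8_t u8)
{
    return (u8 & 0x7f) - (u8 > 0x7f) * 0x80;
}

/* Counts inputs for which the branch-free form disagrees with the
 * question's ternary form, computed here in plain int arithmetic. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (int v = 0; v <= UINT8_MAX; v++) {
        int expected = (v > INT8_MAX) ? -(256 - v) : v;
        if (u8_to_i8_arith((uint8_t)v) != expected)
            mismatches++;
    }
    return mismatches;
}
```

If the two formulations agree everywhere, count_mismatches() returns 0.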

Note also that if you run into some weird implementation that does not provide int8_t (because its chars are wider than 8 bits, or its signed chars do not use two's complement) then that arithmetic approach still works in the sense of computing the right value, and you can be certain of safely recording that value in an int or short. Thus, the absolutely most portable way to extract the value of the 8-bit two's complement interpretation of a uint8_t would be

uint8_t u8 = /* ... */;
int i8 = (u8 & 0x7f) - (u8 > 0x7f) * 0x80;
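To connect this to the serial-buffer use case in the question, here is a rough sketch that decodes a received packet into plain int and unsigned fields. The three-byte layout, the struct, and the function names are all assumptions made up for illustration; they do not come from the question.

```c
#include <stdint.h>

/* Hypothetical decoded form of a 3-byte packet (layout is an assumption):
 *   byte 0: packet type, unsigned
 *   byte 1: offset, 8-bit two's complement
 *   byte 2: count, unsigned */
struct reading {
    unsigned type;
    int      offset;
    unsigned count;
};

/* Value of the 8-bit two's complement interpretation, returned as int
 * so that no int8_t type is needed. */
static int twos8_value(uint8_t u8)
{
    return (u8 & 0x7f) - (u8 > 0x7f) * 0x80;
}

static struct reading decode_reading(const uint8_t buf[3])
{
    struct reading r;
    r.type   = buf[0];               /* used as-is, unsigned */
    r.offset = twos8_value(buf[1]);  /* signed interpretation */
    r.count  = buf[2];               /* used as-is, unsigned */
    return r;
}
```

With a buffer of { 0x01, 0x94, 0xC8 }, this gives type 1, offset -108, and count 200.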

Alternatively, if you are willing to rely on int8_t to be a character type -- i.e. an alias for char or signed char -- then it is perfectly standard to do the job this way:

uint8_t u8 = /* ... */;
int8_t  i8 = *(int8_t *)&u8;

That one is even more likely to be optimized away by a compiler than is the memcpy() alternative presented in another answer, but unlike the memcpy alternative, this one formally has undefined behavior if int8_t turns out not to be a character type. On the other hand, both this and the memcpy() approach depend on the implementation to provide type int8_t, and even more unlikely than an implementation not providing int8_t is that an implementation provides an int8_t that fails to be a character type.
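If you go the pointer-cast route, one possible safeguard is to make the build fail when int8_t is not a character type. The sketch below assumes a C11 compiler (for _Static_assert and _Generic); the helper name is hypothetical.

```c
#include <stdint.h>

/* Assumption check (C11): refuse to compile unless int8_t is an alias
 * for signed char or char, so the aliasing access below is permitted. */
_Static_assert(_Generic((int8_t)0, signed char: 1, char: 1, default: 0),
               "int8_t is not a character type; use memcpy or arithmetic");

/* Hypothetical helper: read the byte of u8 through an int8_t lvalue. */
static int8_t u8_to_i8_alias(uint8_t u8)
{
    return *(int8_t *)&u8;
}
```

On an implementation where the assertion holds, u8_to_i8_alias(0x94) yields -108, the same as the memcpy and arithmetic versions.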

John Bollinger answered Nov 05 '22 18:11