What is the correct way to convert 2 bytes to a signed 16-bit integer?

In this answer, zwol made this claim:

The correct way to convert two bytes of data from an external source into a 16-bit signed integer is with helper functions like this:

#include <stdint.h>

int16_t be16_to_cpu_signed(const uint8_t data[static 2]) {
    uint32_t val = (((uint32_t)data[0]) << 8) |
                   (((uint32_t)data[1]) << 0);
    return ((int32_t) val) - 0x10000u;
}

int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    uint32_t val = (((uint32_t)data[0]) << 0) |
                   (((uint32_t)data[1]) << 8);
    return ((int32_t) val) - 0x10000u;
}

Which of the above functions is appropriate depends on whether the array contains a little-endian or a big-endian representation. Endianness is not the issue in question here; I am wondering why zwol subtracts 0x10000u from the uint32_t value converted to int32_t.

Why is this the correct way?

How does it avoid the implementation-defined behavior when converting to the return type?

Since you can assume 2's complement representation, how would this simpler cast fail: return (uint16_t)val;

What is wrong with this naive solution:

int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    return (uint16_t)data[0] | ((uint16_t)data[1] << 8);
}
asked Mar 26 '20 by chqrlie


2 Answers

If int is 16-bit then your version relies on implementation-defined behaviour if the value of the expression in the return statement is out of range for int16_t.

However, the first version has a similar problem; for example, if int32_t is a typedef for int and the input bytes are both 0xFF, then the result of the subtraction in the return statement is UINT_MAX, which causes implementation-defined behaviour when converted to int16_t.
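As a rough illustration of that failure mode (my sketch, not part of the answer above), assuming a typical platform where int32_t is a typedef for a 32-bit int:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Both input bytes are 0xFF, so the assembled value is 0xFFFF. */
    uint32_t val = 0xFFFFu;

    /* In ((int32_t)val) - 0x10000u the usual arithmetic conversions turn the
       int32_t operand back into unsigned int, so the subtraction is done in
       unsigned arithmetic and wraps around to UINT_MAX (4294967295). */
    unsigned int diff = (int32_t)val - 0x10000u;
    printf("diff = %u\n", diff);

    /* Converting that out-of-range value to int16_t is implementation-defined
       (or may raise an implementation-defined signal); on common 2's complement
       implementations it happens to yield the intended -1, but the standard
       does not guarantee it. */
    int16_t r = (int16_t)diff;
    printf("r = %d\n", r);
    return 0;
}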

IMHO the answer you link to has several major issues.

answered Sep 21 '22 by M.M


This should be pedantically correct, and it also works on platforms that use sign-and-magnitude or 1's complement representations instead of the usual 2's complement. The input bytes are assumed to be in 2's complement.

int le16_to_cpu_signed(const uint8_t data[static 2]) {
    unsigned value = data[0] | ((unsigned)data[1] << 8);
    if (value & 0x8000)
        return -(int)(~value & 0xFFFFu) - 1;  /* invert only the low 16 bits */
    else
        return value;
}

Because of the branch, it will be more expensive than other options.

What this accomplishes is that it avoids any assumption about how the int representation relates to the unsigned representation on the platform. The cast to int is safe because the conversion preserves the arithmetic value of any number that fits in the target type, and because the inversion is confined to the low 16 bits, the top bit of the 16-bit number is zero and the value fits. The unary minus and the subtraction of 1 then apply the usual rule for 2's complement negation. Depending on the platform, INT16_MIN could still overflow if it doesn't fit in the int type on the target (e.g. a 16-bit 1's complement int), in which case long should be used.
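For what it's worth, here is a small test driver (my addition, not part of jpa's answer; it simply copies the function above so it compiles on its own) checking a positive value, -1, and INT16_MIN:

#include <stdint.h>
#include <stdio.h>

/* Copy of the function above so this sketch is self-contained. */
int le16_to_cpu_signed(const uint8_t data[static 2]) {
    unsigned value = data[0] | ((unsigned)data[1] << 8);
    if (value & 0x8000)
        return -(int)(~value & 0xFFFFu) - 1;
    else
        return value;
}

int main(void) {
    const uint8_t pos[2] = { 0x34, 0x12 };  /* 0x1234 ->   4660 */
    const uint8_t neg[2] = { 0xFF, 0xFF };  /* 0xFFFF ->     -1 */
    const uint8_t min[2] = { 0x00, 0x80 };  /* 0x8000 -> -32768 */

    printf("%d %d %d\n",
           le16_to_cpu_signed(pos),
           le16_to_cpu_signed(neg),
           le16_to_cpu_signed(min));
    return 0;
}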

The difference from the original version in the question comes at return time. The original always subtracted 0x10000 and relied on the implementation-defined conversion of the out-of-range result to int16_t to wrap it into range; this version has an explicit if that avoids producing an out-of-range value in the first place.

Now in practice, almost all platforms in use today use 2's complement representation. In fact, if a platform has a standard-compliant stdint.h that defines int32_t, it must use 2's complement for it. Where this approach sometimes comes in handy is with scripting languages that don't have integer data types at all: you can adapt the operations shown above to floats and they will give the correct result.
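As a sketch of that last point (my addition, only an illustration; the helper name is made up), the same idea expressed using nothing but floating-point arithmetic, the way you might write it in a language without integer types:

#include <stdint.h>
#include <stdio.h>

/* Decode a little-endian signed 16-bit value using only floating-point
   arithmetic, mimicking what a language without integer types would do. */
double le16_to_signed_double(const uint8_t data[static 2]) {
    double value = (double)data[0] + (double)data[1] * 256.0;
    if (value >= 32768.0)       /* top bit set: negative in 2's complement */
        value -= 65536.0;       /* equivalent of the invert-and-negate step */
    return value;
}

int main(void) {
    const uint8_t bytes[2] = { 0x00, 0x80 };   /* 0x8000 -> -32768 */
    printf("%.0f\n", le16_to_signed_double(bytes));
    return 0;
}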

answered Sep 18 '22 by jpa