 

c = a + b and implicit conversion

With my compiler, c is 54464 (truncated to 16 bits) and d is 10176. But with gcc, c is 120000 and d is 600000.

What is the correct behavior? Is the behavior undefined, or is my compiler wrong?

unsigned short a = 60000;
unsigned short b = 60000;
unsigned long c = a + b;
unsigned long d = a * 10;

Is there a compiler option to warn about these cases?

-Wconversion warns on:

void foo(unsigned long a);
foo(a + b);

but doesn't warn on:

unsigned long c = a + b;
asked Jul 27 '15 12:07 by VTiTux


2 Answers

First, you should know that the C standard does not prescribe a specific precision (number of representable values) for the standard integer types; it only requires a minimum precision for each type. This results in the following typical bit sizes (the standard also allows more complex representations):

  • char: 8 bits
  • short: 16 bits
  • int: 16 (!) bits
  • long: 32 bits
  • long long (since C99): 64 bits

Note: The actual limits (which imply a certain precision) of an implementation are given in limits.h.

Second, the type in which an operation is performed is determined by the types of its operands, not by the type of the left side of an assignment (because an assignment is itself just an expression). The types listed above are sorted by conversion rank. Operands with a rank smaller than int are first converted to int (the integer promotions). For other operands, the one with the smaller rank is converted to the type of the other operand. These are the usual arithmetic conversions.

Your implementation seems to use a 16 bit unsigned int with the same size as unsigned short, so a and b are converted to unsigned int and the operation is performed in 16 bits. For unsigned types, the operation is performed modulo 65536 (2 to the power of 16); this is called wrap-around (it is not guaranteed for signed types!). The result is then converted to unsigned long and assigned to the variables.

For gcc, I assume this was compiled for a PC or another 32 bit CPU. There, (unsigned) int typically has 32 bits, while (unsigned) long has at least 32 bits (as required by the standard). So there is no wrap-around for these operations.

Note: On the PC, the operands are converted to int, not unsigned int, because int can already represent all values of unsigned short; unsigned int is not required. This can result in unexpected (actually: undefined) behaviour if the result of the operation overflows a signed int!

If you need types of defined size, see stdint.h (since C99) for uint16_t, uint32_t. These are typedefs to types with the appropriate size for your implementation.

You can also cast one of the operands (not the whole expression!) to the type of the result:

unsigned long c = (unsigned long)a + b;

or, using types of known size:

#include <stdint.h>
...
uint16_t a = 60000, b = 60000;
uint32_t c = (uint32_t)a + b;

Note that due to the conversion rules, casting one operand is sufficient.

Update (thanks to @chux):

The cast shown above works without problems. However, if a had a larger conversion rank than the type in the cast, the cast would truncate its value to the smaller type. While this can easily be avoided, as all types are known at compile-time (static typing), an alternative is to multiply by 1 of the wanted type:

unsigned long c = ((unsigned long)1U * a) + b;

This way, the larger of the rank of the type given in the cast and the rank of a (or b) is used. The multiplication will be eliminated by any reasonable compiler.

Another approach, which avoids even needing to know the name of the target type, uses the typeof() gcc extension:

unsigned long c;

... many lines of code

c = ((typeof(c))1U * a) + b;
answered Oct 27 '22 11:10 by too honest for this site


a + b will be computed as an unsigned int (the fact that it is assigned to an unsigned long is not relevant). The C standard mandates that this sum wraps around modulo "one plus the largest representable unsigned value". On your system, it looks like unsigned int is 16 bits, so the result is computed modulo 65536.

On the other system, it looks like int and unsigned int are larger and therefore capable of holding the larger numbers. What happens now is quite subtle (acknowledgement to @PascalCuoq): because all values of unsigned short are representable in int, a + b will be computed as an int. (Only if short and int have the same width, or if in some other way some values of unsigned short cannot be represented as int, will the sum be computed as unsigned int.)

Although the C standard does not specify a fixed size for either unsigned short or unsigned int, your program's behaviour is well-defined. Note that this is not true for a signed type, though.

As a final remark, you can use the sized types uint16_t, uint32_t etc. which, if supported by your compiler, are guaranteed to have the specified size.

answered Oct 27 '22 13:10 by Bathsheba