
Bitwise operation results in unexpected variable size

Context

We are porting C code that was originally compiled using an 8-bit C compiler for the PIC microcontroller. A common idiom that was used in order to prevent unsigned global variables (for example, error counters) from rolling over back to zero is the following:

if(~counter) counter++; 

The bitwise operator here inverts all the bits and the statement is only true if counter is less than the maximum value. Importantly, this works regardless of the variable size.

Problem

We are now targeting a 32-bit ARM processor using GCC. We've noticed that the same code produces different results. So far as we can tell, it looks like the bitwise complement operation returns a value that is a different size than we would expect. To reproduce this, we compile, in GCC:

uint8_t i = 0;
int sz;

sz = sizeof(i);
printf("Size of variable: %d\n", sz); // Size of variable: 1

sz = sizeof(~i);
printf("Size of result: %d\n", sz); // Size of result: 4

In the first line of output, we get what we would expect: i is 1 byte. However, the bitwise complement of i is actually four bytes, which causes a problem because comparisons against it no longer give the expected results. For example, with the following (where i is a properly-initialized uint8_t):

if(~i) i++; 

we will see i "wrap around" from 0xFF back to 0x00. This differs from the behaviour we got with the previous compiler on the 8-bit PIC microcontroller, where the idiom worked as intended.

We are aware that we can resolve this by casting like so:

if((uint8_t)~i) i++; 

or, by

if(i < 0xFF) i++; 

However, both of these workarounds require the size of the variable to be known, which is error-prone for the software developer. These kinds of upper-bound checks occur throughout the codebase. There are multiple sizes of variables (e.g., uint16_t and unsigned char etc.) and changing these in an otherwise working codebase is not something we're looking forward to.

Question

Is our understanding of the problem correct, and are there options for resolving this that do not require revisiting each case where we've used this idiom? Is our assumption correct that an operation like bitwise complement should return a result that is the same size as the operand? It seems like this would break depending on the processor architecture. I feel like I'm taking crazy pills and that C should be a bit more portable than this. Again, our understanding of this could be wrong.

On the surface this might not seem like a huge issue but this previously-working idiom is used in hundreds of locations and we're eager to understand this before proceeding with expensive changes.


Note: There is a seemingly similar but not exact duplicate question here: Bitwise operation on char gives 32 bit result

I didn't see the actual crux of the issue discussed there, namely, the result size of a bitwise complement being different than what's passed into the operator.

Charlie Salts asked Apr 15 '20 15:04


2 Answers

What you are seeing is the result of integer promotions. In most cases where an integer value is used in an expression, if the type of the value is smaller than int the value is promoted to int. This is documented in section 6.3.1.1p2 of the C standard:

The following may be used in an expression wherever an int or unsigned int may be used:

  • An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
  • A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

So if a variable has type uint8_t and the value 255, using any operator other than a cast or assignment on it will first convert it to type int with the value 255 before performing the operation. This is why sizeof(~i) gives you 4 instead of 1.

Section 6.5.3.3 describes that integer promotions apply to the ~ operator:

The result of the ~ operator is the bitwise complement of its (promoted) operand (that is, each bit in the result is set if and only if the corresponding bit in the converted operand is not set). The integer promotions are performed on the operand, and the result has the promoted type. If the promoted type is an unsigned type, the expression ~E is equivalent to the maximum value representable in that type minus E.

So assuming a 32 bit int, if counter has the 8 bit value 0xff it is converted to the 32 bit value 0x000000ff, and applying ~ to it gives you 0xffffff00.
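The effect of the promotion on the original test can be sketched in a few lines. This is only an illustration: the two function names are hypothetical, and the comments assume an int wider than 8 bits (32 bits on the ARM target described in the question).

```c
#include <stdint.h>

/* Hypothetical helper: the test performed by "if (~counter)". */
int complement_is_nonzero(uint8_t counter)
{
    /* counter is promoted to int before ~ is applied, so 0xFF becomes
       0x000000FF (with a 32-bit int) and its complement, 0xFFFFFF00,
       is still nonzero. */
    return ~counter != 0;
}

/* Hypothetical helper: the same test with the result converted back
   to uint8_t, as in the cast workaround from the question. */
int truncated_complement_is_nonzero(uint8_t counter)
{
    /* The conversion keeps only the low 8 bits, so 0xFF yields 0. */
    return (uint8_t)~counter != 0;
}
```

With counter equal to 0xFF, the first function reports a nonzero complement (so the increment would run and wrap to 0), while the second reports zero.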

Probably the simplest way to handle this without having to know the type is to check whether the value is 0 after incrementing, and if so, decrement it:

if (!++counter) counter--; 

The wraparound of unsigned integers works in both directions, so decrementing a value of 0 gives you the largest positive value.
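As a rough sketch of how this could be applied across several variable sizes, the check can be wrapped in a macro. The macro and function names below are hypothetical and not part of the original code, and the macro assumes its argument is a side-effect-free unsigned lvalue, since it is evaluated more than once.

```c
#include <stdint.h>

/* Hypothetical macro: increment, then undo the increment if the
   unsigned value wrapped around to 0. */
#define SATURATING_INC(counter) \
    do { if (!++(counter)) (counter)--; } while (0)

/* Small wrappers that exercise the macro on two different widths. */
uint8_t bump_u8(uint8_t c)
{
    SATURATING_INC(c);
    return c;
}

uint16_t bump_u16(uint16_t c)
{
    SATURATING_INC(c);
    return c;
}
```

One design point to keep in mind: the variable briefly holds 0 between the increment and the decrement, which may matter if a counter is shared with an interrupt handler.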

dbush answered Sep 28 '22 04:09


In sizeof(i) you request the size of the variable i, which is 1.

In sizeof(~i) you request the size of the type of the expression, which is int, so 4 in your case.


Using

if(~i)

to test whether i is not 255 (in your case, with a uint8_t) is not very readable; just write

if (i != 255) 

and you will have portable and readable code.


There are multiple sizes of variables (e.g., uint16_t and unsigned char etc.)

To handle an unsigned variable of any size:

if (i != (((uintmax_t) 2 << (sizeof(i)*CHAR_BIT-1)) - 1)) 

The expression is constant, so computed at compile time.

#include <limits.h> for CHAR_BIT and #include <stdint.h> for uintmax_t
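The expression above could be wrapped in macros so that each call site does not repeat it. This is a minimal sketch: the macro and function names are hypothetical, and it assumes the argument is a side-effect-free unsigned lvalue whose type has no padding bits.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical macro: largest value of the unsigned type of x, derived
   from its size. Shifting 2 by (bits - 1) avoids a shift by the full
   width of uintmax_t when x is itself as wide as uintmax_t. */
#define UNSIGNED_MAX_OF(x) \
    ((((uintmax_t)2) << (sizeof(x) * CHAR_BIT - 1)) - 1)

/* Hypothetical macro: increment x unless it already holds its maximum. */
#define INC_UNLESS_MAX(x) \
    do { if ((x) != UNSIGNED_MAX_OF(x)) (x)++; } while (0)

/* Small wrappers that exercise the macro on three different widths. */
uint8_t inc_u8(uint8_t v)    { INC_UNLESS_MAX(v); return v; }
uint16_t inc_u16(uint16_t v) { INC_UNLESS_MAX(v); return v; }
uint32_t inc_u32(uint32_t v) { INC_UNLESS_MAX(v); return v; }
```

Unlike the increment-then-undo approach, the variable never passes through 0 here, at the cost of a less obvious expression.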

bruno answered Sep 28 '22 04:09