Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Emulating signed integers using unsigned integers in C

In Jens Gustedt's book Modern C, on page 59, he explains how signed integers can be emulated using unsigned integers. His example code shows how one can implement a comparison of two unsigned integers reinterpreted as signed integers:

bool is_negative(unsigned a) { 
   unsigned const int_max = UINT_MAX /2;
   return a > int_max;
}

bool is_signed_less(unsigned a, unsigned b) {
   if (is_negative(b) && !is_negative(a)) return false;
   else return a < b; 
} 

Do I misunderstand something here or does he miss the second special case where is_negative(a) = true and is_negative(b) = false?

For example if we want to have a = -1 and b = 1, then, using two's complement, we would represent them as

unsigned int a = UINT_MAX; 
unsigned int b = 1;    

(e.g. for a 4 bit integer we would have a = 1111 and b = 0001). Now we have is_negative(a) returns true, and is_negative(b) returns false. When calling is_signed_less(a, b) we end up in the else clause and a < b (now interpreted as unsigned integers) will return false. However, it is clearly true that -1 < 1, so the function returns the wrong result.

Is this a typo in the code of the book or is there something that I do not understand?

like image 684
JezuzStardust Avatar asked Sep 13 '25 16:09

JezuzStardust


2 Answers

This is what happens when people try to be "smart" instead of following "keep it simple, stupid" best practices. Good engineering involves writing the code as simple as you possibly can, for example:

bool is_signed_less_correct (unsigned a, unsigned b) {
  bool is_neg_a = is_negative(a);
  bool is_neg_b = is_negative(b);

  if(is_neg_a != is_neg_b) // one is negative
  {
    return is_neg_a; // if one is negative and it is a, return true otherwise false
  } 

  // both are negative or both are positive
  return a < b;
}

Even this code a bit too "smart" still, since it implicitly uses the fact that -1 == 0xFFFF... is the largest 2's complement signed number and therefore a < b holds true no matter if these are negative or not, as long as they are both of the same signedness.

Then of course you would always write a little unit test to sanity check it: https://godbolt.org/z/h4nKsffqr

Output:

-2 < -1 ? true  (is_signed_less_gustedt)
-1 < -1 ? false (is_signed_less_gustedt)
-1 <  0 ? false (is_signed_less_gustedt)
 0 < -1 ? false (is_signed_less_gustedt)
 0 <  1 ? true  (is_signed_less_gustedt)
 1 <  0 ? false (is_signed_less_gustedt)
 1 <  1 ? false (is_signed_less_gustedt)

-2 < -1 ? true  (is_signed_less_correct)
-1 < -1 ? false (is_signed_less_correct)
-1 <  0 ? true  (is_signed_less_correct)
 0 < -1 ? false (is_signed_less_correct)
 0 <  1 ? true  (is_signed_less_correct)
 1 <  0 ? false (is_signed_less_correct)
 1 <  1 ? false (is_signed_less_correct)
like image 125
Lundin Avatar answered Sep 16 '25 06:09

Lundin


As already suggested by Eric Postpischil in the comments above, this is surely simpler and more efficient than the book version (or any of the other answers posted so far, AFAICT):

bool is_signed_less(unsigned a, unsigned b) {
   unsigned const int_min = UINT_MAX / 2 + 1; /* 0x80000000 on 32-bit platforms */
   return (a - int_min) < (b - int_min); 
}

This code simply subtracts an offset from both of the inputs, so that the number representing the lowest possible signed integer value (int_min) is mapped to the lowest possible unsigned integer (0), and then does an unsigned comparison.

In particular, the subtraction maps the most negative signed integer (int_min) to 0, the second most negative signed integer (int_min + 1) to 1, the third most negative (int_min + 2) to 2, and so on. Meanwhile, non-negative signed integers (whose bit patterns correspond to unsigned integers less than int_min) will cause the subtraction to wrap around, yielding values that are higher than those corresponding to any negative input.


Note that there are several possible (and equivalent) variations of this code. For example, a curious numeric quirk of how fixed-width binary arithmetic works means that int_min is the unique unsigned integer for which x + int_min, x - int_min and x ^ int_min are all equal (for any x) — all of these operations just flip the most significant bit of x. Thus, in the code above, the - operators can be replaced by either + or ^ without changing the behavior of the code in any way!

(There could theoretically be a small performance difference, if some of these operations happened to run faster or pipeline better on your CPU, and if your compiler wasn't smart enough to optimize them all into the same assembly code. If you're worried that might be the case, benchmark them all and find out. Most likely they'll all be equally fast, though. Also, if you really want to obfuscate the code and confuse readers, try mixing and matching the operators into something like return (a ^ int_min) < (int_min + b). :P)

like image 25
Ilmari Karonen Avatar answered Sep 16 '25 07:09

Ilmari Karonen