
Why is the (signed) cast necessary only sometimes to represent the minimum integer (32-bit system)?

Tags: c

I recently found this code:

    for (i = 0; i < k; i ++) {
        b[i] = 0x80000000;  // min integer
        s[i] = 0;
    }

The above is part of this program:

    int _max(int a, int b) {
        return a > b ? a : b;
    }

    int all_profits(int* prices, int pricesSize) {
        int i, d, p;

        p = 0;
        for (i = 1; i < pricesSize; i ++) {
            d = prices[i] - prices[i - 1];
            p = d > 0 ? p + d : p;  // get it as long as it is a profit!
        }

        return p;
    }

    int maxProfit(int k, int* prices, int pricesSize) {
        int *b, *s, *buff, i, j, p;

        if (pricesSize < 2) return 0;

        if (k >= pricesSize / 2) return all_profits(prices, pricesSize);

        buff = malloc((2 * k + 1) * sizeof(int));
        //assert(buff);
        b = &buff[0];
        s = &buff[k];

        for (i = 0; i < k; i ++) {
            b[i] = 0x80000000;  // min integer
            s[i] = 0;
        }
        s[k] = 0;

        for (i = 0; i < pricesSize; i ++) {
            for (j = 0; j < k; j ++) {
                // profit on buy is current buy or last sale minus today's price
                b[j]     = _max(b[j],     s[j] - prices[i]);
                // profit on sale is current sale or last buy plus today's price
                s[j + 1] = _max(s[j + 1], b[j] + prices[i]);
            }
        }

        p = s[k];

        free(buff);

        return p;
    }

Here, 0x80000000 really is treated as the minimum integer.

But then the same author does this in another problem to prevent integer overflow:

    int reverse(int x) {
        int d, k = 0;
        // 2147483647
        //-2147483648
        while (x) {
            d = x % 10;
            if ((x > 0 && k > (0x7fffffff - d) / 10) ||
                (x < 0 && k < ((signed)0x80000000 - d) / 10)) {
                return 0;   // overflow
            }
            k = k * 10 + d;
            x = x / 10;
        }
        return k; //(int)k == k ? (int)k : 0;
    }

(The above program reverses an integer.)

Here, the minimum is (signed)0x80000000. From my understanding, this hexadecimal value is basically -2^31, and the (signed) represents the minus sign.

The same author wrote both, so there must be a reason for it. Can anyone explain?

Asked by herophant on Jan 26 '23

2 Answers

0x80000000 is of course hexadecimal for the number 2,147,483,648, and this is the value it has in source code.

When int is 32 bits, it cannot represent 2,147,483,648, so 0x80000000 cannot be an int value. The C standard says that, instead of int, the type of 0x80000000 is unsigned int. (The rule is different for constants written in hexadecimal than in decimal. For 2147483648, the type would be long int rather than unsigned int: a decimal constant gets the first signed integer type that can represent it, starting with int, whereas a hexadecimal constant gets the first integer type that can represent it, signed or unsigned.)
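
For illustration, here is a minimal sketch (mine, not from the question) that prints the type each constant gets, assuming a platform where int is 32 bits and long is 64 bits; the TYPE_NAME macro is just a local helper:

    #include <stdio.h>

    /* Hypothetical helper: maps a constant to the name of its type (C11 _Generic). */
    #define TYPE_NAME(x) _Generic((x),      \
        int:           "int",               \
        unsigned int:  "unsigned int",      \
        long:          "long",              \
        unsigned long: "unsigned long",     \
        default:       "other")

    int main(void) {
        printf("0x7fffffff -> %s\n", TYPE_NAME(0x7fffffff)); /* int */
        printf("0x80000000 -> %s\n", TYPE_NAME(0x80000000)); /* unsigned int */
        printf("2147483647 -> %s\n", TYPE_NAME(2147483647)); /* int */
        printf("2147483648 -> %s\n", TYPE_NAME(2147483648)); /* long, if long is 64 bits */
        return 0;
    }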

So, in b[i] = 0x80000000;, we are assigning an unsigned int to an int. Assignment converts the value on the right to the type on the left. However, the value on the right cannot be represented in an int. In this case, the C standard says the result is implementation-defined. The binary value for 0x80000000 is of course 10000000000000000000000000000000. In the two’s complement system for representing signed 32-bit integers, the bit pattern for the smallest representable value, −2^31, is also 10000000000000000000000000000000. Thus, the author of this code is relying on the conversion to produce the int value that has the same bit pattern as the unsigned value 0x80000000.
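
A small sketch of that conversion (my own example, not the author's code); on a typical two's-complement implementation such as gcc or clang on x86 it yields INT_MIN, but the C standard does not guarantee it:

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        int b = 0x80000000;  /* unsigned int value converted to int: implementation-defined */
        printf("%d\n", b);            /* commonly prints -2147483648 */
        printf("%d\n", b == INT_MIN); /* commonly prints 1 */
        return 0;
    }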

As this is not guaranteed by the C standard, this code is not portable. The author would have done better to use INT_MIN (from <limits.h>) instead of 0x80000000, or simply (-2147483647 - 1).
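
A portable version of the initialization loop might look like this sketch (the function name and parameters are illustrative only, mirroring the arrays in the question):

    #include <limits.h>

    /* Initialize the buy/sell arrays without relying on implementation-defined conversion. */
    void init(int *b, int *s, int k) {
        int i;
        for (i = 0; i < k; i ++) {
            b[i] = INT_MIN;  /* well-defined minimum int, from <limits.h> */
            s[i] = 0;
        }
        s[k] = 0;
    }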

In ((signed)0x80000000 - d) / 10, consider what would happen if we instead wrote (0x80000000 - d) / 10. Since 0x80000000 is an unsigned int, the subtraction would be evaluated by converting d to unsigned int, and the division by 10 would also be performed in unsigned arithmetic. For example, if d were −1, it would be converted to 0xFFFFFFFF (because conversion to unsigned int is defined by the C standard to wrap), and 0x80000000 - 0xFFFFFFFF would produce 0x80000001, which is 2,147,483,649, and division by 10 would produce 214,748,364. However, since the author cast to signed, meaning signed int or int, (signed)0x80000000 evaluates to an int value of −2,147,483,648, and the subtraction is evaluated with int arithmetic. Then we have -2147483648 - -1, which produces −2,147,483,647, and division by 10 produces −214,748,364.
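
The difference can be seen with this small sketch (mine, assuming 32-bit int), which evaluates both expressions with d = -1:

    #include <stdio.h>

    int main(void) {
        int d = -1;

        unsigned int u = (0x80000000 - d) / 10;          /* unsigned arithmetic */
        int          s = ((signed)0x80000000 - d) / 10;  /* int arithmetic (implementation-defined cast) */

        printf("unsigned: %u\n", u);  /* 214748364 */
        printf("signed:   %d\n", s);  /* -214748364 */
        return 0;
    }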

So, in the assignment, implicit conversion produces the result the author desires (relying on implementation-defined behavior). In the arithmetic expression, there is no implicit conversion, so the author had to insert one explicitly. This is bad design for several reasons:

  • It needlessly relies on implementation-defined behavior.
  • It uses subtleties of C semantics that can easily go wrong when somebody modifies the expressions or writes new ones that attempt to use 0x80000000 as if it were the minimum int value.
  • It does not document what it is doing or why.

Answered by Eric Postpischil on Feb 16 '23


If we assume a compiler with a 32-bit int, the constant 0x80000000 is out of range for that type, so it gets the type unsigned int. If we cast it to signed (which is shorthand for signed int, which is longhand for just int), we invoke an implementation-defined conversion that the author of the code is relying on to produce the most negative int value.
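
As a quick sketch of that (on a typical two's-complement implementation; the result of the conversion is implementation-defined, not guaranteed by the standard):

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        printf("%d\n", (signed)0x80000000);            /* commonly -2147483648 */
        printf("%d\n", (signed)0x80000000 == INT_MIN); /* commonly 1 */
        return 0;
    }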

Answered by Kaz on Feb 16 '23