Why is the (signed) cast necessary only sometimes to represent the minimum integer (32-bit system)?

4

I recently found this code:

    for (i = 0; i < k; i ++) {
        b[i] = 0x80000000;  // min integer
        s[i] = 0;
    }

the above is part of this program:

int _max(int a, int b) {
    return a > b ? a : b;
}
int all_profits(int* prices, int pricesSize) {
    int i, d, p;

    p = 0;
    for (i = 1; i < pricesSize; i ++) {
        d = prices[i] - prices[i - 1];
        p = d > 0 ? p + d : p;  // get it as long as it is a profit!
    }

    return p;
}
int maxProfit(int k, int* prices, int pricesSize) {
    int *b, *s, *buff, i, j, p;

    if (pricesSize < 2) return 0;

    if (k >= pricesSize / 2) return all_profits(prices, pricesSize);

    buff = malloc((2 * k + 1) * sizeof(int));
    //assert(buff);
    b = &buff[0];
    s = &buff[k];

    for (i = 0; i < k; i ++) {
        b[i] = 0x80000000;  // min integer
        s[i] = 0;
    }
    s[k] = 0;

    for (i = 0; i < pricesSize; i ++) {
        for (j = 0; j < k; j ++) {
            // profit on buy is current buy or last sale minus today's price
            b[j]     = _max(b[j],     s[j] - prices[i]);
            // profit on sale is current sale or last buy plus today's price
            s[j + 1] = _max(s[j + 1], b[j] + prices[i]);
        }
    }

    p = s[k];

    free(buff);

    return p;
}

where 0x80000000 really is treated as the minimum integer.

But then the same author does this in another problem to prevent integer overflow:

int reverse(int x) {
    int d, k = 0;
    // 2147483647
    //-2147483648
    while (x) {
        d = x % 10;
        if ((x > 0 && k > (0x7fffffff - d) / 10) ||
            (x < 0 && k < ((signed)0x80000000 - d) / 10)) {
            return 0;   // overflow
        }
        k = k * 10 + d;
        x = x / 10;
    }
    return k; //(int)k == k ? (int)k : 0;
}

(The above function reverses the digits of an integer.)

Here, the minimum is (signed)0x80000000. This hexadecimal value is basically -2^31 from my understanding, and the (signed) represents the minus sign.

The same author wrote both, so there must be a reason for it. Can anyone explain?

c
asked on Stack Overflow Aug 1, 2019 by herophant • edited Aug 1, 2019 by herophant

3 Answers

6

0x80000000 is of course hexadecimal for the number 2,147,483,648, and this is the value it has in source code.

When int is 32 bits, it cannot represent 2,147,483,648, so 0x80000000 cannot be an int value. The C standard says that, instead of int, the type of 0x80000000 is unsigned int. (This is different for constants written in hexadecimal than in decimal. For 2147483648, the type would be long int instead of unsigned int: a decimal constant gets the first signed integer type it fits in, starting with int, while a hexadecimal constant gets the first integer type it fits in, signed or unsigned.)
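
A small C11 test program makes these types visible. This is only a sketch, assuming a platform with 32-bit int; the TYPE_NAME macro is just an illustration, not part of the original code:

    #include <stdio.h>

    /* Report the type the compiler gives an expression (C11 _Generic). */
    #define TYPE_NAME(x) _Generic((x),            \
        int: "int",                               \
        unsigned int: "unsigned int",             \
        long: "long",                             \
        unsigned long: "unsigned long",           \
        long long: "long long",                   \
        unsigned long long: "unsigned long long", \
        default: "other")

    int main(void) {
        printf("0x7fffffff is %s\n", TYPE_NAME(0x7fffffff));  /* int */
        printf("0x80000000 is %s\n", TYPE_NAME(0x80000000));  /* unsigned int */
        printf("2147483648 is %s\n", TYPE_NAME(2147483648));  /* long, or long long, depending on the platform */
        return 0;
    }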

So, in b[i] = 0x80000000;, we are assigning an unsigned int to an int. Assignment converts the value on the right to the type on the left. However, the value on the right cannot be represented in an int. In this case, the C standard says the result is implementation-defined. The binary value for 0x80000000 is of course 10000000000000000000000000000000. In the two’s complement system for representing signed 32-bit integers, the bit pattern for the smallest representable value, −2^31, is also 10000000000000000000000000000000. Thus, the author of this code is relying on the conversion to produce the int value that has the same bit pattern as the unsigned value 0x80000000.
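
To see the conversion in action, here is a minimal sketch, assuming 32-bit int and the usual implementation-defined behavior described above:

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        /* 0x80000000 is unsigned int; the assignment converts it to int.
           The result is implementation-defined, but the common two's
           complement implementations keep the bit pattern, giving INT_MIN. */
        int b = 0x80000000;

        printf("b       = %d\n", b);
        printf("INT_MIN = %d\n", INT_MIN);
        return 0;
    }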

As this is not guaranteed by the C standard, this code is not portable. The author would have done better to use INT_MIN (from <limits.h>) or simply (-2147483647 - 1) instead of 0x80000000.
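
A sketch of those portable spellings; nothing here relies on implementation-defined conversions:

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        int a = INT_MIN;            /* the portable name for the minimum int */
        int b = -2147483647 - 1;    /* both operands fit in int, so no conversion
                                       is involved; writing -2147483648 would not
                                       give an int constant, because 2147483648
                                       itself does not fit in a 32-bit int */
        printf("%d %d\n", a, b);
        return 0;
    }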

In ((signed)0x80000000 - d) / 10, consider what would happen if we instead wrote (0x80000000 - d) / 10. Since 0x80000000 is an unsigned int, d would be converted to unsigned int, the subtraction would be performed with unsigned arithmetic, and so would the division by 10. For example, if d were −1, it would be converted to 0xFFFFFFFF (because conversion to unsigned int is defined by the C standard to wrap), and 0x80000000 - 0xFFFFFFFF would wrap around to 0x80000001, which is 2,147,483,649, and division by 10 would produce 214,748,364. However, since the author cast to signed, meaning signed int or int, the (signed)0x80000000 evaluates to an int value of −2,147,483,648, and the subtraction is evaluated with int arithmetic. Then we have -2147483648 - -1, which produces −2,147,483,647, and division by 10 produces −214,748,364.
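
The difference is easy to demonstrate. This is a sketch, again assuming 32-bit int and the typical result of the implementation-defined cast:

    #include <stdio.h>

    int main(void) {
        int d = -1;

        /* No cast: d is converted to 0xFFFFFFFF, the subtraction wraps to
           0x80000001, and the division is done in unsigned arithmetic. */
        unsigned int u = (0x80000000 - d) / 10;

        /* Cast: (signed)0x80000000 is the int value -2147483648 (typically),
           so the whole expression is evaluated with int arithmetic. */
        int s = ((signed)0x80000000 - d) / 10;

        printf("without cast: %u\n", u);  /* 214748364  */
        printf("with cast:    %d\n", s);  /* -214748364 */
        return 0;
    }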

So, in the assignment, implicit conversion produces the result the author desires (relying on implementation-defined behavior). In the arithmetic expression, there is no implicit conversion, so the author had to insert one explicitly. This is bad design for several reasons:

  • It needlessly relies on implementation-defined behavior.
  • It uses subtleties of C semantics that can easily go wrong when somebody modifies the expressions or writes new ones that attempt to use 0x80000000 as if it were the minimum int value.
  • It does not document what it is doing or why.
answered on Stack Overflow Aug 1, 2019 by Eric Postpischil • edited Aug 1, 2019 by Eric Postpischil
2

If we assume a compiler with a 32-bit int, the constant 0x80000000 is out of range for that type. Therefore, it gets the type unsigned int. If we cast it to signed (which is shorthand for signed int, which is longhand for just int), we then invoke an implementation-defined conversion that the author of the code is relying upon to produce the most negative int value.

answered on Stack Overflow Aug 1, 2019 by Kaz
0

The constant 0x80000000 has type unsigned int because it does not fit into type int. This means conversions may take place when it is used in expressions with values of type int.

First let's take this case:

b[i] = 0x80000000; 

Since b[i] has type int, the unsigned int value is converted in an implementation-defined way to int. Assuming a common implementation such as gcc or MSVC on x64, this results in the value -2147483648 being assigned to b[i]. No cast is required here because the assignment itself performs the conversion.

Now let's look at the second case:

    if ((x > 0 && k > (0x7fffffff - d) / 10) ||
        (x < 0 && k < ((signed)0x80000000 - d) / 10)) {

To understand why the cast is necessary, let's see what happens without it:

    if ((x > 0 && k > (0x7fffffff - d) / 10) ||
        (x < 0 && k < (0x80000000 - d) / 10)) {

First we have 0x80000000 - d. Because the first operand is unsigned int and the second is signed int, the latter is converted to unsigned int and the resulting expression has type unsigned int. Since the value of d is in the range -9 to 0, subtracting it from 0x80000000 == 2147483648 gives you a range of 2,147,483,648 to 2,147,483,657, and dividing by 10 gives you a range of 214,748,364 to 214,748,365. k is 0 on the first iteration of the loop (and is converted to unsigned int for the comparison), so if x is negative then you have something like 0 < 214,748,364, which is true, causing the function to think there's an overflow and return 0.

By casting 0x80000000 to signed, the expression (signed)0x80000000 - d evaluates to a value from -2,147,483,648 to -2,147,483,639, and dividing by 10 gives a range of -214,748,364 to -214,748,363, which is what we want to compare k with.
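
Here is a small sketch of that first iteration. The value d = -2 is just a hypothetical last digit of a negative x, and the code assumes 32-bit int and the usual result of the cast:

    #include <stdio.h>

    int main(void) {
        int k = 0, d = -2;

        /* Without the cast the limit is computed (and compared) in unsigned
           arithmetic, so it is a large positive value and the overflow test
           fires even though k is 0. */
        printf("no cast:   %d\n", k < (0x80000000 - d) / 10);          /* 1 */

        /* With the cast the limit is about -214748364, so the test is false
           and the function keeps going, as intended. */
        printf("with cast: %d\n", k < ((signed)0x80000000 - d) / 10);  /* 0 */

        return 0;
    }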

answered on Stack Overflow Aug 1, 2019 by dbush

User contributions licensed under CC BY-SA 3.0