Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Difference of <long>/<long> vs. <int>/<int>

When compiling the following code:

int f(int i1,int i2)
{
    long l1=i1;
    long l2=i2;
    return l1*l2;
}

with clang 10.1 for x86-64 and with -O3, I get

    mov     eax, edi
    imul    eax, esi
    ret

The compiler recognizes, that a full 64 bit operation is not needed.

However, when I replace the multiplication by a division:

int f(int i1,int i2)
{
    long l1=i1;
    long l2=i2;
    return l1/l2;
}

it compiles into

    movsx   rax, edi
    movsx   rsi, esi
    cqo
    idiv    rsi
    ret

So it uses a 64-bit division (same for gcc).

What is the counter-example which prevents to use a 32-bit division here?

like image 592
FPK Avatar asked May 17 '20 18:05

FPK


1 Answers

Consider what happens when i1 == INT_MIN == -2147483648 and i2 == -1.

For comparison, let's also consider

int g(int i1, int i2) {
    return i1/i2;
}

which compiles to a simple 32-bit idiv.

If you call g(INT_MIN, -1), the division will overflow, since the result 2147483648 doesn't fit in int. This results in undefined behavior at the level of C, and in fact the idiv instruction will generate an exception.

If you instead call f(INT_MIN, -1), the division does not overflow, because the result does indeed fit in a long. Now the return statement causes it to be converted to int by the usual integer conversions. Since the value doesn't fit in the signed type int, the result is implementation-defined, and gcc documents what it will do:

The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

So it needs to generate code that ensures that an exception is not generated, and that the value returned is -2147483648 (which is equivalent to 2147483648 mod 2^32). A 32-bit divide won't do it, but a 64-bit divide will.

Interestingly, icc handles this by special-casing i2 == -1 and doing a 64-bit divide only in that case, and a 32-bit divide otherwise. This might be reasonable as it seems that 64-bit IDIV is potentially several times more expensive than 32-bit, glancing at Agner Fog's instruction tables. Though you might wish it would use NEG instead of dividing in that case (and if you're wondering, yes, NEG of INT_MIN is INT_MIN as desired).

(Actually, the special casing of -1 by icc was the hint that helped me realize the counterexample.)

No such special handling is needed for multiplication, because the behavior of imul on overflow is already what is needed for conversion: it truncates to 32 bits with no exception.

like image 72
Nate Eldredge Avatar answered Sep 30 '22 17:09

Nate Eldredge