The max_rem function computes the maximum remainder that (a+1)^n + (a-1)^n leaves when divided by a² for n = 1, 2, 3, ... The main function calls max_rem on every a from 3 to 999. Complete code:
#include <inttypes.h>
#include <stdio.h>

int max_rem(int a) {
    int max_r = 0;
    int m = a * a; // <-------- offending line
    int r1 = a + 1, r2 = a - 1;
    for (int n = 1; n <= a * a; n++) {
        r1 = (r1 * (a + 1)) % m;
        r2 = (r2 * (a - 1)) % m;
        int r = (r1 + r2) % m;
        if (max_r < r)
            max_r = r;
    }
    return max_r;
}

int main() {
    int64_t sum = 0;
    for (int a = 3; a < 1000; a++)
        sum += max_rem(a);
    printf("%" PRId64 "\n", sum);
}
If I change line 6 from:
int m = a * a;
to
int64_t m = a * a;
the whole computation becomes about 150% slower. I tried both with gcc 5.3 and clang 3.6.
With int:
$ gcc -std=c99 -O3 -Wall -o 120 120.c
$ time(./120)
real    0m3.823s
user    0m3.816s
sys     0m0.000s
With int64_t:
$ time(./120)
real    0m9.861s
user    0m9.836s
sys     0m0.000s
And yes, I'm on a 64-bit system. Why does this happen?

I've always assumed that using int64_t is safer and more portable and "the modern way to write C"® and wouldn't harm performance on 64-bit systems for numeric code. Is this assumption erroneous?

EDIT: just to be clear: the slowdown persists even if you change every variable to int64_t. So this is not a problem with mixing int and int64_t.
I've always assumed that using int64_t is safer and more portable and "the modern way to write C"® and wouldn't harm performance on 64-bit systems for numeric code. Is this assumption erroneous?
It seems so to me. You can find the instruction timings in Intel's Software Optimization Reference Manual (Appendix C, Table C-17, "General Purpose Instructions", page 645):

IDIV r64    Throughput: 85-100 cycles per instruction
IDIV r32    Throughput: 20-26 cycles per instruction
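To see where that difference bites in this program, here is a minimal sketch of mine (not from the original post, and the helper names are made up): compile these two functions with gcc -O3 -S on x86_64 and compare the generated assembly. The int remainder is normally lowered to the 32-bit idivl, while the int64_t remainder needs the much slower 64-bit idivq.

#include <stdint.h>

/* Hypothetical helpers, only for inspecting how `%` is code-generated
   for the two operand widths; build with `gcc -O3 -S rem.c`. */
int rem32(int x, int m) {
    return x % m;          /* usually compiled to a 32-bit idivl */
}

int64_t rem64(int64_t x, int64_t m) {
    return x % m;          /* usually compiled to a 64-bit idivq */
}

It is the operand width of the divide instruction, not the fact that the machine or operating system is 64-bit, that determines the cost.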
TL;DR: You see different performance with the change of types because you are measuring different computations -- one with all 32-bit data, the other with partially or all 64-bit data.
I've always assumed that using int64_t is safer and more portable and "the modern way to write C"®
int64_t is the safest and most portable (among conforming C99 and C11 compilers) way to refer to a 64-bit signed integer type with no padding bits and a two's complement representation, if the implementation in fact provides such a type. Whether using this type actually makes your code more portable depends on whether the code depends on any of those specific characteristics of integer representation, and on whether you are concerned with portability to environments that do not provide such a type.
and wouldn't harm performance on 64-bit systems for numeric code. Is this assumption erroneous?
int64_t is specified to be a typedef. On any given system, using int64_t is semantically identical to directly using the type that underlies the typedef on that system. You will see no performance difference between those alternatives.
However, your line of reasoning and question seem to rest on an assumption: either that on the system where you perform your tests, the basic type underlying int64_t is int, or that 64-bit arithmetic will perform identically to 32-bit arithmetic on that system. Neither of those assumptions is justified. It is by no means guaranteed that C implementations for 64-bit systems will make int a 64-bit type, and in particular, neither GCC nor Clang for x86_64 does so. Moreover, C has nothing whatever to say about the relative performance of arithmetic on different types, and as others have pointed out, native x86_64 integer division instructions are in fact slower for 64-bit operands than for 32-bit operands. Other platforms might exhibit other differences.
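As a quick sanity check, here is a small sketch of mine (not part of the answer above; it assumes a C11 compiler for _Generic) that prints the relevant widths and shows which basic type int64_t resolves to. On x86_64 Linux with GCC or Clang this typically reports int as 4 bytes and int64_t as a typedef for long:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    printf("int:     %zu bytes\n", sizeof(int));
    printf("long:    %zu bytes\n", sizeof(long));
    printf("int64_t: %zu bytes, underlying type: %s\n",
           sizeof(int64_t),
           _Generic((int64_t)0,        /* resolves to the underlying basic type */
                    long:      "long",
                    long long: "long long",
                    default:   "other"));
    return 0;
}

On other ABIs (for example 64-bit Windows, where long is 32 bits) int64_t commonly maps to long long instead, which is exactly the kind of variation the typedef exists to hide.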