The max_rem function computes the maximum remainder that (a+1)^n + (a-1)^n leaves when divided by a² for n = 1, 2, 3, ... The main function calls max_rem on every a from 3 to 999. Complete code:
#include <inttypes.h>
#include <stdio.h>

int max_rem(int a) {
    int max_r = 0;
    int m = a * a; // <-------- offending line
    int r1 = a + 1, r2 = a - 1;
    for (int n = 1; n <= a * a; n++) {
        r1 = (r1 * (a + 1)) % m;
        r2 = (r2 * (a - 1)) % m;
        int r = (r1 + r2) % m;
        if (max_r < r)
            max_r = r;
    }
    return max_r;
}

int main() {
    int64_t sum = 0;
    for (int a = 3; a < 1000; a++)
        sum += max_rem(a);
    printf("%" PRId64 "\n", sum);
}
If I change line 6 from:
int m = a * a;
to
int64_t m = a * a;
the whole computation becomes about 150% slower. I tried both with gcc 5.3 and clang 3.6.
With int:
$ gcc -std=c99 -O3 -Wall -o 120 120.c
$ time(./120)
real    0m3.823s
user    0m3.816s
sys     0m0.000s
With int64_t:
$ time(./120)
real    0m9.861s
user    0m9.836s
sys     0m0.000s
And yes, I'm on a 64-bit system. Why does this happen?

I've always assumed that using int64_t is safer and more portable and "the modern way to write C"® and wouldn't harm performance on 64-bit systems for numeric code. Is this assumption erroneous?

EDIT: just to be clear: the slowdown persists even if you change every variable to int64_t. So this is not a problem with mixing int and int64_t.
I've always assumed that using int64_t is safer and more portable and "the modern way to write C"® and wouldn't harm performance on 64-bit systems for numeric code. Is this assumption erroneous?
It seems so to me. You can find the instruction timings in Intel's Software Optimization Reference Manual (Appendix C, Table C-17, "General Purpose Instructions", page 645):

IDIV r64    Throughput: 85-100 cycles per instruction
IDIV r32    Throughput: 20-26 cycles per instruction
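To see where that difference bites in this program, here is a minimal sketch of mine (not from the original post, and the helper names are made up): compile these two functions with gcc -O3 -S on x86_64 and compare the generated assembly. The int remainder is normally lowered to the 32-bit idivl, while the int64_t remainder needs the much slower 64-bit idivq.

#include <stdint.h>

/* Hypothetical helpers, only for inspecting how `%` is code-generated
   for the two operand widths; build with `gcc -O3 -S rem.c`. */
int rem32(int x, int m) {
    return x % m;          /* usually compiled to a 32-bit idivl */
}

int64_t rem64(int64_t x, int64_t m) {
    return x % m;          /* usually compiled to a 64-bit idivq */
}

It is the operand width of the divide instruction, not the fact that the machine or operating system is 64-bit, that determines the cost.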
TL;DR: You see different performance with the change of types because you are measuring different computations -- one with all 32-bit data, the other with partially or all 64-bit data.
I've always assumed that using int64_t is safer and more portable and "the modern way to write C"®
int64_t is the safest and most portable (among conforming C99 and C11 compilers) way to refer to a 64-bit signed integer type with no padding bits and a two's complement representation, if the implementation in fact provides such a type. Whether using this type actually makes your code more portable depends on whether the code depends on any of those specific characteristics of integer representation, and on whether you are concerned with portability to environments that do not provide such a type.
and wouldn't harm performance on 64-bit systems for numeric code. Is this assumption erroneous?
int64_t is specified to be a typedef. On any given system, using int64_t is semantically identical to directly using the type that underlies the typedef on that system. You will see no performance difference between those alternatives.
However, your line of reasoning and question seem to rest on an assumption: either that on the system where you perform your tests, the basic type underlying int64_t is int, or that 64-bit arithmetic will perform identically to 32-bit arithmetic on that system. Neither of those assumptions is justified. It is by no means guaranteed that C implementations for 64-bit systems will make int a 64-bit type, and in particular, neither GCC nor Clang for x86_64 does so. Moreover, C has nothing whatever to say about the relative performance of arithmetic on different types, and as others have pointed out, native x86_64 integer division instructions are in fact slower for 64-bit operands than for 32-bit operands. Other platforms might exhibit other differences.
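As a quick sanity check, here is a small sketch of mine (not part of the answer above; it assumes a C11 compiler for _Generic) that prints the relevant widths and shows which basic type int64_t resolves to. On x86_64 Linux with GCC or Clang this typically reports int as 4 bytes and int64_t as a typedef for long:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    printf("int:     %zu bytes\n", sizeof(int));
    printf("long:    %zu bytes\n", sizeof(long));
    printf("int64_t: %zu bytes, underlying type: %s\n",
           sizeof(int64_t),
           _Generic((int64_t)0,        /* resolves to the underlying basic type */
                    long:      "long",
                    long long: "long long",
                    default:   "other"));
    return 0;
}

On other ABIs (for example 64-bit Windows, where long is 32 bits) int64_t commonly maps to long long instead, which is exactly the kind of variation the typedef exists to hide.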