I was messing around with the worst code I could write, (basicly trying to break things) and i noticed that this piece of code:
for(int i = 0; i < N; ++i)
tan(tan(tan(tan(tan(tan(tan(tan(x++))))))));
end
std::cout << x;
where N is a global variable runs significantly slower then:
int N = 10000;
for(int i = 0; i < N; ++i)
tan(tan(tan(tan(tan(tan(tan(tan(x++))))))));
end
std::cout << x;
What happens with a global variable that makes it run slower?
Using global variables slow down the code.
Using global variables causes very tight coupling of code. Using global variables causes namespace pollution. This may lead to unnecessarily reassigning a global value. Testing in programs using global variables can be a huge pain as it is difficult to decouple them when testing.
Overusing local and global variables, such as using them to avoid long wires across the block diagram or using them instead of data flow, slows performance.
Pointers to globals are even more evil. The main problem with using global variables is that they create implicit couplings among various pieces of the program (various routines might set or modify a variable, while several more routines might read it).
tl;dr: Local version keeps N in a register, global version doesn't. Declare constants with const and it'll be faster no matter how you declare it.
Here's the example code I used:
#include <iostream>
#include <math.h>
void first(){
int x=1;
int N = 10000;
for(int i = 0; i < N; ++i)
tan(tan(tan(tan(tan(tan(tan(tan(x++))))))));
std::cout << x;
}
int N=10000;
void second(){
int x=1;
for(int i = 0; i < N; ++i)
tan(tan(tan(tan(tan(tan(tan(tan(x++))))))));
std::cout << x;
}
int main(){
first();
second();
}
(named test.cpp
).
To look at the assembler code generated I ran g++ -S test.cpp
.
I got a huge file but with some smart searching (I searched for tan), I found what I wanted:
from the first
function:
Ltmp2:
movl $1, -4(%rbp)
movl $10000, -8(%rbp) ; N is here !!!
movl $0, -12(%rbp) ;initial value of i is here
jmp LBB1_2 ;goto the 'for' code logic
LBB1_1: ;the loop is this segment
movl -4(%rbp), %eax
cvtsi2sd %eax, %xmm0
movl -4(%rbp), %eax
addl $1, %eax
movl %eax, -4(%rbp)
callq _tan
callq _tan
callq _tan
callq _tan
callq _tan
callq _tan
callq _tan
movl -12(%rbp), %eax
addl $1, %eax
movl %eax, -12(%rbp)
LBB1_2:
movl -12(%rbp), %eax ;value of n kept in register
movl -8(%rbp), %ecx
cmpl %ecx, %eax ;comparing N and i here
jl LBB1_1 ;if less, then go into loop code
movl -4(%rbp), %eax
second function:
Ltmp13:
movl $1, -4(%rbp) ;i
movl $0, -8(%rbp)
jmp LBB5_2
LBB5_1: ;loop is here
movl -4(%rbp), %eax
cvtsi2sd %eax, %xmm0
movl -4(%rbp), %eax
addl $1, %eax
movl %eax, -4(%rbp)
callq _tan
callq _tan
callq _tan
callq _tan
callq _tan
callq _tan
callq _tan
movl -8(%rbp), %eax
addl $1, %eax
movl %eax, -8(%rbp)
LBB5_2:
movl _N(%rip), %eax ;loading N from globals at every iteration, instead of keeping it in a register
movl -8(%rbp), %ecx
So from the assembler code you can see (or not) that , in the local version, N is kept in a register during the entire calculation, whereas in the global version, N is reread from the global at each iteration.
I imagine the main reason why this happens is for things like threading, the compiler can't be sure that N isn't modified.
if you add a const
to the declaration of N (const int N=10000
), it'll be even faster than the local version though:
movl -8(%rbp), %eax
addl $1, %eax
movl %eax, -8(%rbp)
LBB5_2:
movl -8(%rbp), %eax
cmpl $9999, %eax ;9999 used instead of 10000 for some reason I do not know
jle LBB5_1
N is replaced by a constant.
The global version can't be optimized to put it in a register.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With