I am very puzzled by the performance of the following code: the for loop runs much slower after replacing the literal "2" with a variable "z=2":
julia> @elapsed for a=1:2:24996
for b=1:2:24996
end
end
2.0e-7
julia> z=2
2
julia> @elapsed for a=1:z:24996
for b=1:z:24996
end
end
14.455516599
Any ideas about the cause and how to prevent such a delay? Thanks!
This is because z is defined in the global scope and is not constant, which means its value could change at any time. Even more importantly, its type could change as well. This prevents the Julia compiler from doing a lot of optimizations.
TLDR: always try to avoid using non-constant variables in global scope for performance-critical operations!
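As a rough illustration of the point above, the sketch below contrasts a function that reads a non-constant global with one that receives the step as an argument. All names (`step_global`, `count_with_global`, `count_with_arg`) are hypothetical and not part of the original post; the code only shows the two patterns and makes no claim about specific timings.

```julia
# Hypothetical names throughout; this is a sketch, not the original poster's code.

step_global = 2   # non-constant global: both its value and its type may change later

# Reads the non-constant global, so the compiler cannot assume the type of the step.
function count_with_global(n)
    c = 0
    for a in 1:step_global:n
        c += 1
    end
    return c
end

# Receives the step as an argument, so Julia can compile a method specialized
# for the concrete type of `step` it is called with.
function count_with_arg(n, step)
    c = 0
    for a in 1:step:n
        c += 1
    end
    return c
end

count_with_global(10)            # iterates over 1, 3, 5, 7, 9
count_with_arg(10, step_global)  # same result; the global is read once, at the call site

# Nothing prevents the global from being rebound to a value of another type,
# which is exactly why the compiler cannot optimize code that reads it directly.
step_global = 2.0
count_with_global(10)
```

Passing the global as an argument (a "function barrier") is one common way to keep global state while still letting the hot loop run on a known type.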
If, however, you define your value as a constant (example with w below) or your variable is defined in local scope (example with y below), then the loops get compiled to code as efficient as if a literal was used:
julia> @elapsed for a=1:2:24996
for b=1:2:24996
end
end
1.16e-7
julia> z=2
2
julia> @elapsed for a=1:z:24996
for b=1:z:24996
end
end
10.726003104
julia> const w = 2
2
julia> @elapsed for a=1:w:24996
for b=1:w:24996
end
end
1.58e-7
julia> @elapsed let y=2
for a=1:y:24996
for b=1:y:24996
end
end
end
1.05e-7
Please also note that your benchmark suffers from a few issues.
First, your benchmark does not compute anything. This is a problem, because as soon as the compiler is free to do any optimization, it optimizes everything away.
The runtime you're measuring here is more related to compilation than it is to the size of your problem. See for example what happens below when the problem size increases:
julia> @elapsed for a=1:2:2*24996
for b=1:2:2*24996
end
end
1.21e-7
julia> @elapsed for a=1:2:4*24996
for b=1:2:4*24996
end
end
1.31e-7
Second, such micro-benchmarks are best performed on code that is wrapped in functions, using the macros provided by BenchmarkTools (e.g. BenchmarkTools.@belapsed should be preferred to @elapsed), so that measurements are more accurate and compilation time is excluded.
Fixing all this, we see in a more convincing way that the code with a local variable inside a function is as fast as the code with a constant literal:
julia> function foo(n)
           s = 0.
           for a=1:2:n
               for b=1:2:n
                   s += a+b
               end
           end
           s
       end
foo (generic function with 1 method)
julia> using BenchmarkTools
julia> @btime foo(24996)
224.787 ms (0 allocations: 0 bytes)
3.904375299984e12
julia> function foo(n, z)
           s = 0.
           for a=1:z:n
               for b=1:z:n
                   s += a+b
               end
           end
           s
       end
foo (generic function with 2 methods)
julia> @btime foo(24996, 2)
224.762 ms (0 allocations: 0 bytes)
3.904375299984e12
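If BenchmarkTools is not at hand, the advice about excluding compilation time can be roughly approximated with Base alone by calling the function once before timing it. The sketch below assumes a hypothetical function `sum_pairs` (a variant of the loops above, not the original `foo`) and a smaller problem size; it is a cruder measurement than `@btime`, which runs many samples.

```julia
# Hypothetical helper, modeled on the loops above; not part of the original answer.
function sum_pairs(n, z)
    s = 0.0
    for a in 1:z:n, b in 1:z:n
        s += a + b   # real work, so the loops cannot simply be optimized away
    end
    return s
end

# The first call of a method with new argument types triggers compilation,
# so its timing includes compile time as well as run time.
first_call = @elapsed sum_pairs(2000, 2)

# Later calls with the same argument types reuse the compiled code.
second_call = @elapsed sum_pairs(2000, 2)

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

A single `@elapsed` sample is still noisy, so this is only a sanity check; for numbers worth reporting, the BenchmarkTools macros remain the better choice.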