I have the following code in Python Jupyter:
n = 10**7
d = {}
%timeit for i in range(n): d[i] = i
%timeit for i in range(n): _ = d[i]
%timeit d[10]
with the following times:
763 ms ± 19.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
692 ms ± 3.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
39.5 ns ± 0.186 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
and this in Julia:
using BenchmarkTools
d = Dict{Int64, Int64}()
n = 10^7
r = 1:n
@btime begin
    for i in r
        d[i] = i
    end
end
@btime begin
    for i in r
        _ = d[i]
    end
end
@btime d[10]
with times:
2.951 s (29999490 allocations: 610.34 MiB)
3.327 s (39998979 allocations: 762.92 MiB)
20.163 ns (0 allocations: 0 bytes)
What I am not quite able to understand is why Julia seems to be so much slower at dictionary value assignment and retrieval in a loop (first two tests), yet so much faster at single-key retrieval (last test). It is about 4 times slower in a loop, but twice as fast outside one. I'm new to Julia, so I am not sure whether I am doing something suboptimal or whether this is expected.
Since you are benchmarking in the top-level (global) scope, you have to interpolate variables into @btime with $, so the way to benchmark your code is:
julia> using BenchmarkTools
julia> d = Dict{Int64, Int64}()
Dict{Int64, Int64}()
julia> n = 10^7
10000000
julia> r = 1:n
1:10000000
julia> @btime begin
           for i in $r
               $d[i] = i
           end
       end
842.891 ms (0 allocations: 0 bytes)
julia> @btime begin
           for i in $r
               _ = $d[i]
           end
       end
618.808 ms (0 allocations: 0 bytes)
julia> @btime $d[10]
6.300 ns (0 allocations: 0 bytes)
10
Timing for Python 3 on the same machine in Jupyter Notebook is:
n = int(10.0**7)
d = {}
%timeit for i in range(n): d[i] = i
913 ms ± 87.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit for i in range(n): _ = d[i]
816 ms ± 92.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit d[10]
50.2 ns ± 2.97 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
However, for the first operation I assume you actually wanted to benchmark this:
julia> function f(n)
           d = Dict{Int64, Int64}()
           for i in 1:n
               d[i] = i
           end
       end
f (generic function with 1 method)
julia> @btime f($n)
1.069 s (72 allocations: 541.17 MiB)
against this:
def f(n):
    d = {}
    for i in range(n):
        d[i] = i

%timeit f(n)
1.18 s ± 65.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
It should also be noted that using a specific value of n can be misleading, as Julia and Python are not guaranteed to resize their collections at the same moments or to the same new sizes (to store a dictionary you normally allocate more memory than needed in order to avoid hash collisions, so the specific value of n being tested can actually matter).
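This resizing behavior is easy to observe in CPython (an implementation detail, not a language guarantee): sys.getsizeof reports the dict's current table size in bytes, and because the table grows geometrically it jumps only a handful of times even across many insertions, so the cost near a particular n depends on where the last resize happened to fall:

```python
import sys

d = {}
observed_sizes = []  # byte size of the dict after each growth step
for i in range(10**5):
    d[i] = i
    size = sys.getsizeof(d)
    if not observed_sizes or size != observed_sizes[-1]:
        observed_sizes.append(size)

# only a small number of distinct sizes, despite 100_000 insertions
print(len(observed_sizes), "distinct table sizes for", len(d), "keys")
```

Julia's counterpart is sizehint!(d, n), which pre-allocates the table up front and takes resizing out of the loop entirely.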
Note that if I declare the global variables as const, everything is fast, because the compiler can then optimize the code (it knows that the types of the values bound to global variables cannot change); in that case using $ is not needed:
julia> using BenchmarkTools
julia> const d = Dict{Int64, Int64}()
Dict{Int64, Int64}()
julia> const n = 10^7
10000000
julia> const r = 1:n
1:10000000
julia> @btime begin
           for i in r
               d[i] = i
           end
       end
895.788 ms (0 allocations: 0 bytes)
julia> @btime begin
           for i in r
               _ = d[i]
           end
       end
582.214 ms (0 allocations: 0 bytes)
julia> @btime d[10]
6.800 ns (0 allocations: 0 bytes)
10
If you are curious about the benefits of having native support for threading, here is a simple benchmark (this functionality is part of the language):
julia> Threads.nthreads()
4
julia> @btime begin
           Threads.@threads for i in $r
               _ = $d[i]
           end
       end
215.461 ms (23 allocations: 2.17 KiB)
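For contrast, here is a sketch (names of my own choosing) of the same threaded read in CPython: a pure-Python loop holds the GIL while executing bytecode, so running it on several threads is effectively serialized and typically yields no speedup, unlike the Julia version above:

```python
import time
from concurrent.futures import ThreadPoolExecutor

n = 10**6
d = {i: i for i in range(n)}

def read_chunk(lo, hi):
    # a pure-Python loop: it holds the GIL while executing bytecode,
    # so concurrent threads running it are effectively serialized
    total = 0
    for i in range(lo, hi):
        total += d[i]
    return total

# single-threaded baseline
t0 = time.perf_counter()
expected = read_chunk(0, n)
single = time.perf_counter() - t0

# four threads, each reading a quarter of the keys
step = n // 4
t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as ex:
    parts = list(ex.map(lambda k: read_chunk(k * step, (k + 1) * step),
                        range(4)))
threaded = time.perf_counter() - t0

assert sum(parts) == expected  # the same work was done either way
print(f"single: {single:.3f}s  threaded: {threaded:.3f}s")
```

On stock CPython the threaded time is usually no better than the single-threaded one, which is exactly the gap Julia's Threads.@threads closes here.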