Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is this Julia snippet so much slower than the Python equivalent? (with dictionaries)

Tags:

python

julia

I have the following code in Python Jupyter:

n = 10**7
d = {}

%timeit for i in range(n): d[i] = i

%timeit for i in range(n): _ = d[i]

%timeit d[10]

with the following times:

763 ms ± 19.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
692 ms ± 3.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
39.5 ns ± 0.186 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

and this on Julia

using BenchmarkTools
d = Dict{Int64, Int64}()
n = 10^7
r = 1:n

@btime begin
    for i in r
        d[i] = i
    end
end
    
@btime begin
    for i in r
        _ = d[i]
    end
end

@btime d[10]

with times:

  2.951 s (29999490 allocations: 610.34 MiB)
  3.327 s (39998979 allocations: 762.92 MiB)
  20.163 ns (0 allocations: 0 bytes)

What I am not quite able to understand is why the Julia one seems to be so much slower in dictionary value assignation and retrieval in a loop (first two tests), but at the same time is so much faster in single key retrieval (last test). It seems to be 4 times slower when in a loop, but twice as fast if not in a loop. I'm new to Julia, so I am not sure if I am doing something un-optimal or if this is somehow expected.

like image 779
Jano Avatar asked Jun 14 '21 17:06

Jano


People also ask

Are Python dictionaries slow?

Python is slow. I bet you might encounter this counterargument many times about using Python, especially from people who come from C or C++ or Java world. This is true in many cases, for instance, looping over or sorting Python arrays, lists, or dictionaries can be sometimes slow.

Why is Dict slower than{}?

dict is slower because it calls a function that essentially returns {} . Hence, any occurrences of dict() can be safely replaced with {} . Take care, however, that you're not using {} (or even dict() , since it returns {} ) in variables that will be passed around to a lot of functions.

Is Julia faster than Python?

So, is Julia faster than Python? Thanks to its JIT compilation and type declarations from the stats, Julia has a higher speed than Python. Now is the time to learn Python.

Which programming language should I learn first - Julia or Python?

Julia and Python are [pretty similar programming languages – open-source, dynamically typed, support for data science and machine learning, and easy to learn and use. Most beginner programmers will find both languages friendly. However, determining which languages work best for you is a matter of preference and your objectives.

Is Python slower than other languages?

So you can’t say Python is slow and don’t use it. Yes, when it comes to the execution time of code, Python is slower than other languages like Java and C++. Why Is Python Slow at Execution Time?

Is NumPy faster than Julia?

Julia’s performance is about 4X slower than NumPy. But for the largest arrays, Julia catches up with Python again. Julia and NumPy have similar performance for many tasks ideally – the bottleneck is heavy lifting accomplished by the underlying algebra libraries. Since Julia supports loop functions, it a bit faster in some functions.


1 Answers

Since you are benchmarking in a top-level scope you have to interpolate variables in @btime with $ so the way to benchmark your code is:

julia> using BenchmarkTools

julia> d = Dict{Int64, Int64}()
Dict{Int64, Int64}()

julia> n = 10^7
10000000

julia> r = 1:n
1:10000000

julia> @btime begin
           for i in $r
               $d[i] = i
           end
       end
  842.891 ms (0 allocations: 0 bytes)

julia> @btime begin
           for i in $r
               _ = $d[i]
           end
       end
  618.808 ms (0 allocations: 0 bytes)

julia> @btime $d[10]
  6.300 ns (0 allocations: 0 bytes)
10

Timing for Python 3 on the same machine in Jupyter Notebook is:

n = int(10.0**7)
d = {}
%timeit for i in range(n): d[i] = i
913 ms ± 87.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit for i in range(n): _ = d[i]
816 ms ± 92.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit d[10]
50.2 ns ± 2.97 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

However, for the first operation I assume you rather wanted to benchmark this:

julia> function f(n)
           d = Dict{Int64, Int64}()
           for i in 1:n
               d[i] = i
           end
       end
f (generic function with 1 method)

julia> @btime f($n)
  1.069 s (72 allocations: 541.17 MiB)

against this:

def f(n):
    d = {}
    for i in range(n):
        d[i] = i
%timeit f(n)
1.18 s ± 65.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

It should also be noted that using a specific value of n can be misleading as Julia and Python are not guaranteed to resize their collections at the same moments and to the same new sizes (in order to store a dictionary you normally allocate more memory than needed to avoid hash collisions and here actually a specific value of tested n might matter).

EDIT

Note that if I declare global variables as const all is fast, as then the compiler can optimize the code (it knows the types of the values bound to global variables cannot change); therefore then using $ is not needed:

julia> using BenchmarkTools

julia> const d = Dict{Int64, Int64}()
Dict{Int64, Int64}()

julia> const n = 10^7
10000000

julia> const r = 1:n
1:10000000

julia> @btime begin
           for i in r
               d[i] = i
           end
       end
  895.788 ms (0 allocations: 0 bytes)

julia> @btime begin
           for i in $r
               _ = $d[i]
           end
       end
  582.214 ms (0 allocations: 0 bytes)

julia> @btime $d[10]
  6.800 ns (0 allocations: 0 bytes)
10

If you are curious what are the benefits of having a native support for threading here is a simple benchmark (this functionality is a part of the language):

julia> Threads.nthreads()
4

julia> @btime begin
           Threads.@threads for i in $r
               _ = $d[i]
           end
       end
  215.461 ms (23 allocations: 2.17 KiB)
like image 127
Bogumił Kamiński Avatar answered Oct 23 '22 19:10

Bogumił Kamiński