I have the following code in Python Jupyter:
n = 10**7
d = {}
%timeit for i in range(n): d[i] = i
%timeit for i in range(n): _ = d[i]
%timeit d[10]
with the following times:
763 ms ± 19.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
692 ms ± 3.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
39.5 ns ± 0.186 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
and this in Julia:
using BenchmarkTools
d = Dict{Int64, Int64}()
n = 10^7
r = 1:n
@btime begin
    for i in r
        d[i] = i
    end
end
@btime begin
    for i in r
        _ = d[i]
    end
end
@btime d[10]
with times:
2.951 s (29999490 allocations: 610.34 MiB)
3.327 s (39998979 allocations: 762.92 MiB)
20.163 ns (0 allocations: 0 bytes)
What I am not quite able to understand is why Julia seems to be so much slower at dictionary value assignment and retrieval in a loop (first two tests), yet so much faster at single-key retrieval (last test). It is about 4 times slower in a loop, but twice as fast outside one. I'm new to Julia, so I am not sure whether I am doing something suboptimal or whether this is expected.
Since you are benchmarking in the top-level (global) scope, you have to interpolate variables into @btime with $, so the way to benchmark your code is:
julia> using BenchmarkTools
julia> d = Dict{Int64, Int64}()
Dict{Int64, Int64}()
julia> n = 10^7
10000000
julia> r = 1:n
1:10000000
julia> @btime begin
           for i in $r
               $d[i] = i
           end
       end
842.891 ms (0 allocations: 0 bytes)
julia> @btime begin
           for i in $r
               _ = $d[i]
           end
       end
618.808 ms (0 allocations: 0 bytes)
julia> @btime $d[10]
6.300 ns (0 allocations: 0 bytes)
10
Timing for Python 3 on the same machine in Jupyter Notebook is:
n = int(10.0**7)
d = {}
%timeit for i in range(n): d[i] = i
913 ms ± 87.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit for i in range(n): _ = d[i]
816 ms ± 92.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit d[10]
50.2 ns ± 2.97 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
However, for the first operation I assume you actually wanted to benchmark this:
julia> function f(n)
           d = Dict{Int64, Int64}()
           for i in 1:n
               d[i] = i
           end
       end
f (generic function with 1 method)
julia> @btime f($n)
1.069 s (72 allocations: 541.17 MiB)
against this:
def f(n):
    d = {}
    for i in range(n):
        d[i] = i

%timeit f(n)
1.18 s ± 65.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
It should also be noted that using a specific value of n can be misleading, as Julia and Python are not guaranteed to resize their collections at the same moments or to the same new sizes (to store a dictionary you normally allocate more memory than needed in order to avoid hash collisions, so the specific value of n being tested can actually matter).
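This resizing behavior is easy to observe in CPython (an implementation detail, not a language guarantee): sys.getsizeof reports the dict's current table size in bytes, and because the table grows geometrically it jumps only a handful of times even across many insertions, so the cost near a particular n depends on where the last resize happened to fall:

```python
import sys

d = {}
observed_sizes = []  # byte size of the dict after each growth step
for i in range(10**5):
    d[i] = i
    size = sys.getsizeof(d)
    if not observed_sizes or size != observed_sizes[-1]:
        observed_sizes.append(size)

# only a small number of distinct sizes, despite 100_000 insertions
print(len(observed_sizes), "distinct table sizes for", len(d), "keys")
```

Julia's counterpart is sizehint!(d, n), which pre-allocates the table up front and takes resizing out of the loop entirely.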
Note that if I declare the global variables as const, everything is fast, because the compiler can then optimize the code (it knows that the types of the values bound to global variables cannot change); in that case using $ is not needed:
julia> using BenchmarkTools
julia> const d = Dict{Int64, Int64}()
Dict{Int64, Int64}()
julia> const n = 10^7
10000000
julia> const r = 1:n
1:10000000
julia> @btime begin
           for i in r
               d[i] = i
           end
       end
895.788 ms (0 allocations: 0 bytes)
julia> @btime begin
           for i in r
               _ = d[i]
           end
       end
582.214 ms (0 allocations: 0 bytes)
julia> @btime d[10]
6.800 ns (0 allocations: 0 bytes)
10
If you are curious about the benefits of having native support for threading, here is a simple benchmark (this functionality is part of the language):
julia> Threads.nthreads()
4
julia> @btime begin
           Threads.@threads for i in $r
               _ = $d[i]
           end
       end
215.461 ms (23 allocations: 2.17 KiB)
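For contrast, here is a sketch (names of my own choosing) of the same threaded read in CPython: a pure-Python loop holds the GIL while executing bytecode, so running it on several threads is effectively serialized and typically yields no speedup, unlike the Julia version above:

```python
import time
from concurrent.futures import ThreadPoolExecutor

n = 10**6
d = {i: i for i in range(n)}

def read_chunk(lo, hi):
    # a pure-Python loop: it holds the GIL while executing bytecode,
    # so concurrent threads running it are effectively serialized
    total = 0
    for i in range(lo, hi):
        total += d[i]
    return total

# single-threaded baseline
t0 = time.perf_counter()
expected = read_chunk(0, n)
single = time.perf_counter() - t0

# four threads, each reading a quarter of the keys
step = n // 4
t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as ex:
    parts = list(ex.map(lambda k: read_chunk(k * step, (k + 1) * step),
                        range(4)))
threaded = time.perf_counter() - t0

assert sum(parts) == expected  # the same work was done either way
print(f"single: {single:.3f}s  threaded: {threaded:.3f}s")
```

On stock CPython the threaded time is usually no better than the single-threaded one, which is exactly the gap Julia's Threads.@threads closes here.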