Pre-allocation in Julia

Question

I am trying to minimize memory allocations in Julia by pre-allocating arrays as shown in the documentation. My sample code looks as follows:

using BenchmarkTools

dim1 = 100
dim2 = 1000
A = rand(dim1,dim2)
B = rand(dim1,dim2)
C = rand(dim1,dim2)
D = rand(dim1,dim2)

M = Array{Float64}(undef,dim1,dim2)

function calc!(a, b, c, d, E)
    @. E = a * b * ((d-c)/d)
    nothing
end

function run_calc(A,B,C,D,M)
    for i in 1:dim2
        @views calc!(A[:,i], B[:,i], C[:,i], D[:,i], M[:,i])
    end
end

My understanding is that this should allocate essentially nothing, since M is pre-allocated outside both functions. However, when I benchmark it I still see many allocations:

@btime run_calc(A,B,C,D,M)

1.209 ms (14424 allocations: 397.27 KiB)

In this case I can of course run the much more concise

@btime @. M = A * B * ((D-C)/D)

which performs very few allocations as expected:

122.599 μs (6 allocations: 144 bytes)

However, my actual code is more complex and cannot be reduced like this, so I am wondering where I am going wrong with the first version.

Bogumił Kamiński · Accepted Answer

You are not doing anything wrong. Creating views in Julia currently allocates (as Stefan noted, this has improved a lot compared to the past, but some allocations still seem to happen in this case). The allocations you see are a consequence of this.

See:

julia> @allocated view(M, 1:10, 1:10)
64

Your case is one of the situations where it is simplest to just write an appropriate loop (I assume that in your code the loop will be more complex but I hope the intent is clear), e.g.:

julia> function run_calc2(A,B,C,D,M)
           @inbounds for i in eachindex(A,B,C,D,M)
               M[i] = A[i] * B[i] * ((D[i] - C[i])/D[i])
           end
       end
run_calc2 (generic function with 1 method)

julia> @btime run_calc2($A,$B,$C,$D,$M)
  56.441 μs (0 allocations: 0 bytes)

julia> @btime run_calc($A,$B,$C,$D,$M)
  893.789 μs (14424 allocations: 397.27 KiB)

julia> @btime @. $M = $A * $B * (($D-$C)/$D);
  381.745 μs (0 allocations: 0 bytes)

EDIT: all timings on Julia Version 1.6.0-DEV.1580
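As a rough, self-contained sketch of the loop approach above, the following checks the loop's result against the broadcast expression and reports allocations using only Base. The names `fill_loop!` and `loop_allocations` are hypothetical, and `D` is shifted away from zero purely to keep the division well-behaved in this example; neither choice comes from the original code.

```julia
# Sketch: element-wise loop over a preallocated output, checked against broadcasting.
# `fill_loop!` and `loop_allocations` are hypothetical names used for illustration.

function fill_loop!(M, A, B, C, D)
    # eachindex with several arrays throws if their indices do not match,
    # which is what makes @inbounds reasonable here.
    @inbounds for i in eachindex(A, B, C, D, M)
        M[i] = A[i] * B[i] * ((D[i] - C[i]) / D[i])
    end
    return nothing
end

# Measuring inside a function keeps global-scope overhead out of the number.
function loop_allocations(M, A, B, C, D)
    fill_loop!(M, A, B, C, D)                    # warm-up so compilation is not counted
    return @allocated fill_loop!(M, A, B, C, D)
end

dim1, dim2 = 100, 1000
A, B, C = rand(dim1, dim2), rand(dim1, dim2), rand(dim1, dim2)
D = rand(dim1, dim2) .+ 1.0                      # assumption: shifted away from zero
M = Array{Float64}(undef, dim1, dim2)

fill_loop!(M, A, B, C, D)
M_ref = @. A * B * ((D - C) / D)                 # allocating reference result

println("results match: ", M ≈ M_ref)
println("bytes allocated by the loop: ", loop_allocations(M, A, B, C, D))
```

Passing the arrays as function arguments, rather than reading them as globals, plays the same role here as the `$` interpolation in the `@btime` calls.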

EDIT2: for completeness, here is code that applies @views inside the inner function. It still allocates (though less) and is still slower than the plain loop:

julia> function calc2!(a, b, c, d, E, i)
           @inbounds @. @views E[:,i] = a[:,i] * b[:,i] * ((d[:,i]-c[:,i])/d[:,i])
           nothing
       end
calc2! (generic function with 1 method)

julia> function run_calc3(A,B,C,D,M)
           for i in 1:dim2
               calc2!(A,B,C,D,M,i)
           end
       end
run_calc3 (generic function with 1 method)

julia> @btime run_calc3($A,$B,$C,$D,$M);
  305.709 μs (1979 allocations: 46.56 KiB)
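If the per-column structure has to be kept because the real computation is more complex, one possible variant is a column kernel that indexes the parent arrays directly instead of receiving views. This is only a sketch under that assumption: `calc_col!` and `run_calc_cols!` are hypothetical names, and it has not been timed against the versions above. It also takes the column range from `axes(M, 2)` rather than from the global `dim2`, since non-constant globals are a documented source of type instability in Julia; the timings in this answer do not establish whether that contributes to the allocations seen here.

```julia
# Sketch: keep a per-column kernel, but index the parent arrays directly
# instead of passing views. `calc_col!` and `run_calc_cols!` are hypothetical names.

function calc_col!(M, A, B, C, D, j)
    @inbounds for i in axes(M, 1)
        M[i, j] = A[i, j] * B[i, j] * ((D[i, j] - C[i, j]) / D[i, j])
    end
    return nothing
end

function run_calc_cols!(M, A, B, C, D)
    # Check sizes up front, since @inbounds in the kernel skips bounds checks.
    size(A) == size(B) == size(C) == size(D) == size(M) ||
        throw(DimensionMismatch("all arrays must have the same size"))
    # The column range comes from the array, not from a global such as `dim2`.
    for j in axes(M, 2)
        calc_col!(M, A, B, C, D, j)
    end
    return nothing
end

A, B, C = rand(100, 1000), rand(100, 1000), rand(100, 1000)
D = rand(100, 1000) .+ 1.0                       # assumption: shifted away from zero
M = Array{Float64}(undef, 100, 1000)

run_calc_cols!(M, A, B, C, D)
println("matches broadcast: ", M ≈ A .* B .* ((D .- C) ./ D))
```

The explicit size check is a design choice for the sketch: it replaces the index agreement that `eachindex(A,B,C,D,M)` enforces in the single-loop version.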

Tags: memory-management, julia
