I'm working on a Jacobi solver for the Poisson equation using Julia. The solver is called iteratively until err
is sufficiently small (~1e-8), which takes around 25,000 loops through the function for my nx = ny = 80
test case. Profiling shows that most of the time is spent in the inner loop (as expected), but memory allocation seems to be running away--the @time macro gives 38 gigabytes allocated in order to reach convergence, which seems way too much since I don't think I'm creating new arrays for each loop.
function jacobi(P::Array{Float64,2}, maxiter::Int64)
P_old = copy(P)
for j = 2:ny-1
# Main body loop
for i = 2:nx-1
@inbounds P[i,j] = ((P_old[i+1,j] + P_old[i-1,j])*dx2
+ (P_old[i,j+1] + P_old[i,j-1])*dy2)/denom-Rmod[i,j]
end
end
err = vecnorm(P::Array{Float64,2}-P_old::Array{Float64,2})/sqrt(nx+ny)
return (P, err)
end
I've timed the function for 1000 loops, calling from a function wrapper (methodwrap
) that sets initial conditions:
function methodwrap(solver, maxiter::Int64) # (solver fn name, max # of iterations)
P = copy(P0)
iter = 1
err = 1.0
maxerr = 1e-8
prog = Progress(maxiter,.2, "Solving using $solver method", 10) # Show progress bar
while (err > maxerr) && (iter < maxiter)
P, err = solver(P, maxiter)
next!(prog) # Iterates progress bar counter
iter += 1
end
println()
return (P, iter, err)
end
Contrary to my wishes, it looks like memory allocation scales with the number of loops, so I'm doing something wrong. It looks as if approximately 1.4 mb is allocated with each Jacobi pass:
julia> @time methodwrap(jacobi,1000)
Solving using jacobi method 98%|##########| ETA: 0:00:00
elapsed time: 4.001988593 seconds (1386549012 bytes allocated, 26.45% gc time)
I've tried reducing the inner loop arrays to vector subarrays and using @simd:
function jacobi2(P::Array{Float64,2}, maxiter::Int64)
P_old = copy(P)::Array{Float64,2}
for j = 2:ny-1
# Main body loop
Pojm = sub(P_old,:,j-1)
Poj = sub(P_old,:,j)
Pojp = sub(P_old,:,j+1)
Pj = sub(P,:,j)
Rmodj = sub(Rmod,:,j)
@simd for i = 2:nx-1
@inbounds Pj[i] = ((Poj[i+1] + Poj[i-1])*dx2
+ (Pojp[i] + Pojm[i])*dy2)/denom-Rmodj[i]
end
end
err = vecnorm(P::Array{Float64,2}-P_old::Array{Float64,2})/sqrt(nx+ny)
return (P, err)
end
However, this only seems to increase memory allocation and decrease speed, and I get a @simd warning:
julia> @time methodwrap(jacobi2,1000);
Warning: could not attach metadata for @simd loop.
Solving using jacobi2 method: 100%|##########| ETA: 0:00:00
elapsed time: 4.947097666 seconds (1455818184 bytes allocated, 29.85% gc time)
This is my first project in Julia, so I'm probably making a really obvious mistake, but I haven't found a solution yet. I've defined global vars as constants. I've gone through the performance tips several times, I've linted the file, I've used TypeCheck to make sure my types are consistent, and everything looks fairly kosher to my eyes. What am I doing wrong? I've posted my full code on Gist if you'd like to check that as well.
It turns out the problem was subtle. I made 3 changes (see below). I did use as @IainDunning suggested --track-allocation=user which pointed to the questionable line. Both of these problems come from using global variables.
After these changes
julia> @time methodwrap(jacobi,1000)
elapsed time: 0.481986712 seconds (116650236 bytes allocated)
You had const everywhere except for these 2 variables but when left non const and global that cause the loop iterator i to allocate unnecessarily.
nx=80 # Number of mesh points in the x-direction
ny=80 # Number of mesh points in the y-direction
was changed to
const nx=80 # Number of mesh points in the x-direction
const ny=80 # Number of mesh points in the y-direction
const Rmod = dx2*dy2*R/(2*(dx2+dy2))
was changed to
const Rmod = convert(Array{Float64,2},dx2*dy2*R/(2*(dx2+dy2)))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With