
Julia: the @threads and @parallel macros

As we know, Julia has built-in support for parallelism, which is a real strength of the language.

I recently saw that Julia also supports threads, although this support seems to be experimental. I noticed that with the Threads.@threads macro there is no need for SharedArrays, which may be a computational advantage since no copies of the objects are made. Another advantage is that functions do not have to be declared with @everywhere.

Can anyone tell me the advantage of using the @parallel macro instead of the @threads macro?

Below are two simple examples of using non-synchronized macros for parallelism.

Using the @threads macro

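# Note: addprocs starts worker processes and has no effect on threads;
# the thread count is set with JULIA_NUM_THREADS before Julia starts.
addprocs(Sys.CPU_CORES)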

function f1(b)
   b+1 
end

function f2(c)
   f1(c)
end

result = Vector(10)

@time Threads.@threads for i = 1:10
  result[i] = f2(i)
end 

0.015273 seconds (6.42 k allocations: 340.874 KiB)

Using the @parallel macro

addprocs(Sys.CPU_CORES)

@everywhere function f1(b)
   b+1 
end

@everywhere function f2(c)
   f1(c)
end

result = SharedArray{Float64}(10)
@time @parallel for i = 1:10
    result[i] = f2(i)
end

0.060588 seconds (68.66 k allocations: 3.625 MiB)

It seems to me that for Monte Carlo simulations, where the loop iterations are mathematically independent and a lot of computational performance is needed, the @threads macro is more convenient. What do you think are the advantages and disadvantages of each macro?

Best regards.

Pedro Rafael asked Jun 11 '18 16:06



1 Answer

Here is my experience:

Threads

Pros:

  • shared memory
  • low cost of spawning Julia with many threads

Cons:

  • constrained to a single machine
  • number of threads must be specified at Julia start
  • possible problems with false sharing (https://en.wikipedia.org/wiki/False_sharing)
  • you often have to use locking or atomic operations for the program to work correctly; in particular, many functions in Julia are not thread-safe, so you have to be careful when using them
  • not guaranteed to stay in the current form past Julia 1.0
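
As a rough illustration of the point about atomic operations, here is a minimal sketch, assuming Julia 1.x started with more than one thread (for example with JULIA_NUM_THREADS=4 set before launch). The helper name count_even is hypothetical and only there for the example.

```julia
using Base.Threads  # provides @threads, Atomic, atomic_add!

# Hypothetical helper: count the even numbers in 1:n across threads.
function count_even(n)
    hits = Atomic{Int}(0)          # shared counter, updated atomically
    @threads for i in 1:n
        if iseven(i)
            atomic_add!(hits, 1)   # a plain `hits += 1` on a shared Int would be a data race
        end
    end
    return hits[]                  # read the final value of the atomic
end

println(count_even(1000))  # → 500
```

Writing to distinct elements of a preallocated array, as in the question, needs no atomics; they only become necessary when several threads update the same memory location.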

Processes

Pros:

  • better scaling (you can spawn them e.g. on a cluster of multiple machines)
  • you can add processes while Julia is running

Cons:

  • low efficiency when you have to pass a lot of data between processes
  • slower to start
  • you have to explicitly share code and data to/between workers
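
The points about adding processes at runtime and sharing code explicitly can be sketched as follows. This assumes Julia 1.x, where the process-based tools live in the Distributed standard library and @parallel was renamed to @distributed; the worker count of 2 is arbitrary.

```julia
using Distributed

addprocs(2)  # worker processes can be added while Julia is running

# Code has to be defined on every worker explicitly.
@everywhere f1(b) = b + 1
@everywhere f2(c) = f1(c)

# With a reducer such as (+), @distributed waits for all workers
# and combines their partial results on the calling process.
total = @distributed (+) for i in 1:10
    f2(i)
end

println(total)  # → 65

rmprocs(workers())  # shut the workers down again
```

Without a reducer, the loop is dispatched asynchronously, so it would need an `@sync` prefix before its results could be relied on.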

Summary

Processes are much easier to work with and scale better, and in most situations they give you enough performance. If you have to transfer large amounts of data between parallel jobs, threads will be better, but they are much more delicate to use and tune correctly.
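
For the Monte Carlo case raised in the question, one way to avoid both locking and false sharing with threads is to give each task its own local accumulator and random number generator, and to combine the results only at the end. The following is a sketch under assumptions, not a tuned implementation: it requires Julia 1.3 or later for Threads.@spawn, the function name mc_pi is hypothetical, and seeding each generator with the chunk index is a simplification chosen for reproducibility.

```julia
using Base.Threads
using Random

# Hypothetical helper: estimate pi by sampling points in the unit square.
function mc_pi(n, nchunks = nthreads())
    per_chunk = cld(n, nchunks)           # samples handled by each task
    tasks = map(1:nchunks) do c
        Threads.@spawn begin
            rng = MersenneTwister(c)      # one generator per task (simplified seeding)
            hits = 0                      # task-local counter, nothing shared
            for _ in 1:per_chunk
                x, y = rand(rng), rand(rng)
                hits += (x * x + y * y <= 1.0)
            end
            hits                          # value returned by the task
        end
    end
    total = sum(fetch, tasks)             # combine the partial counts once
    return 4 * total / (per_chunk * nchunks)
end

println(mc_pi(1_000_000))
```

Because each task touches only its own variables inside the loop, no atomics or locks are needed until the final sum.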

Bogumił Kamiński answered Oct 02 '22 20:10
