My machine has 4 cores. When I do parallel runs with @sync @parallel, I notice that Julia divides the jobs into 4 equal chunks up front, before sending them to the 4 worker processes:
# start of do_something.jl
function do_something(i, parts)
    procs = zeros(Int, parts)
    procs[i] = myid()
    total = 0.0
    for j = 1:i * 100000000
        total = total + 1e-6
    end
    return procs
end
# end of do_something.jl
# synctest3a.jl
addprocs(Sys.CPU_CORES)
@everywhere include("do_something.jl")
parts = 20
procs = @sync @parallel (+) for i = 1:parts
    do_something(i, parts)
end
@printf("procs=%s\n", procs)
Result of julia synctest3a.jl, indicating the first 5 were sent to processor 2, the next 5 were sent to processor 3, and so on:
procs=[2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5]
I have an application where the time to execute do_something() can vary a lot (in this toy example it is roughly proportional to i). So what I really want is for each processor to execute do_something as soon as it is free, rather than each one always doing exactly 1/4 of the calls. How do I do that?
I think you should use pmap instead. It has a batch_size keyword argument, which is 1 by default, meaning that jobs are sent to free workers one by one. With pmap, of course, you have to apply the reduction operation yourself, since it returns the individual results. Note that I have tried your function with pmap and observed the load-balancing behavior you are asking for.
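A minimal sketch of what this could look like, using Julia 0.6-era syntax to match the question. The do_something function from the question is inlined here so the snippet is self-contained, with a smaller busy-loop (i * 100_000 instead of i * 100000000) purely so the demo finishes quickly:

```julia
addprocs(Sys.CPU_CORES)

# Same function as in the question, with a shorter busy-loop for speed.
@everywhere function do_something(i, parts)
    procs = zeros(Int, parts)
    procs[i] = myid()
    total = 0.0
    for j = 1:i * 100_000
        total = total + 1e-6
    end
    return procs
end

parts = 20
# batch_size = 1 is the default: each free worker pulls the next job as
# soon as it finishes its current one, instead of getting a fixed chunk.
results = pmap(i -> do_something(i, parts), 1:parts; batch_size = 1)

# pmap returns one result per job; apply the (+) reduction yourself.
procs = reduce(+, results)
println("procs = ", procs)
```

With the varying job sizes, entry i of procs now shows whichever worker happened to be free when job i came up, rather than a fixed block pattern.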
Another option to control scheduling behavior is defining your own pmap-style function (the name does not matter, of course). This way, you get much more control over the scheduling. For example, you can choose the next job based on the results of previous computations. The parallel computing section of the Julia manual shows an example pmap definition and how to write one.
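The scheduling idea behind such a function, adapted from the pmap example in the old (0.6-era) Julia manual and lightly commented, looks like this: one local @async feeder task per worker, each repeatedly claiming the next unassigned index, so fast workers end up doing more jobs:

```julia
function my_pmap(f, lst)
    np = nprocs()              # number of processes, including the master
    n = length(lst)
    results = Vector{Any}(n)
    i = 1
    # Feeder tasks all run on the master's single thread, so this shared
    # counter needs no locking.
    nextidx() = (idx = i; i += 1; idx)
    @sync begin
        for p = 1:np
            if p != myid() || np == 1   # skip the master unless it is alone
                @async begin
                    while true
                        idx = nextidx()
                        idx > n && break
                        # Blocks until worker p finishes, then this feeder
                        # immediately claims the next unassigned index.
                        results[idx] = remotecall_fetch(f, p, lst[idx])
                    end
                end
            end
        end
    end
    return results
end
```

Because each feeder only asks for a new index after its previous call returns, you could replace nextidx() with logic that picks the next job based on results gathered so far.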