I want to run about 10,000 Julia simulations in parallel on a cluster (each simulation is independent of all the others). Each simulation outputs a single number, along with 3 columns of information identifying which simulation produced it. It therefore seems wasteful to make each simulation write to a separate file.
Can I safely have all these simulations write to the same file, or might this cause a bug if two simulations happen to write to the file at exactly the same time? What is the best solution?
Here is a brief example of one way in which a set of 10000 independent simulations can be set up to run in parallel in Julia, using pmap():
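    using Distributed  # provides @everywhere and pmap (Julia 1.0 and later)
    # Worker processes are assumed to exist already, e.g. started with
    # `julia -p 4` or addprocs(4)

    @everywhere function simulate(i)
        # we compute the simulation results here. In this case we just return
        # the simulation number and a random value
        x = rand()
        return (i, x)
    end

    x = pmap(simulate, 1:10000)
    # x is the array of tuples returned from all the simulations
    show(IOContext(stdout, :limit => false), x)  # showall(x) in Julia before 1.0
    # ... or we could write x to a file or do something else with it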
@everywhere is needed to ensure that the simulate() function is available to all processes rather than just one process. pmap() calls simulate() once for each of the values in the second parameter, in parallel, and returns an array of all the results produced by simulate().
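Because pmap() hands every result back to the calling process, that process can be the only one that ever opens the output file, which sidesteps the concurrent-write question entirely. The following is a minimal sketch of that idea, not a definitive implementation. The function simulate_row(), the three identifying columns, the column names in the header, and the output path are all hypothetical placeholders invented for illustration.

```julia
using Distributed  # provides @everywhere and pmap

# Hypothetical stand-in for a real simulation: returns three identifying
# columns (made up here from i) followed by the single output value.
@everywhere function simulate_row(i)
    return (i, i % 10, i % 3, rand())
end

n = 10000
results = pmap(simulate_row, 1:n)  # results come back in input order

# Only the calling process writes, so two writers can never collide.
outfile = joinpath(mktempdir(), "results.csv")  # hypothetical output path
open(outfile, "w") do io
    println(io, "sim,param_a,param_b,value")  # hypothetical column names
    for row in results
        println(io, join(row, ','))  # one comma-separated line per simulation
    end
end
```

With no worker processes added, pmap() simply runs everything on the calling process, so the sketch can be tried on a laptop before moving to the cluster. For 10,000 rows of four values each, holding all results in memory before writing should not be a concern.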