I wish to run several instances of a simulation in parallel, but with each simulation having its own independent data set.
Currently I implement this as follows:
P = mp.Pool(ncpus)  # Generate pool of workers

for j in range(nrun):  # Generate processes
    sim = MDF.Simulation(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, savetemp)
    lattice = MDF.Lattice(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, kb, ks, kbs, a, p, q, massL, randinit, initvel, parangle, scaletemp, savetemp)
    adatom1 = MDF.Adatom(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, ra, massa, amorse, bmorse, r0, z0, name, lattice, samplerate, savetemp)
    P.apply_async(run, (j, sim, lattice, adatom1), callback=After)  # run simulation and ISF analysis in each process

P.close()
P.join()  # start processes
where sim, adatom1 and lattice are objects passed to the function run, which initiates the simulation.
However, I recently found out that each batch I run simultaneously (that is, each group of ncpus runs out of the total nrun simulation runs) gives the exact same results.
Can someone here enlighten me as to how to fix this?
Some background on seeding, for readers new to it: a pseudo-random number generator needs a starting value, the seed, from which it derives its entire sequence; by default Python's random module seeds from the current system time, and the seed() method lets you set that starting value yourself. The same holds for NumPy: the output of any numpy.random function depends on the seed in use, so different seeds produce different pseudo-random sequences. The multiprocessing package used above spawns worker processes through a simple API so a program can use several processors on one machine, for both local and remote concurrency, and it runs on Windows and UNIX.
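As a quick, self-contained illustration of that point (nothing here is specific to the simulation code in the question):

import numpy as np

np.random.seed(42)
a = np.random.random(3)
np.random.seed(42)
b = np.random.random(3)   # reseeding with the same value restarts the same sequence
np.random.seed(7)
c = np.random.random(3)   # a different seed gives a different sequence
print(np.array_equal(a, b))   # True
print(np.array_equal(a, c))   # False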
Just thought I would add an actual answer to make it clear for others.
Quoting the answer from aix in this question:
What happens is that on Unix every worker process inherits the same state of the random number generator from the parent process. This is why they generate identical pseudo-random sequences.
Use the random.seed() method (or the scipy/numpy equivalent) to set the seed properly. See also this numpy thread.
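As a rough sketch of what "set the seed properly" means in practice here: reseed at the top of the worker function, using a value that differs per process and per task. The seed choice below is illustrative, not taken from the original code:

import os
import random
import numpy as np

def run(j, sim, lattice, adatom1):
    # Reseed inside the worker: children forked on Unix otherwise inherit
    # the parent's RNG state and produce identical sequences.
    seed = (os.getpid() + j) % (2**32 - 1)
    random.seed(seed)      # Python's built-in generator
    np.random.seed(seed)   # NumPy's legacy global generator
    ...                    # the actual simulation and ISF analysis go here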
There is no built-in fix for this; you need to generate a unique seed for each process yourself. You can add the line below at the beginning of your worker function to overcome the issue.
# needs os, time and numpy (as np) imported; gives each process its own seed
np.random.seed((os.getpid() * int(time.time())) % 123456789)
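That one-liner works in many cases, but it can still repeat a seed when a pool worker runs several tasks within the same second (same PID, same time). A more systematic variant, sketched below under the assumption that each task can take an explicit seed argument, derives per-task seeds in the parent with NumPy's SeedSequence and passes one to each task:

import multiprocessing as mp
import numpy as np

def run(j, seed):
    # Each task builds its own Generator from an explicit, task-specific seed.
    rng = np.random.default_rng(seed)
    return j, rng.random()   # stand-in for the real simulation result

if __name__ == "__main__":
    nrun, ncpus = 8, 4
    # One 32-bit seed word per run; SeedSequence spreads the entropy so the
    # words differ (SeedSequence.spawn is another option for the same idea).
    seeds = np.random.SeedSequence(12345).generate_state(nrun)
    with mp.Pool(ncpus) as pool:
        results = [pool.apply_async(run, (j, int(seeds[j]))) for j in range(nrun)]
        print([r.get() for r in results])

Generating all the seeds in the parent also makes the whole batch reproducible: re-running with the same SeedSequence entropy reproduces every run.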