I have a memory-hungry application that iterates through a pair of arrays, processing every combination of elements. Currently the script ends up using around 3 GB of RAM.
My question is this:
Is it more memory-efficient to process each combination of elements in one large process, or is it better to start a new subprocess for each combination?
In other words, is it better to do option 1:
for i in param_set1:
    for j in param_set2:
        Do_A_Big_Job(i, j)
Or option 2:
import subprocess

for i in param_set1:
    for j in param_set2:
        # subprocess arguments must be strings
        subprocess.call([Do_A_Big_Job_Script, str(i), str(j)])
By "better", I mean "use less RAM".
Thanks!
Edit: I'm explicitly curious about memory usage. When a process ends, does a UNIX system free up that memory? Is this more efficient than Python's garbage collection for a reasonably well-written script? I don't have many cores available, so I would expect the multiple processes to run more or less serially anyway.
Running a single process will use less RAM, of course, but it makes it difficult to take advantage of multiple CPUs/cores.
If you don't care how long it takes, run a single process.
As a compromise, you could run just a few processes at a time instead of launching them all at once.
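As a rough illustration, the snippet below uses the standard-library multiprocessing.Pool to cap how many jobs run at once; the parameter lists and job function here are placeholders for your own, so treat it as a sketch rather than a drop-in replacement:

import itertools
from multiprocessing import Pool

def Do_A_Big_Job(i, j):
    # Placeholder for your real job.
    return i * j

param_set1 = [1, 2, 3]  # placeholder data
param_set2 = [4, 5, 6]

if __name__ == "__main__":
    # processes=4 caps how many jobs (and how much of their memory) are live at once;
    # maxtasksperchild=1 gives each job a fresh worker process, so the OS reclaims
    # all of a job's memory as soon as that job finishes.
    with Pool(processes=4, maxtasksperchild=1) as pool:
        pool.starmap(Do_A_Big_Job, itertools.product(param_set1, param_set2))

This also addresses the edit: when a worker process exits, the OS reclaims all of its memory, whereas a long-lived single Python process may retain freed memory in the interpreter's allocator even after garbage collection.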