I'm new to Python and want to learn how to do parallel processing in it. I saw the following example:
    import multiprocessing as mp
    import numpy as np

    np.random.RandomState(100)
    arr = np.random.randint(0, 10, size=[20, 5])
    data = arr.tolist()

    def howmany_within_range_rowonly(row, minimum=4, maximum=8):
        count = 0
        for n in row:
            if minimum <= n <= maximum:
                count = count + 1
        return count

    pool = mp.Pool(mp.cpu_count())
    results = pool.map(howmany_within_range_rowonly, [row for row in data])
    pool.close()
    print(results[:10])
But when I run it, I get this error:
    RuntimeError:
            An attempt has been made to start a new process before the
            current process has finished its bootstrapping phase.

            This probably means that you are not using fork to start your
            child processes and you have forgotten to use the proper idiom
            in the main module:

                if __name__ == '__main__':
                    freeze_support()
                    ...

            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce an executable.
What should I do?
If you move the top-level code that builds the data and creates the pool inside an if __name__ == "__main__": block, leaving only the imports and the function definition at module level, your program should behave as you expect:
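    import multiprocessing as mp
    import numpy as np

    def howmany_within_range_rowonly(row, minimum=4, maximum=8):
        # Count the values in row that fall within [minimum, maximum].
        count = 0
        for n in row:
            if minimum <= n <= maximum:
                count = count + 1
        return count

    if __name__ == "__main__":
        # seed() seeds the global generator; the original
        # np.random.RandomState(100) created a generator and discarded it.
        np.random.seed(100)
        arr = np.random.randint(0, 10, size=[20, 5])
        data = arr.tolist()

        pool = mp.Pool(mp.cpu_count())
        results = pool.map(howmany_within_range_rowonly, [row for row in data])
        pool.close()
        print(results[:10])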
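Without this guard, the code that creates the pool runs every time the module is imported, not only when it is run as a script. That matters because on platforms that use the spawn start method (the default on Windows and, since Python 3.8, on macOS) each worker process starts a fresh interpreter and re-imports your main module. Every worker would then try to create its own Pool while it is still starting up, which is what the RuntimeError reports. The same problem arises if another module imports this one from inside a pool worker: pool workers are daemonic processes, and daemonic processes are not allowed to create children. The guard ensures the pool is created only in the process that runs the script directly.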
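As a variation on the same idea, here is a minimal sketch that puts the pool code in a function and uses the pool as a context manager, so cleanup happens automatically. The main() name and the small hard-coded data list are my own choices for illustration (they let the example run without NumPy); the counting function is the one from the question, written more compactly.

```python
import multiprocessing as mp


def howmany_within_range_rowonly(row, minimum=4, maximum=8):
    """Count the values in row that lie within [minimum, maximum]."""
    return sum(1 for n in row if minimum <= n <= maximum)


def main():
    # Small made-up input, used here only so the sketch needs no NumPy.
    data = [[1, 5, 9], [4, 4, 8], [0, 2, 3]]

    # Leaving the with-block terminates the pool; map() has already
    # returned every result by that point.
    with mp.Pool(processes=2) as pool:
        results = pool.map(howmany_within_range_rowonly, data)

    print(results)  # [1, 3, 0]
    return results


if __name__ == "__main__":
    # Only the process that runs this file as a script reaches this line.
    # A worker that re-imports the module skips it, so no nested pool is created.
    main()
```

Keeping the worker function at module level matters too: the pool sends it to the workers by name, so each worker has to be able to import it.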