I want to run some code in parallel in Python and populate a global variable with the results. I have written a small example to check the behavior of joblib, but I don't know how to get the results back. The example code is:
import numpy as np
import multiprocessing
from joblib import Parallel, delayed

global_var = np.zeros(10)

def populate(idx):
    print('I am core', idx, '\n')
    global_var[idx] = idx

num_cores = multiprocessing.cpu_count()
Parallel(n_jobs=num_cores)(delayed(populate)(idx) for idx in range(len(global_var)))
If I check global_var before running anything else, it is an array of zeros; when I run the code, the result is full of None values.
How can I return the values from the function and populate the global array?
Thank you very much in advance! =)
I know it's an old thread but it might interest others to know that THIS IS POSSIBLE!
Add require='sharedmem' to the Parallel initialization.
You can read this link for more examples of parallelizing loops.
So in your example:
import numpy as np
import multiprocessing
from joblib import Parallel, delayed

global_var = np.zeros(10)

def populate(idx):
    print('I am core', idx, '\n')
    global_var[idx] = idx

num_cores = multiprocessing.cpu_count()
Parallel(n_jobs=num_cores, require='sharedmem')(delayed(populate)(idx) for idx in range(len(global_var)))
print(global_var)
[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
Basically, you can't do it this way: you need either to use a backend with shared memory or to create the shared memory yourself, which is a little more involved (but covered by the joblib documentation).
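If you do want the workers to write into a shared array themselves, a minimal sketch of the memory-mapped approach could look like the following. The file name and the small example function are just placeholders; the general pattern of passing a writable np.memmap to the workers is the one described in the joblib documentation on shared memory:

import os
import tempfile
import numpy as np
from joblib import Parallel, delayed

# Back the "global" array with a memory-mapped file so that worker
# processes write into the same buffer instead of into private copies.
folder = tempfile.mkdtemp()
mmap_path = os.path.join(folder, 'global_var.mmap')  # placeholder file name
global_var = np.memmap(mmap_path, dtype=np.float64, shape=(10,), mode='w+')

def populate(shared, idx):
    # Each worker writes its result into the shared memmap.
    shared[idx] = idx

Parallel(n_jobs=2)(delayed(populate)(global_var, idx) for idx in range(10))
print(np.array(global_var))  # expected: [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]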
Here, the easiest way to achieve it, though, is to define your function so that it returns the result of its computation, and then to process the list of results returned by the Parallel(...)(...) call in the main process.
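A minimal sketch of that return-value approach, applied to the example above (variable names are just for illustration), could be:

import numpy as np
import multiprocessing
from joblib import Parallel, delayed

def populate(idx):
    # Return the result instead of mutating a global variable.
    return idx

num_cores = multiprocessing.cpu_count()
# Parallel(...)(...) returns one value per delayed call, in submission order.
results = Parallel(n_jobs=num_cores)(delayed(populate)(idx) for idx in range(10))
global_var = np.array(results, dtype=float)
print(global_var)  # [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]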