I am attempting to use joblib in python to speed up some data processing but I'm having issues trying to work out how to assign the output into the required format. I have tried to generate a, perhaps overly simplistic, code which shows the issues that I'm encountering:
from joblib import Parallel, delayed
import numpy as np
def main():
print "Nested loop array assignment:"
regular()
print "Parallel nested loop assignment using a single process:"
par2(1)
print "Parallel nested loop assignment using multiple process:"
par2(2)
def regular():
# Define variables
a = [0,1,2,3,4]
b = [0,1,2,3,4]
# Set array variable to global and define size and shape
global ab
ab = np.zeros((2,np.size(a),np.size(b)))
# Iterate to populate array
for i in range(0,np.size(a)):
for j in range(0,np.size(b)):
func(i,j,a,b)
# Show array output
print ab
def par2(process):
# Define variables
a2 = [0,1,2,3,4]
b2 = [0,1,2,3,4]
# Set array variable to global and define size and shape
global ab2
ab2 = np.zeros((2,np.size(a2),np.size(b2)))
# Parallel process in order to populate array
Parallel(n_jobs=process)(delayed(func2)(i,j,a2,b2) for i in xrange(0,np.size(a2)) for j in xrange(0,np.size(b2)))
# Show array output
print ab2
def func(i,j,a,b):
# Populate array
ab[0,i,j] = a[i]+b[j]
ab[1,i,j] = a[i]*b[j]
def func2(i,j,a2,b2):
# Populate array
ab2[0,i,j] = a2[i]+b2[j]
ab2[1,i,j] = a2[i]*b2[j]
# Run script
main()
The ouput of which looks like this:
Nested loop array assignment:
[[[ 0. 1. 2. 3. 4.]
[ 1. 2. 3. 4. 5.]
[ 2. 3. 4. 5. 6.]
[ 3. 4. 5. 6. 7.]
[ 4. 5. 6. 7. 8.]]
[[ 0. 0. 0. 0. 0.]
[ 0. 1. 2. 3. 4.]
[ 0. 2. 4. 6. 8.]
[ 0. 3. 6. 9. 12.]
[ 0. 4. 8. 12. 16.]]]
Parallel nested loop assignment using a single process:
[[[ 0. 1. 2. 3. 4.]
[ 1. 2. 3. 4. 5.]
[ 2. 3. 4. 5. 6.]
[ 3. 4. 5. 6. 7.]
[ 4. 5. 6. 7. 8.]]
[[ 0. 0. 0. 0. 0.]
[ 0. 1. 2. 3. 4.]
[ 0. 2. 4. 6. 8.]
[ 0. 3. 6. 9. 12.]
[ 0. 4. 8. 12. 16.]]]
Parallel nested loop assignment using multiple process:
[[[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]]
[[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]]]
From Google and StackOverflow search function it appears that when using joblib, the global array isn't shared between each subprocess. I'm uncertain if this is a limitation of joblib or if there is a way to get around this?
In reality my script is surrounded by other bits of code that rely on the final output of this global array being in a (4,x,x) format where x is variable (but typically ranges in the 100s to several thousands). This is my current reason for looking at parallel processing as the whole process can take up to 2 hours for x = 2400.
The use of joblib isn't necessary (but I like the nomenclature and simplicity) so feel free to suggest simple alternative methods, ideally keeping in mind the requirements of the final array. I'm using python 2.7.3, and joblib 0.7.1.
I was able to resolve the issues with this simple example using numpy's memmap. I was still having issues after using memmap and following the examples on the joblib documentation webpage but I upgraded to the latest joblib version (0.9.3) via pip and it all runs smoothly. Here is the working code:
from joblib import Parallel, delayed
import numpy as np
import os
import tempfile
import shutil
def main():
print "Nested loop array assignment:"
regular()
print "Parallel nested loop assignment using numpy's memmap:"
par3(4)
def regular():
# Define variables
a = [0,1,2,3,4]
b = [0,1,2,3,4]
# Set array variable to global and define size and shape
global ab
ab = np.zeros((2,np.size(a),np.size(b)))
# Iterate to populate array
for i in range(0,np.size(a)):
for j in range(0,np.size(b)):
func(i,j,a,b)
# Show array output
print ab
def par3(process):
# Creat a temporary directory and define the array path
path = tempfile.mkdtemp()
ab3path = os.path.join(path,'ab3.mmap')
# Define variables
a3 = [0,1,2,3,4]
b3 = [0,1,2,3,4]
# Create the array using numpy's memmap
ab3 = np.memmap(ab3path, dtype=float, shape=(2,np.size(a3),np.size(b3)), mode='w+')
# Parallel process in order to populate array
Parallel(n_jobs=process)(delayed(func3)(i,a3,b3,ab3) for i in xrange(0,np.size(a3)))
# Show array output
print ab3
# Delete the temporary directory and contents
try:
shutil.rmtree(path)
except:
print "Couldn't delete folder: "+str(path)
def func(i,j,a,b):
# Populate array
ab[0,i,j] = a[i]+b[j]
ab[1,i,j] = a[i]*b[j]
def func3(i,a3,b3,ab3):
# Populate array
for j in range(0,np.size(b3)):
ab3[0,i,j] = a3[i]+b3[j]
ab3[1,i,j] = a3[i]*b3[j]
# Run script
main()
Giving the following results:
Nested loop array assignment:
[[[ 0. 1. 2. 3. 4.]
[ 1. 2. 3. 4. 5.]
[ 2. 3. 4. 5. 6.]
[ 3. 4. 5. 6. 7.]
[ 4. 5. 6. 7. 8.]]
[[ 0. 0. 0. 0. 0.]
[ 0. 1. 2. 3. 4.]
[ 0. 2. 4. 6. 8.]
[ 0. 3. 6. 9. 12.]
[ 0. 4. 8. 12. 16.]]]
Parallel nested loop assignment using numpy's memmap:
[[[ 0. 1. 2. 3. 4.]
[ 1. 2. 3. 4. 5.]
[ 2. 3. 4. 5. 6.]
[ 3. 4. 5. 6. 7.]
[ 4. 5. 6. 7. 8.]]
[[ 0. 0. 0. 0. 0.]
[ 0. 1. 2. 3. 4.]
[ 0. 2. 4. 6. 8.]
[ 0. 3. 6. 9. 12.]
[ 0. 4. 8. 12. 16.]]]
A few of my thoughts to note for any future readers:
np.arange(0,10000)
, and b and b3 to np.arange(0,1000)
gave
times of 12.4s for the "regular" method and 7.7s for the joblib
method.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With