Python, parallelization with joblib: Delayed with multiple arguments

I am using something similar to the following to parallelize a for loop over two matrices:

from joblib import Parallel, delayed
import numpy

def processInput(i, j):
    # set every element of row i to 1 and every element of row j to 0
    for k in range(len(i)):
        i[k] = 1
    for t in range(len(j)):
        j[t] = 0
    return i, j

a = numpy.eye(3)
b = numpy.eye(3)

num_cores = 2
(a,b) = Parallel(n_jobs=num_cores)(delayed(processInput)(i,j) for i,j in zip(a,b))

but I'm getting the following error: Too many values to unpack (expected 2)

Is there a way to return 2 values with delayed? Or what solution would you propose?

Also, a bit off topic: is there a more compact way, like the following (which doesn't actually modify anything), to process the matrices?

from joblib import Parallel, delayed
def processInput(i, j):
    # note: rebinding the loop variables does not modify the arrays in place
    for k in i:
        k = 1
    for t in j:
        t = 0
    return i, j

I would like to avoid using has_shareable_memory anyway, to avoid possible bad interactions in the actual script and a possible performance hit.

asked Oct 21 '16 by Francesco

People also ask

Does joblib use multiprocessing?

Prior to version 0.12, joblib used the 'multiprocessing' backend as the default backend instead of 'loky'. This backend creates an instance of multiprocessing.Pool that forks the Python interpreter into multiple processes to execute each of the items of the list.
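
For instance, a minimal sketch of selecting a backend explicitly (the function name and n_jobs values here are arbitrary, and 'loky' remains the default in current versions):

from joblib import Parallel, delayed, parallel_backend

def square(x):
    return x * x

# per-call choice of backend
results = Parallel(n_jobs=2, backend="multiprocessing")(
    delayed(square)(x) for x in range(10)
)

# or as a context manager affecting all Parallel calls inside the block
with parallel_backend("loky", n_jobs=2):
    results = Parallel()(delayed(square)(x) for x in range(10))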

Does joblib parallel preserve order?

TL;DR - it preserves order for both backends.
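
A quick way to convince yourself (a sketch; the sleep times are only illustrative, chosen so that later tasks finish first):

import time
from joblib import Parallel, delayed

def slow_identity(x):
    time.sleep(0.1 * (5 - x))   # earlier inputs sleep longer, so they finish last
    return x

out = Parallel(n_jobs=4)(delayed(slow_identity)(x) for x in range(5))
print(out)   # [0, 1, 2, 3, 4] -- results come back in input order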

What is the purpose of using joblib in Jupyter toolkit?

Joblib provides a way to avoid recomputing the same function repeatedly, saving a lot of time and computational cost. For example, consider a function that simply computes the square of a number over a provided range.
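
A minimal sketch of this kind of caching with joblib's Memory (the cache directory name and function are arbitrary):

from joblib import Memory

memory = Memory("./joblib_cache", verbose=0)

@memory.cache
def square(x):
    print("computing", x)   # printed only on a cache miss
    return x * x

square(4)   # computed and stored on disk
square(4)   # loaded from the cache, not recomputed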

What is joblib dump?

joblib.dump() persists an arbitrary Python object (typically a fitted model or a large numpy array) to disk. By default it uses the zlib compression method, as it gives the best tradeoff between speed and disk space; the other supported compression methods are 'gzip', 'bz2', 'lzma' and 'xz'.
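
A minimal sketch (the filenames are arbitrary):

import joblib
import numpy

data = numpy.random.rand(1000, 10)

joblib.dump(data, "data.pkl", compress=3)               # zlib, the default method
joblib.dump(data, "data.pkl.gz", compress=("gzip", 3))  # explicit gzip at level 3

restored = joblib.load("data.pkl.gz")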


1 Answer

Probably too late, but as an answer to the first part of your question: Just return a tuple in your delayed function.

return (i,j)

And for the variable holding the output of all your delayed functions:

results = Parallel(n_jobs=num_cores)(delayed(processInput)(i,j) for i,j in zip(a,b))

Now results is a list of tuples, each holding some (i, j), and you can just iterate through it.
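
For instance, a sketch of collecting the returned rows back into two matrices, using a simplified version of the question's processInput and zip(*results) to split the tuples:

from joblib import Parallel, delayed
import numpy

def processInput(i, j):
    i[:] = 1   # the worker operates on a copy of the row, so return it
    j[:] = 0
    return i, j

a = numpy.eye(3)
b = numpy.eye(3)

results = Parallel(n_jobs=2)(delayed(processInput)(i, j) for i, j in zip(a, b))

# results is [(row_a0, row_b0), (row_a1, row_b1), ...]; split and stack it
a_rows, b_rows = zip(*results)
a = numpy.vstack(a_rows)
b = numpy.vstack(b_rows)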

answered Sep 28 '22 by qazplok11