
Does joblib.Parallel keep the original order of data passed?

I want to ask the same question as "Python 3: does Pool keep the original order of data passed to map?", but for joblib. E.g.:

Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in x)

The syntax seems to imply it, but I am always wary of the output ordering in parallel processing, and I don't want to write code based on undocumented behavior.

user3226167 asked Jun 19 '19 02:06



1 Answer

TL;DR - it preserves order for both backends.

Extending @Chris Farr's answer, I implemented a simple test. Each call waits for a random amount of time (you can check that the wait times are not identical), so the tasks do not finish in the order they were submitted. The output order was preserved every time, with both backends I tried (the code below shows the multiprocessing one).

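from joblib import Parallel, delayed
import numpy as np
import time

def f(wait):
    # Sleep for `wait` seconds, then return it so each result can be matched to its input.
    time.sleep(wait)
    return wait

if __name__ == '__main__':  # guard needed on platforms that spawn worker processes
    n = 50
    # Random wait times, so tasks finish in a different order than they were submitted.
    waits = np.random.uniform(low=0, high=1, size=n)
    res = Parallel(n_jobs=8, backend='multiprocessing')(delayed(f)(wait) for wait in waits)
    # True if the results come back in the same order as the inputs.
    print(np.array_equal(res, waits))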
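
To repeat the check on more than one backend, a variant along these lines may help. It is only a sketch: `slow_identity` and `order_preserved` are hypothetical helper names, and it assumes your joblib version accepts the backend names "loky", "threading" and "multiprocessing". Earlier tasks sleep longer, so they tend to finish last, which makes any reordering easy to spot. Note that a passing test is an observation, not a substitute for a documented guarantee.

```python
import time
from joblib import Parallel, delayed

def slow_identity(index, delay):
    """Sleep for `delay` seconds, then return `index` unchanged."""
    time.sleep(delay)
    return index

def order_preserved(backend, n_tasks=8, n_jobs=4):
    """Return True if results come back in submission order for `backend`."""
    # Earlier tasks sleep longer, so they tend to finish after later ones.
    delays = [0.02 * (n_tasks - i) for i in range(n_tasks)]
    results = Parallel(n_jobs=n_jobs, backend=backend)(
        delayed(slow_identity)(i, d) for i, d in enumerate(delays)
    )
    return results == list(range(n_tasks))

if __name__ == "__main__":
    # Backend names here are an assumption about the installed joblib version.
    for backend in ("loky", "threading", "multiprocessing"):
        print(backend, order_preserved(backend))
```

Calling `order_preserved("threading")` alone is the quickest way to try it, since that backend does not start worker processes.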
Yair Daon answered Sep 23 '22 17:09
