`map_async` is non-blocking, whereas `map` is blocking: `map_async` returns immediately with an `AsyncResult`, while `map` waits for all results before returning.
`imap()` is almost the same as the `Pool.map()` method. The difference is that the result for each item is received as soon as it is ready, instead of waiting for all of them to be finished. Moreover, the `map()` method converts the iterable into a list (if it is not one already), whereas the `imap()` method does not.
That is, if you have operations that can take very different amounts of time (rather than the consistent 0.01 seconds used in your example), `imap_unordered` can smooth things out by yielding faster-calculated values ahead of slower-calculated values.
As we have seen, `Process` keeps all the tasks in memory, while `Pool` keeps only the currently executing processes in memory. So when the number of tasks is large we can use `Pool`, and when the number of tasks is small we can use the `Process` class.
There are two key differences between `imap`/`imap_unordered` and `map`/`map_async`:

`map` consumes your iterable by converting it to a list (assuming it isn't a list already), breaking it into chunks, and sending those chunks to the worker processes in the `Pool`. Breaking the iterable into chunks performs better than passing each item between processes one at a time - particularly if the iterable is large. However, turning the iterable into a list in order to chunk it can have a very high memory cost, since the entire list will need to be kept in memory.
`imap` doesn't turn the iterable you give it into a list, nor does it break it into chunks (by default). It will iterate over the iterable one element at a time, and send each one to a worker process. This means you don't take the memory hit of converting the whole iterable to a list, but it also means that performance is slower for large iterables, because of the lack of chunking. This can be mitigated by passing a `chunksize` argument larger than the default of 1, however.
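To illustrate the `chunksize` point, here is a minimal sketch (the function `square` and the sizes are just for illustration) showing how a larger chunk size can be passed to `imap` to batch items to the workers while keeping lazy iteration:

```python
import multiprocessing

def square(x):
    return x * x

if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        # chunksize=100: each worker receives batches of 100 items,
        # reducing per-item IPC overhead while imap still iterates
        # over the input lazily instead of listifying it first.
        results = list(pool.imap(square, range(1000), chunksize=100))
    print(results[:5])  # → [0, 1, 4, 9, 16]
```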
The other major difference between `imap`/`imap_unordered` and `map`/`map_async` is that with `imap`/`imap_unordered`, you can start receiving results from workers as soon as they're ready, rather than having to wait for all of them to be finished. With `map_async`, an `AsyncResult` is returned right away, but you can't actually retrieve results from that object until all of them have been processed, at which point it returns the same list that `map` does (`map` is actually implemented internally as `map_async(...).get()`). There's no way to get partial results; you either have the entire result, or nothing.
`imap` and `imap_unordered` both return iterables right away. With `imap`, the results will be yielded from the iterable as soon as they're ready, while still preserving the ordering of the input iterable. With `imap_unordered`, results will be yielded as soon as they're ready, regardless of the order of the input iterable. So, say you have this:
```python
import multiprocessing
import time

def func(x):
    time.sleep(x)
    return x + 2

if __name__ == "__main__":
    p = multiprocessing.Pool()
    start = time.time()
    for x in p.imap(func, [1, 5, 3]):
        print("{} (Time elapsed: {}s)".format(x, int(time.time() - start)))
```
This will output:

```
3 (Time elapsed: 1s)
7 (Time elapsed: 5s)
5 (Time elapsed: 5s)
```
If you use `p.imap_unordered` instead of `p.imap`, you'll see:

```
3 (Time elapsed: 1s)
5 (Time elapsed: 3s)
7 (Time elapsed: 5s)
```
If you use `p.map` or `p.map_async().get()`, you'll see:

```
3 (Time elapsed: 5s)
7 (Time elapsed: 5s)
5 (Time elapsed: 5s)
```
So, the primary reasons to use `imap`/`imap_unordered` over `map_async` are:

- Your iterable is large enough that converting it to a list would cost too much memory.
- You want to be able to start processing results before *all* of them are finished.
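The memory point can be seen with a generator input. This is a minimal sketch (the function `negate` and the generator `numbers` are just for illustration): `imap` pulls items from the generator lazily, whereas `map` would first materialize the whole thing into a list before chunking it:

```python
import multiprocessing

def negate(x):
    return -x

def numbers():
    # A generator: items are produced one at a time, never all in memory.
    for i in range(10):
        yield i

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        # imap consumes the generator lazily and preserves input order;
        # pool.map would have to build the full list first.
        results = list(pool.imap(negate, numbers()))
    print(results)  # → [0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
```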
The accepted answer states that for `imap_unordered` "results will be yielded as soon as they're ready", from which one might infer that results are returned in order of completion. But I just want to make it clear that this is not true in general. The documentation states that the results are returned in arbitrary order. Consider the following program, which uses a pool size of 4, an iterable size of 20, and a chunksize value of 5. The worker function sleeps a variable amount of time depending on its passed argument, which also ensures that no one process in the pool grabs all of the submitted tasks. Thus I expect each process in the pool to have 20 / 4 = 5 tasks to process:
```python
from multiprocessing import Pool
import time

def worker(x):
    print(f'x = {x}', flush=True)
    time.sleep(.1 * (20 - x))
    # return approximate completion time with passed argument:
    return time.time(), x

if __name__ == '__main__':
    pool = Pool(4)
    results = pool.imap_unordered(worker, range(20), chunksize=5)
    for t, x in results:
        print('result:', t, x)
```
Prints:

```
x = 0
x = 5
x = 10
x = 15
x = 16
x = 17
x = 11
x = 18
x = 19
x = 6
result: 1621512513.7737606 15
result: 1621512514.1747007 16
result: 1621512514.4758775 17
result: 1621512514.675989 18
result: 1621512514.7766125 19
x = 12
x = 1
x = 13
x = 7
x = 14
x = 2
result: 1621512514.2716103 10
result: 1621512515.1721854 11
result: 1621512515.9727488 12
result: 1621512516.6744206 13
result: 1621512517.276999 14
x = 8
x = 9
x = 3
result: 1621512514.7695887 5
result: 1621512516.170747 6
result: 1621512517.4713914 7
result: 1621512518.6734042 8
result: 1621512519.7743165 9
x = 4
result: 1621512515.268784 0
result: 1621512517.1698637 1
result: 1621512518.9698756 2
result: 1621512520.671273 3
result: 1621512522.2716706 4
```
You can plainly see that these results are not being yielded in completion order. For example, I was returned `1621512519.7743165 9` followed by `1621512515.268784 0`, which was returned by the worker function more than 4 seconds earlier than the previously returned result. However, if I change the chunksize value to 1, the printout becomes:
```
x = 0
x = 1
x = 2
x = 3
x = 4
result: 1621513028.888357 3
x = 5
result: 1621513028.9863524 2
x = 6
result: 1621513029.0838938 1
x = 7
result: 1621513029.1825204 0
x = 8
result: 1621513030.4842813 7
x = 9
result: 1621513030.4852195 6
x = 10
result: 1621513030.4872172 5
x = 11
result: 1621513030.4892178 4
x = 12
result: 1621513031.3908074 11
x = 13
result: 1621513031.4895358 10
x = 14
result: 1621513031.587289 9
x = 15
result: 1621513031.686152 8
x = 16
result: 1621513032.1877549 15
x = 17
result: 1621513032.1896958 14
x = 18
result: 1621513032.1923752 13
x = 19
result: 1621513032.1923752 12
result: 1621513032.2935638 19
result: 1621513032.3927407 18
result: 1621513032.4912949 17
result: 1621513032.5884912 16
```
This is in completion order. However, since the documentation makes no such claim, I hesitate to state that `imap_unordered` will always return results as they become available when a chunksize value of 1 is specified, although that appears to be the case based on this experiment.
Discussion
When a chunksize of 5 is specified, the 20 tasks are placed on a single input queue for the 4 processes in the pool to process in chunks of size 5. So a process that becomes idle will take the next chunk of 5 tasks off the queue and process each one in turn before becoming idle again. Thus the first process will be processing `x` arguments 0 through 4, the second process `x` arguments 5 through 9, etc. This is why you see the initial `x` values printed as 0, 5, 10 and 15.
But while the result for `x` argument 0 completes before the result for `x` argument 9, it would appear that results get written out together as chunks, and therefore the result for `x` argument 0 will not get returned until the results for the `x` arguments that were queued up in the same chunk (i.e. 1, 2, 3 and 4) are also available.
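As an aside on chunk sizes: `imap`/`imap_unordered` default to a chunksize of 1, while `map` computes one from the length of the iterable. The following is a sketch of the heuristic CPython's `Pool.map` currently uses (an implementation detail, not documented behavior, so it may change between versions): it aims for roughly 4 chunks per worker.

```python
def default_map_chunksize(n_items, n_workers):
    # Mirrors the heuristic in CPython's Pool.map (implementation
    # detail): divide the work into about 4 chunks per worker,
    # rounding the chunk size up if there is a remainder.
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize

print(default_map_chunksize(20, 4))  # → 2
```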