 

Python Multiprocessing: What's the difference between map and imap?

I'm trying to learn how to use Python's multiprocessing package, but I don't understand the difference between map and imap.

Is the difference that map returns, say, an actual array or set, while imap returns an iterator over an array or set? When would I use one over the other?

Also, I don't understand what the chunksize argument is. Is this the number of values that are passed to each process?

asked Jul 05 '12 04:07 by grautur





2 Answers

Yes, that is the difference: map returns a list, while imap returns an iterator over the results. One reason to use imap instead of map is that you can start processing the first few results without waiting for the rest to be calculated. map blocks until every result is ready before returning.
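To make the distinction concrete, here is a minimal sketch. The worker function square and the helper compare_map_and_imap are made-up names for illustration, and the pool size of 2 is arbitrary.

```python
import multiprocessing as mp


def square(n):
    """Worker function; defined at module level so it can be pickled."""
    return n * n


def compare_map_and_imap(values):
    """Hypothetical helper: run the same work through map and imap."""
    with mp.Pool(processes=2) as pool:
        # map blocks until every result is ready, then returns a list.
        map_result = pool.map(square, values)

        # imap returns right away with an iterator; results are yielded
        # in input order as they become available.
        imap_result = pool.imap(square, values)
        first = next(imap_result)   # usable before the rest are finished
        rest = list(imap_result)    # drain the remainder while the pool is alive
    return map_result, first, rest


if __name__ == "__main__":
    as_list, first, rest = compare_map_and_imap(range(5))
    print(type(as_list).__name__, as_list)
    print(first, rest)
```

The imap iterator is consumed inside the with block because leaving the block terminates the pool.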

As for chunksize, it controls how many items are handed to a worker at a time. It is sometimes more efficient to dole out work in larger quantities, because every time a worker requests more work there is IPC and synchronization overhead.
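One way to observe this batching is to record which worker process handled each item. This is only a sketch: tag_with_pid and run_with_chunksize are hypothetical names, and the pool size and chunk size are arbitrary.

```python
import multiprocessing as mp
import os


def tag_with_pid(n):
    """Return the input together with the ID of the worker that handled it."""
    return n, os.getpid()


def run_with_chunksize(values, chunksize):
    """Hypothetical helper: map over values, handing out chunksize items at a time."""
    with mp.Pool(processes=2) as pool:
        return pool.map(tag_with_pid, values, chunksize=chunksize)


if __name__ == "__main__":
    # 8 items with chunksize=4 gives two tasks of four items each, so
    # every item within a group of four is handled by the same worker.
    for n, pid in run_with_chunksize(range(8), chunksize=4):
        print(n, pid)
```

With chunksize=1 the same call would create eight separate tasks, which means more round trips between the parent and the workers for the same amount of computation.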

answered Sep 21 '22 14:09 by Antimony



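The itertools module also provides an imap function (Python 2.7 only), which is designed for speed and memory efficiency. map returns a list, whereas imap returns an object that generates each value as it is iterated over. Note that itertools.imap is a different function from multiprocessing.Pool.imap, although the list-versus-iterator distinction is the same. The code blocks below show the difference.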

map returns a list, which can be printed directly:

    # Python 2.7
    from math import sqrt

    integers = [1, 2, 3, 4, 5]
    sqr_ints = map(sqrt, integers)  # a list in Python 2
    print(sqr_ints)

imap returns an object, which is converted to a list and then printed:

    # Python 2.7 (itertools.imap does not exist in Python 3)
    from itertools import imap
    from math import sqrt

    integers = [1, 2, 3, 4, 5]
    sqr_ints = imap(sqrt, integers)  # a lazy iterator, not a list
    print(list(sqr_ints))
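For readers on Python 3, where itertools.imap was removed and the built-in map is itself lazy, a rough equivalent might look like the following. The helper name sqrt_both_ways is made up for illustration.

```python
from math import sqrt


def sqrt_both_ways(integers):
    """Hypothetical helper: return (lazy map object, eager list)."""
    lazy = map(sqrt, integers)         # Python 3: an iterator, like Python 2's imap
    eager = list(map(sqrt, integers))  # forcing the iterator gives a list
    return lazy, eager


if __name__ == "__main__":
    lazy, eager = sqrt_both_ways([1, 4, 9])
    print(eager)       # a plain list
    print(next(lazy))  # values are produced one at a time, on demand
```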

chunksize splits the iterable into pieces of approximately the specified size, and each piece is submitted to the pool as a separate task.
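The splitting itself can be pictured with a few lines of plain Python. This is an illustration of the idea only, not the library's own code, and split_into_chunks is a hypothetical name. It also shows why the size is approximate: the last piece may be shorter.

```python
from itertools import islice


def split_into_chunks(iterable, chunksize):
    """Hypothetical helper: yield successive tuples of up to chunksize items."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, chunksize))
        if not chunk:
            return  # the iterable is exhausted
        yield chunk


if __name__ == "__main__":
    # Ten items in chunks of four: two full chunks and one short one.
    for chunk in split_into_chunks(range(10), 4):
        print(chunk)
```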

answered Sep 21 '22 14:09 by Chandan
