 

Returning multiple lists from pool.map processes?

Win 7, x64, Python 2.7.12

In the following code I am setting off some pool processes to do a trivial multiplication via the multiprocessing.Pool.map() method. The output data is collected in List_1.

NOTE: this is a stripped down simplification of my actual code. There are multiple lists involved in the real application, all huge.

import multiprocessing
import numpy as np

def createLists(branches):

    firstList = branches[:] * node

    return firstList


def init_process(lNodes):

    global node
    node = lNodes
    print 'Starting', multiprocessing.current_process().name


if __name__ == '__main__':

    mgr = multiprocessing.Manager()
    nodes = mgr.list()
    pool_size = multiprocessing.cpu_count()

    branches = [i for i in range(1, 21)]
    lNodes = 10
    splitBranches = np.array_split(branches, int(len(branches)/pool_size))

    pool = multiprocessing.Pool(processes=pool_size, initializer=init_process, initargs=[lNodes])
    myList_1 = pool.map(createLists, splitBranches)

    pool.close() 
    pool.join()  

I now add an extra calculation to createLists() and try to pass back both lists.

import multiprocessing
import numpy as np

def createLists(branches):

    firstList = branches[:] * node
    secondList = branches[:] * node * 2

    return firstList, secondList


def init_process(lNodes):
    global node
    node = lNodes
    print 'Starting', multiprocessing.current_process().name


if __name__ == '__main__':

    mgr = multiprocessing.Manager()
    nodes = mgr.list()
    pool_size = multiprocessing.cpu_count()

    branches = [i for i in range(1, 21)]
    lNodes = 10
    splitBranches = np.array_split(branches, int(len(branches)/pool_size))

    pool = multiprocessing.Pool(processes=pool_size, initializer=init_process, initargs=[lNodes])
    myList_1, myList_2 = pool.map(createLists, splitBranches)

    pool.close() 
    pool.join() 

This raises the following error and traceback:

Traceback (most recent call last):

  File "<ipython-input-6-ff188034c708>", line 1, in <module>
    runfile('C:/Users/nr16508/Local Documents/Inter Trab Angle/Parallel/scratchpad.py', wdir='C:/Users/nr16508/Local Documents/Inter Trab Angle/Parallel')

  File "C:\Users\nr16508\AppData\Local\Continuum\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)

  File "C:\Users\nr16508\AppData\Local\Continuum\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "C:/Users/nr16508/Local Documents/Inter Trab Angle/Parallel/scratchpad.py", line 36, in <module>
    myList_1, myList_2 = pool.map(createLists, splitBranches)

ValueError: too many values to unpack
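The cause of the ValueError: pool.map returns one result per input chunk, so the left-hand side would only unpack cleanly if there happened to be exactly two chunks. A minimal sketch (plain map stands in for pool.map; work is a hypothetical stand-in for createLists with the multiplier hard-coded):

```python
def work(chunk):
    # stand-in for createLists: returns a (firstList, secondList) pair per chunk
    return [x * 10 for x in chunk], [x * 20 for x in chunk]

chunks = [[1, 2], [3, 4], [5, 6]]   # three chunks, as array_split might produce
results = list(map(work, chunks))   # one (firstList, secondList) pair per chunk
print(len(results))                 # 3 results, not 2, so
                                    # `myList_1, myList_2 = results` raises ValueError
```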

When I tried to put both lists into one to pass back, i.e...

return [firstList, secondList]
......
myList = pool.map(createLists, splitBranches)

...the output becomes too jumbled for further processing.
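The "jumbled" output is actually pool.map's normal shape: it returns one (firstList, secondList) entry per chunk, so the results come back grouped per chunk rather than as two flat lists. A sketch of that shape (plain map stands in for pool.map, multiplier hard-coded):

```python
def createLists(chunk):
    node = 10
    return [x * node for x in chunk], [x * node * 2 for x in chunk]

chunks = [[1, 2], [3, 4]]
myList = list(map(createLists, chunks))   # stands in for pool.map
# myList == [([10, 20], [20, 40]), ([30, 40], [60, 80])]
# i.e. one (firstList, secondList) pair per chunk, not two flat lists
```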

Is there a method of collecting more than one list from pooled processes?

DrBwts asked Oct 06 '16 10:10

1 Answer

This question has nothing to do with multiprocessing or thread pooling. It is simply about how to unzip lists, which can be done with the standard zip(*...) idiom.

myList_1, myList_2 = zip(*pool.map(createLists, splitBranches))
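A minimal sketch of why this works, with plain map standing in for pool.map and the node multiplier hard-coded to 10: zip(*...) transposes the per-chunk (firstList, secondList) pairs into two tuples, one per output list.

```python
def createLists(chunk):
    node = 10
    return [x * node for x in chunk], [x * node * 2 for x in chunk]

chunks = [[1, 2], [3, 4]]
results = list(map(createLists, chunks))   # one (first, second) pair per chunk
myList_1, myList_2 = zip(*results)         # transpose into two groups
print(myList_1)   # ([10, 20], [30, 40])
print(myList_2)   # ([20, 40], [60, 80])
```

Note that zip returns tuples of lists; wrap them in list(...) if the downstream code needs mutable lists.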
donkopotamus answered Nov 15 '22 21:11