
Multiprocessing incompatible with NumPy [duplicate]

I am trying to run a simple test using multiprocessing. The test works well until I import numpy (even though it is not used in the program). Here is the code:

from multiprocessing import Pool
import time
import numpy as np #this is the problematic line


def CostlyFunc(N):
    """Run an N x N busy loop and return the accumulated value."""
    tstart = time.time()
    x = 0
    for i in xrange(N):
        for j in xrange(N):
            if i % 2: x += 2
            else: x -= 2       
    print "CostlyFunc : elapsed time %f s" % (time.time() - tstart)
    return x

#serial application
ResultList0 = []
StartTime = time.time()
for i in xrange(3):
    ResultList0.append(CostlyFunc(5000))
print "Elapsed time (serial) : ", time.time() - StartTime


#multiprocessing application
StartTime = time.time()
pool = Pool()
asyncResult = pool.map_async(CostlyFunc, [5000, 5000, 5000])
ResultList1 = asyncResult.get()
print "Elapsed time (multiporcessing) : ", time.time() - StartTime

If I don't import numpy the result is:

CostlyFunc : elapsed time 2.866265 s
CostlyFunc : elapsed time 2.793213 s
CostlyFunc : elapsed time 2.794936 s
Elapsed time (serial) :  8.45455098152
CostlyFunc : elapsed time 2.889815 s
CostlyFunc : elapsed time 2.891556 s
CostlyFunc : elapsed time 2.898898 s
Elapsed time (multiporcessing) :  2.91595196724

The total elapsed time is similar to the time required for one process, meaning that the computation has been parallelized. If I do import numpy, the result becomes:

CostlyFunc : elapsed time 2.877116 s
CostlyFunc : elapsed time 2.866778 s
CostlyFunc : elapsed time 2.860894 s
Elapsed time (serial) :  8.60492110252
CostlyFunc : elapsed time 8.450145 s
CostlyFunc : elapsed time 8.473006 s
CostlyFunc : elapsed time 8.506402 s
Elapsed time (multiporcessing) :  8.55398178101

The total time elapsed is the same for both serial and multiprocessing methods because only one core is used. It is clear that the problem comes from numpy. Is it possible that I have an incompatibility between my versions of multiprocessing and NumPy?

I am currently using Python 2.7, NumPy 1.6.2 and multiprocessing 0.70a1 on Linux.
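To put a number on the difference, here is a minimal sketch that computes the speedup (serial time divided by multiprocessing time) from the timings reported above. The helper name `speedup` is hypothetical, introduced only for illustration.

```python
# Timings copied from the two runs reported above (seconds).
serial_plain, parallel_plain = 8.45455098152, 2.91595196724  # numpy not imported
serial_numpy, parallel_numpy = 8.60492110252, 8.55398178101  # numpy imported


def speedup(serial_seconds, parallel_seconds):
    """Ratio of serial to parallel wall-clock time (hypothetical helper)."""
    return serial_seconds / parallel_seconds


print("speedup without numpy: %.2f" % speedup(serial_plain, parallel_plain))
print("speedup with numpy:    %.2f" % speedup(serial_numpy, parallel_numpy))
```

With three tasks the best possible speedup is about 3. The run without numpy comes close to that, while the run with numpy stays near 1, which is what one would expect if all three workers share a single core.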

user2660966 asked Aug 07 '13 14:08


2 Answers

(First post, sorry if it is not well formulated or aligned.)

You can stop NumPy from using multithreading by setting the MKL_NUM_THREADS environment variable to 1.

Under Debian I used:

export MKL_NUM_THREADS=1

Source, from a related Stack Overflow post: "Python: How do you stop numpy from multithreading?"
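The same setting can probably be applied from inside the script instead of the shell. This is a sketch under two assumptions that are not confirmed above: that the variable has to be set before numpy is first imported in the process, and that the NumPy build is linked against MKL (other BLAS backends read different variables, such as OMP_NUM_THREADS or OPENBLAS_NUM_THREADS).

```python
import os

# Assumption: this must run before numpy (and therefore MKL) is first
# imported in this process; setting it afterwards may have no effect.
os.environ["MKL_NUM_THREADS"] = "1"

try:
    import numpy as np  # imported only after the variable is set
except ImportError:
    np = None  # numpy is not installed; the variable is still set

print("MKL_NUM_THREADS = " + os.environ["MKL_NUM_THREADS"])
```

Worker processes started by multiprocessing inherit the parent's environment, so the setting should carry over to the pool as well.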

Result:

user@pc:~/tmp$ python multi.py
CostlyFunc : elapsed time 3.847009 s
CostlyFunc : elapsed time 3.253226 s
CostlyFunc : elapsed time 3.415734 s
Elapsed time (serial) :  10.5163660049
CostlyFunc : elapsed time 4.218424 s
CostlyFunc : elapsed time 5.252429 s
CostlyFunc : elapsed time 4.862513 s
Elapsed time (multiporcessing) :  9.11713695526

user@pc:~/tmp$ export MKL_NUM_THREADS=1

user@pc:~/tmp$ python multi.py
CostlyFunc : elapsed time 3.014677 s
CostlyFunc : elapsed time 3.102548 s
CostlyFunc : elapsed time 3.060915 s
Elapsed time (serial) :  9.17840886116
CostlyFunc : elapsed time 3.720322 s
CostlyFunc : elapsed time 3.950583 s
CostlyFunc : elapsed time 3.656165 s
Elapsed time (multiporcessing) :  7.399310112

I am not sure if that helps, because I guess eventually you will want numpy to run in parallel. Maybe try adjusting the number of numpy threads to suit your machine.
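One possible way to pick that number, as a rough sketch: divide the available cores among the worker processes so that the total thread count does not exceed the core count. The function name `threads_per_worker` and the even-split rule are assumptions for illustration, not something measured here.

```python
import multiprocessing


def threads_per_worker(n_workers, n_cores=None):
    """Split the cores evenly among workers, with at least one thread each.

    Hypothetical helper: n_cores defaults to the machine's core count.
    """
    if n_cores is None:
        n_cores = multiprocessing.cpu_count()
    return max(1, n_cores // n_workers)


# Example: 3 worker processes on an 8-core machine.
print(threads_per_worker(3, n_cores=8))
```

The result could then be used as the value for MKL_NUM_THREADS before starting the pool.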

addy answered Nov 10 '22 12:11


From the comments on your question, have a look at this link: "@Ophion no, but I have flagged it as a duplicate of Why does multiprocessing use only a single core after I import numpy?" – ali_m Aug 22 at 9:06
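As I understand it, the linked question traces the symptom to the process's CPU affinity being changed when numpy is imported; treat that as an assumption to verify on your own machine. A minimal check could look like the sketch below. It relies on `os.sched_getaffinity`, which exists only on Linux with Python 3.3 or later, so it will not run unchanged on the Python 2.7 setup from the question.

```python
import os


def allowed_cores():
    """Return the set of cores this process may run on, or None if the
    platform does not provide os.sched_getaffinity."""
    if hasattr(os, "sched_getaffinity"):
        return os.sched_getaffinity(0)  # 0 means "the current process"
    return None


before = allowed_cores()
try:
    import numpy  # the import under suspicion
except ImportError:
    numpy = None  # numpy is not installed; nothing to compare
after = allowed_cores()

print("cores allowed before import:", before)
print("cores allowed after import: ", after)
```

If the second set is smaller than the first, every worker forked afterwards would inherit the restricted affinity, which would match the single-core behaviour described in the question.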

I would check to see if you are using an optimized version of BLAS. I have found that some generic installs of numpy do not deliver an optimized version of this library. From my install you can see that it points to libf77blas.so, libcblas.so and libatlas.so.

Here are the instructions to build an optimized version of BLAS: http://docs.scipy.org/doc/numpy/user/install.html

From within Python:

>>> import numpy.core._dotblas
>>> numpy.core._dotblas.__file__
'PYTHONHOME/lib/python2.7/site-packages/numpy/core/_dotblas.so'

From your terminal:

$ ldd 'PYTHONHOME/lib/python2.7/site-packages/numpy/core/_dotblas.so'
linux-vdso.so.1 =>  (0x00007fff241ff000)
libf77blas.so => /opt/arch/intel/lib/libf77blas.so (0x00007f6050647000)
libcblas.so => /opt/arch/intel/lib/libcblas.so (0x00007f6050429000)
libatlas.so => /opt/arch/intel/lib/libatlas.so (0x00007f604fbf1000)
libpython2.7.so.1.0 => PYTHONHOME/lib/libpython2.7.so.1.0 (0x00007f604f817000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f604f5f9000)
libc.so.6 => /lib64/libc.so.6 (0x00007f604f266000)
libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00007f604ef74000)
libm.so.6 => /lib64/libm.so.6 (0x00007f604ecef000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f604eaeb000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007f604e8e8000)
/lib64/ld-linux-x86-64.so.2 (0x0000003c75e00000)
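On installs where the `numpy.core._dotblas` module is not available (it may be missing in newer NumPy releases), a similar check can be sketched with `numpy.show_config()`, which prints the build configuration including the BLAS/LAPACK libraries. The helper name `blas_summary` is hypothetical, and the snippet is written for Python 3.

```python
import contextlib
import io


def blas_summary():
    """Return numpy's build configuration as text, or None without numpy."""
    try:
        import numpy
    except ImportError:
        return None
    buffer = io.StringIO()
    # show_config() prints to stdout, so capture it instead of returning it.
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


summary = blas_summary()
print(summary if summary is not None else "numpy is not installed")
```

Look for names such as atlas, mkl or openblas in the output to see which BLAS implementation the build refers to.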

onzyone answered Nov 10 '22 10:11