I'm working on some research code that uses scipy.optimize.leastsq to optimize a function. It does this about 18 times per iteration, so I would like to call leastsq in parallel to reduce running time. This shouldn't be a problem, because the optimizations are almost completely separate, so very little synchronization is required. I recently found out about multiprocessing.pool.ThreadPool, which would let me do this without having to explicitly set up shared memory (a pain, since most of my data are in NumPy arrays). So I made a slight rewrite of my code, hoping it would work, but it throws a strange error: SystemError: null argument to internal routine.
The following is a simplification of my code:
def optfunc(id):
    def errfunc(x):
        return somedata[id] - somefunc(x)
    lock.acquire()
    x0 = numpy.copy(currentx[id])
    lock.release()
    # leastsq returns a (solution, status flag) tuple; keep only the solution
    result, ier = scipy.optimize.leastsq(errfunc, x0)
    lock.acquire()
    currentx[id] = result
    lock.release()

ThreadPool(processes=8).map(optfunc, range(idcount))
This should work fine, unless scipy.optimize.leastsq isn't thread-safe. So I tried putting a lock around the call to scipy.optimize.leastsq, and lo and behold, it works. However, processor utilization is then stuck at 100% (a single core, i.e. the calls are effectively serialized), so this is useless to me.
My question is then: what can I do about this? Any help or suggestions would be greatly appreciated.
Using processes instead of threads will make a difference: it will work whether or not the routine is thread-safe. Of course, whether it is faster depends on whether the time spent solving each problem is larger than the inter-process overhead.

Using processes may require some additional hassle in setting up all the necessary data, since it must be passed between processes rather than shared. The multiprocessing module takes care of most of that work, however, so it shouldn't be too difficult to do.