Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parfor for Python

I am looking for a definitive answer to MATLAB's parfor for Python (Scipy, Numpy).

Is there a solution similar to parfor? If not, what is the complication for creating one?

UPDATE: Here is a typical numerical computation code that I need speeding up

import numpy as np  N = 2000 output = np.zeros([N,N]) for i in range(N):     for j in range(N):         output[i,j] = HeavyComputationThatIsThreadSafe(i,j) 

An example of a heavy computation function is:

import scipy.optimize  def HeavyComputationThatIsThreadSafe(i,j):     n = i * j      return scipy.optimize.anneal(lambda x: np.sum((x-np.arange(n)**2)), np.random.random((n,1)))[0][0,0] 
like image 753
Dat Chu Avatar asked Jan 13 '11 16:01

Dat Chu


People also ask

Does Python have Parfor?

Used to parallelize for-loops using parfor in Matlab? This package allows you to do the same in python. Take any normal serial but parallelizable for-loop and execute it in parallel using easy syntax.

How do you parallelize a code in Python?

First, we create two Process objects and assign them the function they will execute when they start running, also known as the target function. Second, we tell the processes to go ahead and run their tasks. And third, we wait for the processes to finish running, then continue with our program.


2 Answers

The one built-in to python would be multiprocessing docs are here. I always use multiprocessing.Pool with as many workers as processors. Then whenever I need to do a for-loop like structure I use Pool.imap

As long as the body of your function does not depend on any previous iteration then you should have near linear speed-up. This also requires that your inputs and outputs are pickle-able but this is pretty easy to ensure for standard types.

UPDATE: Some code for your updated function just to show how easy it is:

from multiprocessing import Pool from itertools import product  output = np.zeros((N,N)) pool = Pool() #defaults to number of available CPU's chunksize = 20 #this may take some guessing ... take a look at the docs to decide for ind, res in enumerate(pool.imap(Fun, product(xrange(N), xrange(N))), chunksize):     output.flat[ind] = res 
like image 156
JudoWill Avatar answered Sep 23 '22 02:09

JudoWill


There are many Python frameworks for parallel computing. The one I happen to like most is IPython, but I don't know too much about any of the others. In IPython, one analogue to parfor would be client.MultiEngineClient.map() or some of the other constructs in the documentation on quick and easy parallelism.

like image 45
Sven Marnach Avatar answered Sep 22 '22 02:09

Sven Marnach