
Implement Parallel for loops in Python

I have a Python program which looks like this:

total_error = []
for i in range(24):
    error = some_function_call(parameters1, parameters2)
    total_error += error

The function 'some_function_call' takes a lot of time, and I can't find an easy way to reduce the time complexity of the function itself. Is there a way to still reduce the execution time by performing the tasks in parallel and later adding the results up in total_error? I tried using pool and joblib but could not successfully use either.

asked Jan 03 '18 by thechargedneutron

People also ask

How do you process a parallel loop in Python?

Parallel for-loop with map(): first, create a multiprocessing pool, which by default is configured to use all available CPU cores. Then call the map() function as usual and iterate over the result values; the only difference is that map() is now a method on the multiprocessing pool.
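
A minimal sketch of that pattern, assuming a hypothetical work function f that takes one argument:

from multiprocessing import Pool

def f(x):
    # hypothetical stand-in for the expensive work
    return x * x

if __name__ == "__main__":
    with Pool() as pool:  # defaults to all available CPU cores
        results = pool.map(f, range(24))  # one parallel iteration per input
    print(sum(results))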

Can FOR loops be parallel?

Can any for loop be made parallel? No, not every loop can be made parallel. The iterations of the loop must be independent of each other: one CPU core should be able to run one iteration without any side effects on another CPU core running a different iteration.
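
For example (with a hypothetical function f), the first loop below has independent iterations and can be parallelized, while the second cannot, because each iteration reads the previous result:

# independent: each iteration depends only on i
results = [f(i) for i in range(24)]

# dependent: each iteration needs the previous value, so it must run serially
value = 0
for i in range(24):
    value = f(value)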

Is parallel programming possible in Python?

There are several common ways to parallelize Python code. You can launch several instances of an application or script to perform jobs in parallel. This approach works well when you don't need to exchange data between the parallel jobs.
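
A minimal sketch of that approach, assuming a hypothetical worker script worker.py that takes a job index as its argument:

import subprocess

# launch 24 independent script instances, then wait for all of them
procs = [subprocess.Popen(["python", "worker.py", str(i)]) for i in range(24)]
for p in procs:
    p.wait()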


2 Answers

You can use Python's multiprocessing module:

from multiprocessing import Pool, freeze_support, cpu_count
import os

def wrapped_some_function_call(args):
    """
    We need to wrap the call to unpack the parameters
    we built before as a tuple, so we can use pool.map.
    """
    return some_function_call(*args)

if __name__ == "__main__":
    # call freeze_support() if on Windows
    if os.name == "nt":
        freeze_support()

    all_args = [(parameters1, parameters2) for i in range(24)]

    # you can use any worker count, but your machine's core count is usually a good choice (although maybe not the best)
    with Pool(cpu_count()) as pool:
        results = pool.map(wrapped_some_function_call, all_args)

    total_error = sum(results)
answered Sep 16 '22 by Netwave


You can also use concurrent.futures in Python 3, which provides a simpler interface than multiprocessing. See this for more details about the differences.

from concurrent import futures

total_error = 0

with futures.ProcessPoolExecutor() as pool:
    for error in pool.map(some_function_call, parameters1, parameters2):
        total_error += error

In this case, parameters1 and parameters2 should each be a list or iterable whose length equals the number of times you want to run the function (24 in your example).
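
For instance, if every call should use the same two values, you could build the iterables yourself (a sketch, assuming parameters1 and parameters2 are single values as in your original code):

args1 = [parameters1] * 24  # repeat the same first argument 24 times
args2 = [parameters2] * 24  # repeat the same second argument 24 times

with futures.ProcessPoolExecutor() as pool:
    total_error = sum(pool.map(some_function_call, args1, args2))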

If parameters1 and parameters2 are not iterables/mappable, and you just want to run the function 24 times, you can submit the job the required number of times and accumulate the results later using a callback.

class TotalError:
    def __init__(self):
        self.value = 0

    def __call__(self, r):
        # called when a future finishes; accumulate its result
        self.value += r.result()

total_error = TotalError()
with futures.ProcessPoolExecutor() as pool:
    for i in range(24):
        future_result = pool.submit(some_function_call, parameters1, parameters2)
        future_result.add_done_callback(total_error)

print(total_error.value)
answered Sep 18 '22 by Gerges