Run separate processes in parallel - Python

I use the Python multiprocessing module to run single processes on multiple cores, but I want to run a couple of independent processes in parallel.

For example, Process-one parses large files, Process-two finds patterns in different files, and Process-three does some calculation; can all three of these different processes, each with its own set of arguments, be run in parallel?

import sys

def Process1(largefile):
    # parse large file; runtime ~2 hrs
    return parsed_file

def Process2(bigfile):
    # find pattern in big file; runtime ~2.5 hrs
    return pattern

def Process3(integer):
    # do astronomical calculation; runtime ~2.25 hrs
    return calculation_results

def FinalProcess(parsed, pattern, calc_results):
    # do analysis; runtime ~10 min
    return final_results

def main():
    parsed = Process1(largefile)
    pattern = Process2(bigfile)
    calc_res = Process3(integer)
    final = FinalProcess(parsed, pattern, calc_res)

if __name__ == '__main__':
    main()
    sys.exit()

In the above pseudo-code, Process1, Process2 and Process3 are single-core processes, i.e. none of them can run on multiple processors by itself. Run sequentially, they take 2 + 2.5 + 2.25 hrs = 6.75 hrs. Is it possible to run these three processes in parallel, so that they run at the same time on different processors/cores, and once the most time-consuming one (Process2) finishes we move on to FinalProcess?

asked Sep 29 '13 by Bade
People also ask

How do you run the same method parallel in Python?

We can also run the same function in parallel with different parameters using the Pool class. For parallel mapping, we first initialize a multiprocessing.Pool() object. The first argument is the number of worker processes; if it is not given, the pool uses the number of CPU cores in the system.
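As a minimal sketch of this (the worker count of 4 and the square function are illustrative choices, not from the question):

from multiprocessing import Pool

def square(x):
    # CPU-bound work executed in a worker process
    return x * x

if __name__ == '__main__':
    # 4 worker processes; Pool() with no argument uses the CPU count
    with Pool(processes=4) as pool:
        # map blocks until every input has been processed
        print(pool.map(square, range(10)))  # [0, 1, 4, ..., 81]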

Does Python support parallelism?

Python provides mechanisms for both concurrency and parallelism, each with its own syntax and use cases. Python has two different mechanisms for implementing concurrency, although they share many common components. These are threading and coroutines, or async.
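As a rough illustration of the two mechanisms (a hypothetical example, not from the question or answer):

import threading
import asyncio

def blocking_task(name):
    # an ordinary function run on its own OS thread
    print(name, 'running in a thread')

async def coro_task(name):
    # a coroutine scheduled cooperatively on one thread
    print(name, 'running as a coroutine')

if __name__ == '__main__':
    # threading: the OS switches between threads preemptively
    t = threading.Thread(target=blocking_task, args=('task-1',))
    t.start()
    t.join()

    # async: tasks yield control explicitly at await points
    asyncio.run(coro_task('task-2'))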


1 Answer

From 16.6.1.5. Using a pool of workers:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)               # start 4 worker processes
    result = pool.apply_async(f, [10])     # evaluate "f(10)" asynchronously
    print(result.get(timeout=1))           # prints "100" unless your computer is *very* slow
    print(pool.map(f, range(10)))          # prints "[0, 1, 4,..., 81]"

You can, therefore, apply_async against a pool and get your results after everything is ready.

from multiprocessing import Pool

# all your methods declarations above go here
# (...)

def main():
    # three workers, one per independent task
    pool = Pool(processes=3)
    parsed = pool.apply_async(Process1, [largefile])
    pattern = pool.apply_async(Process2, [bigfile])
    calc_res = pool.apply_async(Process3, [integer])

    pool.close()  # no more tasks will be submitted to the pool
    pool.join()   # wait for all three workers to finish

    # .get() returns each task's result (or re-raises its exception)
    final = FinalProcess(parsed.get(), pattern.get(), calc_res.get())

# your __main__ handler goes here
# (...)
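Note that each .get() blocks until its result is ready, so the pool.close()/pool.join() pair is optional here: the .get() calls alone would also wait for completion. Either way, the wall-clock time drops from the sequential 6.75 hrs to roughly the longest single task (the ~2.5 hrs of Process2) plus the 10-minute FinalProcess.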
answered Oct 27 '22 by planestepper