 

Parallelizing four nested loops in Python

I have a fairly straightforward nested for loop that iterates over four arrays:

for a in a_grid:
    for b in b_grid:
        for c in c_grid:
            for d in d_grid:
                do_some_stuff(a,b,c,d)  # perform calculations and write to file

Maybe this isn't the most efficient way to perform calculations over a 4D grid to begin with. I know joblib is capable of parallelizing two nested for loops like this, but I'm having trouble generalizing it to four nested loops. Any ideas?

asked Feb 02 '17 by ylangylang


3 Answers

I usually use code of this form:

#!/usr/bin/env python3
import itertools
import multiprocessing

# Generate the values for each parameter
a = range(10)
b = range(10)
c = range(10)
d = range(10)

# Generate a list of tuples where each tuple is one combination of parameters.
# The list will contain all possible combinations of parameters.
paramlist = list(itertools.product(a, b, c, d))

# A function which will process one tuple of parameters
def func(params):
    a, b, c, d = params
    return a * b * c * d

if __name__ == "__main__":
    # Generate worker processes equal to the number of cores. The
    # __main__ guard is needed on platforms that use the "spawn" start
    # method (Windows, macOS), where workers re-import this module.
    with multiprocessing.Pool() as pool:
        # Distribute the parameter sets evenly across the cores
        res = pool.map(func, paramlist)
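
If you would rather keep a worker that takes the four parameters as separate arguments (like do_some_stuff in the question) instead of unpacking a tuple inside the function, Pool.starmap (Python 3.3+) does the unpacking for you. A minimal variation of the snippet above:

def func(a, b, c, d):
    return a * b * c * d

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        # starmap unpacks each (a, b, c, d) tuple into separate arguments
        res = pool.starmap(func, paramlist)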
answered Oct 18 '22 by Richard


If you use a tool that makes it easy to parallelize two nested loops, but not four, you can use itertools.product to reduce four nested for loops into two:

from itertools import product

for a, b in product(a_grid, b_grid):
    for c, d in product(c_grid, d_grid):
        do_some_stuff(a, b, c, d)
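
You can also collapse all four loops into a single one by passing all four grids to one product call; the resulting flat stream of (a, b, c, d) tuples is then trivial to hand to any parallel map. A sketch, assuming the a_grid through d_grid arrays and do_some_stuff from the question:

from itertools import product

# One product over all four grids yields every (a, b, c, d) combination,
# so a single loop (or a single parallel map) covers the whole 4D grid.
for a, b, c, d in product(a_grid, b_grid, c_grid, d_grid):
    do_some_stuff(a, b, c, d)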
answered Oct 18 '22 by user4815162342


The number of jobs is not related to the number of nested loops. In the joblib example you are referring to, it happened to use n_jobs=2 with 2 nested loops, but the two numbers are completely unrelated.

Think of it this way: You have a bunch of function calls to make; in your case (unrolling the loops):

do_some_stuff(0,0,0,0)
do_some_stuff(0,0,0,1)
do_some_stuff(0,0,0,2)
do_some_stuff(0,0,1,0)
do_some_stuff(0,0,1,1)
do_some_stuff(0,0,1,2)
...

and you want to distribute those function calls across some number of jobs. You could use 2 jobs, or 10, or 100; it doesn't matter. joblib's Parallel takes care of distributing the work for you.
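
For example, with joblib (which the question mentions) you can generate one delayed call per parameter combination and let Parallel spread them over however many jobs you choose. A sketch, assuming the grids and do_some_stuff from the question:

from itertools import product
from joblib import Parallel, delayed

# One delayed call per (a, b, c, d) combination; n_jobs=-1 uses all
# available cores, but any job count works.
results = Parallel(n_jobs=-1)(
    delayed(do_some_stuff)(a, b, c, d)
    for a, b, c, d in product(a_grid, b_grid, c_grid, d_grid)
)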

answered Oct 18 '22 by jwd