How do I tell a multi-core / multi-CPU machine to process function calls in a loop in parallel?

I am currently designing an application that has one module which will load large amounts of data from a database and reduce it to a much smaller set by various calculations depending on the circumstances.

Many of the more intensive operations behave deterministically and would lend themselves to parallel processing.

Provided I have a loop that iterates over a large number of data chunks arriving from the db and calls a deterministic function without side effects for each one, how would I make it so that the program does not wait for the function to return but rather starts the next calls, so they can be processed in parallel? A naive approach to demonstrate the principle would do me for now.

I have read Google's MapReduce paper, and while I could use the overall principle in a number of places, I won't, for now, target large clusters; rather, it's going to be a single multi-core or multi-CPU machine for version 1.0. So currently, I'm not sure whether I can actually use the library or would have to roll a dumbed-down basic version myself.

I am at an early stage of the design process and so far I am targeting C-something (for the speed-critical bits) and Python (for the productivity-critical bits) as my languages. If there are compelling reasons, I might switch, but so far I am content with my choice.

Please note that I'm aware of the fact that it might take longer to retrieve the next chunk from the database than to process the current one, and the whole process would then be I/O-bound. I would, however, assume for now that it isn't, and in practice use a db cluster, memory caching or something else to avoid being I/O-bound at this point.

asked Sep 11 '08 by Hanno Fietz

2 Answers

Well, if .NET is an option, Microsoft has put a lot of effort into its parallel computing support.

answered by EBGreen

If you still plan on using Python, you might want to have a look at the processing package. It uses processes rather than threads for parallel computing (working around the Python GIL) and provides classes for distributing "work items" onto several processes. Using the Pool class, you can write code like the following:

import processing  # third-party predecessor of the standard-library multiprocessing module

def worker(i):
    return i * i

num_workers = 2
pool = processing.Pool(num_workers)        # start num_workers worker processes
result = pool.imap(worker, range(100000))  # lazily distribute the calls over the pool
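The object returned by imap is a lazy iterator, so the values can be consumed as they become available; a minimal sketch, assuming the same pool and worker as above:

for square in result:
    # each value arrives once the corresponding worker call has finished
    print(square)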

pool.imap is a parallel version of itertools.imap, which distributes the calls across the worker processes. You can also use the pool's apply_async method and store the lazy result objects in a list:

results = []
for i in range(10000):
    results.append(pool.apply_async(worker, (i,)))  # args must be passed as a tuple
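Each entry in results is an asynchronous result object; a minimal sketch of collecting the actual values, assuming the API matches today's multiprocessing module (which processing later became):

# get() blocks until the corresponding call has finished
values = [r.get() for r in results]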

For further reference, see the documentation of the Pool class.

Gotchas:

  • processing uses fork(), so you have to be careful on Win32
  • objects transferred between processes need to be pickleable
  • if the workers are relatively fast, you can tweak chunksize, i.e. the number of work items sent to a worker process in one batch (see the sketch after this list)
  • processing.Pool uses a background thread
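
A minimal sketch of tweaking chunksize, written against the multiprocessing module (the standard-library successor of processing), with the entry-point guard that matters on Win32; the worker and the chunk size of 256 are just illustrative:

from multiprocessing import Pool

def worker(i):
    return i * i

if __name__ == '__main__':  # required on Win32, where child processes re-import this module
    pool = Pool(2)
    # hand work items to the worker processes in batches of 256
    # to reduce inter-process communication overhead
    for square in pool.imap(worker, range(100000), chunksize=256):
        pass
    pool.close()
    pool.join()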
answered by Torsten Marek