Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Automatic parallelization

What is your opinion regarding a project that will try to take a code and split it to threads automatically(maybe compile time, probably in runtime).

Take a look at the code below:

for(int i=0;i<100;i++)
   sum1 += rand(100)
for(int j=0;j<100;j++)
   sum2 += rand(100)/2

This kind of code can automatically get split to 2 different threads that run in parallel. Do you think it's even possible? I have a feeling that theoretically it's impossible (it reminds me the halting problem) but I can't justify this thought.

Do you think it's a useful project? is there anything like it?

like image 208
DuduAlul Avatar asked Jul 24 '10 20:07

DuduAlul


2 Answers

This is called automatic parallelization. If you're looking for some program you can use that does this for you, it doesn't exist yet. But it may eventually. This is a hard problem and is an area of active research. If you're still curious...

It's possible to automatically split your example into multiple threads, but not in the way you're thinking. Some current techniques try to run each iteration of a for-loop in its own thread. One thread would get the even indicies (i=0, i=2, ...), the other would get the odd indices (i=1, i=3, ...). Once that for-loop is done, the next one could be started. Other techniques might get crazier, executing the i++ increment in one thread and the rand() on a separate thread.

As others have pointed out, there is a true dependency between iterations because rand() has internal state. That doesn't stand in the way of parallelization by itself. The compiler can recognize the memory dependency, and the modified state of rand() can be forwarded from one thread to the other. But it probably does limit you to only a few parallel threads. Without dependencies, you could run this on as many cores as you had available.

If you're truly interested in this topic and don't mind sifting through research papers:

  1. Automatic thread extraction with decoupled software pipelining (2005) by G. Ottoni.
  2. Speculative parallelization using software multi-threaded transactions (2010) by A. Raman.
like image 102
Karmastan Avatar answered Oct 16 '22 07:10

Karmastan


This is practically not possible.

The problem is that you need to know, in advance, a lot more information than is readily available to the compiler, or even the runtime, in order to parallelize effectively.

While it would be possible to parallelize very simple loops, even then, there's a risk involved. For example, your above code could only be parallelized if rand() is thread-safe - and many random number generation routines are not. (Java's Math.random() is synchronized for you - however.)

Trying to do this type of automatic parallelization is, at least at this point, not practical for any "real" application.

like image 29
Reed Copsey Avatar answered Oct 16 '22 06:10

Reed Copsey