
Which loop should I parallelize: the outer or the inner one?

I am writing an image processing filter, and I want to speed up the computations using OpenMP. The structure of my pseudo-code is as follows:

for(every pixel in the image){
    //do some stuff here
    for(any combination of parameters){
        //do other stuff here and filter
    }
}

The code filters every pixel with different parameter combinations and chooses the optimal one.

My question is which is faster: parallelizing the outer loop so the pixels are divided among the processors, or visiting the pixels sequentially and parallelizing the search over parameter combinations?

I think the question could be posed more generally: which is faster, giving each thread a large amount of work, or creating many threads that each do only a few operations?

I'm not concerned with the implementation details for now; I think I can handle them with my previous OpenMP experience. Thanks!

Anthony asked Dec 19 '22 18:12


1 Answer

Your goal is to distribute the data evenly over the available processor cores. Split the image (the outer loop) evenly, with one thread per core. Experiment with fine-grained and coarse-grained parallelism to see which gives the best results. Once the number of threads exceeds the number of available cores, you will start to see performance degradation.
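As a rough illustration of the outer-loop approach, here is a minimal sketch in C. The function filter_image, the helper cost, and the idea that each "parameter" is a single gain value are all assumptions made up for this example; they are not taken from the question. The only point is where the pragma sits: on the pixel loop, so each thread runs the whole parameter search serially for its own block of pixels.

```c
#include <assert.h>

/* Hypothetical per-pixel cost: distance of the scaled pixel from mid-gray.
   A stand-in for whatever criterion the real filter uses. */
static float cost(float pixel, float gain)
{
    float d = pixel * gain - 0.5f;
    return d < 0.0f ? -d : d;
}

/* For every pixel, try every gain and keep the one with the lowest cost.
   The signature and the flat pixel array are assumptions for this sketch. */
void filter_image(const float *in, float *out, int n_pixels,
                  const float *gains, int n_gains)
{
    assert(n_gains > 0);

    /* Coarse-grained: the OUTER loop is split among the threads.
       schedule(static) hands each thread one contiguous block of pixels.
       All variables declared inside the loop body are private per thread. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n_pixels; ++i) {
        float best_gain = gains[0];
        float best_cost = cost(in[i], gains[0]);

        /* Inner parameter search runs serially within each thread. */
        for (int k = 1; k < n_gains; ++k) {
            float c = cost(in[i], gains[k]);
            if (c < best_cost) {
                best_cost = c;
                best_gain = gains[k];
            }
        }
        out[i] = in[i] * best_gain;
    }
}
```

With GCC this would be built with -fopenmp; without that flag the pragma is ignored and the loop simply runs serially, which makes it easy to compare timings. Moving the pragma to the inner loop instead would start and join a parallel region once per pixel and would also need a reduction over the best cost, which is one reason the outer loop is usually the first thing to try.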

Eamonn McEvoy answered Jan 01 '23 12:01