Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Openmp thread divergence?

The term thread divergence is used in CUDA; from my understanding it's a situation where different threads are assigned to do different tasks and this results in a big performance hit.

I was wondering, is there a similar penalty for doing this in openmp? For example, say I have a 6 core processor and a program with 6 threads. If I have a conditional that makes 3 threads perform a certain task, and then have the other three threads perform a completely different task, will there be a big performance hit? I guess in essence it's sort of using openmp to do MIMD.

Basically, I'm writing a program with openmp and CUDA. I want two threads to run a CUDA kernel while the other left over threads run C code. Thanks.

like image 625
JustinBlaber Avatar asked Apr 20 '12 16:04

JustinBlaber


3 Answers

No, there is no performance hit for diverging threads using OpenMP. It is a problem in CUDA because of the way instructions are broadcast simultaneously to a set of cores. When an OpenMP thread targets a CPU core, each CPU core has its own independent set of instructions to follow, and it runs just like any other single-threaded program would.

You may see some of your cores being underutilized if you have synchronization barriers following thread divergence, because that would force faster threads to wait for the slower threads to catch up.

like image 145
Brendan Wood Avatar answered Oct 31 '22 18:10

Brendan Wood


When speaking about CPU parallelism, there's no intrinsic performance hit from using a certain threading design pattern. Not at the theoretical level at least.

The only problem I see is that since the threads are doing different things which may have varying completion times, some of the threads may sit idle after finishing their work, waiting for the others to finish a longer task.

like image 25
Tudor Avatar answered Oct 31 '22 16:10

Tudor


The term thread divergence in CUDA refers to the situation when not all threads of a bock evaluate a conditional with the same outcome. Such threads are said to diverge. If diverging threads are in the same warp then such threads may perform work serially which leads to performance loss.

I am not sure that OpenMP has the same issue, though. When different threads perform different work then load balancing may be used by the runtime perhaps, but it doesn't lead to work serialization necessarily.

like image 36
Vinod Grover Avatar answered Oct 31 '22 18:10

Vinod Grover