
Control flow divergence in SIMT and SIMD

Tags: cuda, simd, sse

I am reading this book to study the concepts of CUDA in depth. One of the chapters, which introduces the concept of SIMT, says:

The option for control flow divergence in SIMT also simplifies the requirement for programmers to use extra instructions to handle control flow compared to SSE.

I know this statement rests on the fact that SSE is a SIMD implementation while CUDA threads follow the SIMT model, but can anyone elaborate on this sentence with an example? Thanks in advance.

Recker asked Dec 31 '12 07:12



1 Answer

With SIMD, if you have a routine where some elements need to be handled differently from others, you need to explicitly take care of masking operations so that they are applied only to the correct elements. With CUDA's SIMT architecture you get the illusion of control flow on each thread, so you don't need to mask operations explicitly. The masking still happens "under the hood", of course, but the burden is lifted from the programmer.

Example: suppose you want to set all negative elements to zero. In CUDA:

if (X[tid] < 0)
    X[tid] = 0;    // NB: CUDA core steps through this instruction but only executes
                   //     it if the preceding condition was true

In SIMD (SSE):

__m128 mask = _mm_cmpge_ps(X, _mm_set1_ps(0));  // generate mask for all elements >= 0
X = _mm_and_ps(X, mask);                        // clear all elements which are < 0
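
The same idea extends to a full if/else. The following is a minimal sketch, not a definitive implementation: the function names branch_scalar and branch_sse and the "double the non-negative elements" else-branch are made up for illustration, an x86 target with SSE is assumed, and the array length is assumed to be a multiple of 4. The scalar function is what each CUDA thread would effectively write; the SSE function has to compute both branch results for every lane and then select per lane with a mask.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Scalar reference: the per-element branch, written the way a
   CUDA thread would express it for its own element. */
void branch_scalar(float *x, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (x[i] < 0.0f)
            x[i] = 0.0f;
        else
            x[i] = 2.0f * x[i];
    }
}

/* SSE version (hypothetical helper): both branch results are computed
   for all four lanes, then a comparison mask picks one result per lane. */
void branch_sse(float *x, size_t n)
{
    assert(n % 4 == 0);  /* simplifying assumption: no remainder loop */

    const __m128 zero = _mm_setzero_ps();
    const __m128 two  = _mm_set1_ps(2.0f);

    for (size_t i = 0; i < n; i += 4) {
        __m128 v        = _mm_loadu_ps(x + i);
        __m128 neg      = _mm_cmplt_ps(v, zero);   /* all-ones bits where v < 0 */
        __m128 if_res   = zero;                    /* result of the "if" branch */
        __m128 else_res = _mm_mul_ps(v, two);      /* result of the "else" branch */

        /* r = (neg & if_res) | (~neg & else_res) */
        __m128 r = _mm_or_ps(_mm_and_ps(neg, if_res),
                             _mm_andnot_ps(neg, else_res));
        _mm_storeu_ps(x + i, r);
    }
}
```

Note that the multiply runs on every lane, including the negative ones whose product is then thrown away by the mask. That bookkeeping is the "extra instructions" the quoted sentence refers to; in SIMT the hardware does the equivalent masking for you.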
Paul R answered Oct 09 '22 00:10
