OpenCL: basic questions about SIMT execution model

Question

Some of the concepts and designs of the "SIMT" architecture are still unclear to me.

From what I've seen and read, diverging code paths, and if() statements in general, are a rather bad idea because many threads may execute in lockstep. What exactly does that mean? What about something like:

kernel void foo(..., int flag)
{
    if (flag)
        DO_STUFF
    else
        DO_SOMETHING_ELSE
}

The parameter "flag" is the same for all work items, so every work item takes the same branch. Will the GPU still execute all of the code, serializing both paths and effectively running the branch that is not taken? Or is it smarter, executing only the branch taken as long as all threads agree on it, which would always be the case here?

I.e. does serialization ALWAYS happen or only if needed? Sorry for the stupid question. ;)

Dr. Snoopy · Accepted Answer

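No, it doesn't always happen. Both branches are executed only if the condition is not coherent across the work items that run in lockstep. On typical hardware that lockstep group is a warp (NVIDIA) or wavefront (AMD), which is a subset of a local work group. If the condition evaluates to different values within such a group, current-generation GPUs execute both branches one after the other, but each work item only writes values and produces side effects for the branch it actually took. In your example the condition is the same kernel argument for every work item, so only the taken branch is executed.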

So keeping branch conditions coherent across work items is vital to the performance of branching GPU code.
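
To make the "only if needed" behaviour concrete, here is a minimal sketch in plain C that models one lockstep group of work items. It is not real GPU code and not how any particular driver implements branching. The function name `run_group`, the lane count, and the two branch bodies (`in * 2` and `in + 100`) are hypothetical stand-ins for DO_STUFF and DO_SOMETHING_ELSE.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of one lockstep group (warp/wavefront) hitting
 *
 *     if (cond) out = in * 2; else out = in + 100;
 *
 * Returns how many branch bodies the group had to step through:
 * 1 if all lanes agree on the condition, 2 if they diverge.
 */
int run_group(const bool *cond, const int *in, int *out, size_t lanes)
{
    bool any_true = false;
    bool any_false = false;
    int passes = 0;

    /* First find out whether any lane wants each side of the branch. */
    for (size_t i = 0; i < lanes; ++i) {
        if (cond[i])
            any_true = true;
        else
            any_false = true;
    }

    /* "if" side: skipped entirely when no lane needs it. */
    if (any_true) {
        ++passes;
        for (size_t i = 0; i < lanes; ++i) {
            if (cond[i])            /* lanes with a false condition are masked off */
                out[i] = in[i] * 2;
        }
    }

    /* "else" side: skipped entirely when no lane needs it. */
    if (any_false) {
        ++passes;
        for (size_t i = 0; i < lanes; ++i) {
            if (!cond[i])           /* lanes with a true condition are masked off */
                out[i] = in[i] + 100;
        }
    }

    return passes;
}
```

Calling `run_group` with a condition array that is all true (the "flag" case from the question) returns 1, because the else side is never entered. With a mixed array it returns 2: both sides run in sequence, yet each lane still ends up with the result of its own branch only.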

Tags:

parallel-processing

gpgpu

gpu

opencl

dietr
