I have 2 very similar kernel functions, in the sense that the code is nearly the same, but with a slight difference. Currently I have 2 options:
How much will an if statement affect my algorithm performance?
I know that there is no branch divergence, since all threads in all blocks will take either the if or the else.
So will a single if statement decrease my performance if the kernel function is called many times?
You have a third alternative, which is to use C++ templating and make the variable which is used in the if/switch statement a template parameter. Instantiate each version of the kernel you need, and then you have multiple kernels doing different things with no branch divergence or conditional evaluation to worry about, because the compiler will optimize away the dead code and the branching with it.
Perhaps something like this:
template<int action>
__global__ void kernel()
{
    switch(action) {
        case 1:
            // First code
            break;
        case 2:
            // Second code
            break;
    }
}
template void kernel<1>();
template void kernel<2>();
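With templated kernels the runtime choice still has to be made somewhere, and the natural place is the host, once per launch. Here is a minimal sketch of that dispatch. The kernel name apply_action, its parameters, the two placeholder bodies and the launch helper are all hypothetical, made up for illustration and not taken from the question.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: the two statements stand in for "First code" and
// "Second code". `action` is a compile-time constant, so each instantiation
// keeps only one of them.
template <int action>
__global__ void apply_action(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    if (action == 1) {
        data[i] += 1.0f;   // placeholder for the first variant
    } else {
        data[i] *= 2.0f;   // placeholder for the second variant
    }
}

// Hypothetical host-side helper: picks the instantiation once per launch,
// so no per-thread test on `action` remains inside the kernel.
// Returns false for an unknown action or a failed launch.
bool launch(int action, float *d_data, int n)
{
    if (n <= 0) return true;   // nothing to do
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    switch (action) {
    case 1: apply_action<1><<<blocks, threads>>>(d_data, n); break;
    case 2: apply_action<2><<<blocks, threads>>>(d_data, n); break;
    default: return false;
    }
    return cudaGetLastError() == cudaSuccess;
}
```

Calling a kernel through the template argument list, as launch does, instantiates it implicitly, so only the variants that are actually dispatched get compiled.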
A uniform if statement will slightly decrease your performance, especially if it's in an inner loop, since it costs an instruction issue slot every time it is evaluated, but nowhere near as much as a divergent warp would.
If that cost matters, it may be worth hoisting the condition out of the loop. If the warp is truly divergent, though, think about how to remove the branch altogether: e.g., instead of
if (i>0) {
    x = 3;
} else {
    x = y;
}
try
x = ((i>0)*3) | ((i<=0)*y);
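To make the equivalence easy to check, here is a small sketch that puts the branching form, the mask form and a plain ternary side by side. The function names are hypothetical. In the mask form the two comparisons have to be exact complements (i > 0 and i <= 0), so that exactly one term is nonzero. For a select this simple, the compiler will typically emit predicated or select instructions for the if/else and ternary versions as well, so it is worth inspecting the generated code before rewriting by hand.

```cuda
// Hypothetical helpers; __host__ __device__ lets the same code be checked
// on the CPU and used inside a kernel.

// Reference version with an explicit branch.
__host__ __device__ inline int select_branchy(int i, int y)
{
    if (i > 0) {
        return 3;
    } else {
        return y;
    }
}

// Mask version: each comparison yields 0 or 1, and the two conditions are
// complementary, so one product is always 0 and the OR returns the other.
__host__ __device__ inline int select_mask(int i, int y)
{
    return ((i > 0) * 3) | ((i <= 0) * y);
}

// Ternary version: same result, usually the most readable spelling.
__host__ __device__ inline int select_ternary(int i, int y)
{
    return (i > 0) ? 3 : y;
}
```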