
Dot product vs Direct vector components sum performance in shaders

I'm writing Cg shaders for advanced lighting calculations in a Unity-based game. Sometimes I need to sum all the components of a vector. There are two ways to do it:

  1. Just write something like: float sum = v.x + v.y + v.z;
  2. Or do something like: float sum = dot(v,float3(1,1,1));

I'm curious which one is faster and which is better code style.

For CPU code, the first, simple way is obviously much better, because:

a) There is no need to allocate another float3(1,1,1) vector.

b) There is no need to multiply every component of the original vector "v" by 1.

But since this is shader code running on the GPU, I believe there may be some hardware optimization for the dot product function, and the float3(1,1,1) constant might compile down to no allocation at all.

float4 _someVector;

void surf (Input IN, inout SurfaceOutputStandard o) {
    // Option 1: add the components directly
    float sum = _someVector.x + _someVector.y + _someVector.z + _someVector.w;
    // Option 2: dot product with a vector of ones
    float sum2 = dot(_someVector, float4(1,1,1,1));
}
Crabonog asked Mar 04 '23 11:03



2 Answers

Check this link.

Vec3 Dot has a cost of 3 cycles, while Scalar Add has a cost of 1. Thus, on almost all platforms (AMD and NVIDIA):

  1. float sum = v.x + v.y + v.z; has a cost of 2
  2. float sum = dot(v,float3(1,1,1)); has a cost of 3

The first implementation should be faster.
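The cost comparison can be sketched on the CPU in plain C by tallying one instruction per operation. This is only an illustration under stated assumptions: `op_tally`, `sum3`, `dot3`, `total_ops` and `check_costs` are hypothetical names, and the MUL, MAD, MAD lowering of the dot product is an assumed lowering chosen because it matches the 3-cycle figure quoted above. A real shader compiler may emit something different.

```c
#include <assert.h>

/* Hypothetical tally of instructions issued by each variant. */
typedef struct { int add, mul, mad; } op_tally;

/* v.x + v.y + v.z: two scalar ADDs. */
float sum3(const float v[3], op_tally *t) {
    float r = v[0] + v[1];  t->add++;
    r = r + v[2];           t->add++;
    return r;
}

/* dot(v, w), assumed here to lower to one MUL and two multiply-adds. */
float dot3(const float v[3], const float w[3], op_tally *t) {
    float r = v[0] * w[0];  t->mul++;
    r = v[1] * w[1] + r;    t->mad++;
    r = v[2] * w[2] + r;    t->mad++;
    return r;
}

/* Total number of instructions recorded in a tally. */
int total_ops(const op_tally *t) {
    return t->add + t->mul + t->mad;
}

/* Both variants give the same value; only the tallies differ (2 vs 3). */
void check_costs(void) {
    const float v[3] = {1.0f, 2.0f, 3.0f};
    const float ones[3] = {1.0f, 1.0f, 1.0f};
    op_tally a = {0, 0, 0}, b = {0, 0, 0};
    assert(sum3(v, &a) == dot3(v, ones, &b));
    assert(total_ops(&a) == 2);
    assert(total_ops(&b) == 3);
}
```

Calling `check_costs()` from any `main` exercises both paths. The tally counts instructions, not measured GPU time, so it says nothing about how much the difference matters in a real shader.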

kefren answered Mar 09 '23 19:03



Implementation of the dot product in Cg: https://developer.download.nvidia.com/cg/dot.html

IMHO the difference is immeasurable in 98% of cases, but the first one should be faster, because multiplication is a "more expensive" operation.
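To make the extra multiplications visible, here is a small C sketch of the component-wise expansion that, as far as I understand the linked page, is how dot is defined for a float4. The `float4` struct and the names `dot4`, `sum4` and `check_same_result` are hypothetical stand-ins for the Cg built-ins, written only for illustration.

```c
#include <assert.h>

/* Hypothetical CPU stand-in for Cg's float4. */
typedef struct { float x, y, z, w; } float4;

/* Component-wise expansion of a 4-component dot product. */
float dot4(float4 a, float4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Direct sum of the components, with no multiplications. */
float sum4(float4 v) {
    return v.x + v.y + v.z + v.w;
}

/* With b = (1,1,1,1), each product a.i * 1 equals a.i, so both agree. */
void check_same_result(void) {
    const float4 v = {0.25f, 0.5f, 2.0f, 4.0f};
    const float4 ones = {1.0f, 1.0f, 1.0f, 1.0f};
    assert(dot4(v, ones) == sum4(v));
}
```

Multiplying a finite float by 1.0f is exact, so the two forms return the same value and differ only in the four multiplications that the dot version carries along.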

Menyus answered Mar 09 '23 18:03
