GLSL: scalar vs vector performance

Question

All modern GPUs have a scalar architecture, but shading languages offer a variety of vector and matrix types. I would like to know how scalarization or vectorization of GLSL source code affects performance. For example, let's define some "scalar" points:

float p0x, p0y, p1x, p1y, p2x, p2y, p3x, p3y, p4x, p4y;
p0x = 0.0f; p0y = 0.0f;
p1x = 0.0f; p1y = 0.61f;
p2x = 0.9f; p2y = 0.4f;
p3x = 1.0f; p3y = 1.0f;

and their vector equivalents:

vec2 p0 = vec2(p0x, p0y);
vec2 p1 = vec2(p1x, p1y);
vec2 p2 = vec2(p2x, p2y);
vec2 p3 = vec2(p3x, p3y);

Having these points, which of the following mathematically equivalent pieces of code will run faster?

Scalar code:

// pow(x, y) is undefined in GLSL for x < 0, so powers of (t-1.0) are written as products
float u = t - 1.0;
position.x = -p0x*(u*u*u)+p3x*(t*t*t)+p1x*t*(u*u)*3.0-p2x*(t*t)*u*3.0;
position.y = -p0y*(u*u*u)+p3y*(t*t*t)+p1y*t*(u*u)*3.0-p2y*(t*t)*u*3.0;

or its vector equivalent:

float u = t - 1.0; // pow(x, y) is undefined in GLSL for x < 0, so products are used instead
position.xy = -p0*(u*u*u)+p3*(t*t*t)+p1*t*(u*u)*3.0-p2*(t*t)*u*3.0;

Or will they run equivalently fast on modern GPUs?

The above code is only an example. Real-life examples of such "vectorizable" code may perform much heavier computations with many more input variables coming from global in variables, uniforms and vertex attributes.

barneypitt · Accepted Answer

The vectorised version is highly unlikely to be slower: in the worst case, the compiler will probably just replace it with the scalar version anyway.
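As a rough illustration of that worst case, a vec2 expression can be lowered mechanically to one scalar operation per component. The two function names below are hypothetical, and the real lowering depends entirely on the driver's compiler, so treat this as a conceptual sketch only.

```glsl
// Hypothetical helper: the vector form as it would be written in source.
vec2 scaleAdd(vec2 a, float s, vec2 b) {
    return a * s + b;
}

// What a scalarizing compiler could conceptually turn it into:
// one multiply-add per component, each operating on plain floats.
vec2 scaleAddScalarized(vec2 a, float s, vec2 b) {
    float rx = a.x * s + b.x;
    float ry = a.y * s + b.y;
    return vec2(rx, ry);
}
```

Both functions return the same value for the same inputs; only the way the arithmetic is spelled out differs.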

It may, however, be faster. Whether it is depends largely on whether the code branches: branch-free code is easier to spread across multiple SIMD lanes than code that branches. Compilers are pretty smart and might figure out that the scalar version can also be sent to multiple SIMD lanes, but the compiler is more likely to do its job well when it is given the vectorised version. Compilers are also smart enough to sometimes keep the SIMD lanes fed in the presence of limited branching, so even with branching code you are probably better off using the vectorised version.
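For reference, here is a minimal sketch of the vector form from the question as a complete, branch-free vertex shader. The attribute name curveT, the function name cubicBezier, the use of uniforms for the control points and the #version line are all assumptions made for illustration; they do not come from the question. The sketch makes no claim about speed on any particular GPU.

```glsl
#version 330 core

// Assumed inputs (hypothetical names): the curve parameter arrives per vertex,
// the four control points arrive as uniforms.
layout(location = 0) in float curveT;
uniform vec2 p0;
uniform vec2 p1;
uniform vec2 p2;
uniform vec2 p3;

// Cubic Bezier curve evaluated with vector arithmetic.
// Powers are written as products because pow(x, y) is undefined for x < 0.
vec2 cubicBezier(vec2 a, vec2 b, vec2 c, vec2 d, float t) {
    float u = 1.0 - t;
    // The four weights are scalars shared by the x and y components.
    float w0 = u * u * u;
    float w1 = 3.0 * t * u * u;
    float w2 = 3.0 * t * t * u;
    float w3 = t * t * t;
    return a * w0 + b * w1 + c * w2 + d * w3;
}

void main() {
    gl_Position = vec4(cubicBezier(p0, p1, p2, p3, curveT), 0.0, 1.0);
}
```

The weights are algebraically the same as the coefficients in the question's expression, since -(t-1.0)^3 equals (1.0-t)^3. Writing them once as scalars makes explicit that the only per-component work left is four multiplies and three adds on vec2 values.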

Tags: performance, vectorization, opengl, glsl

Sergey
