Why does GL divide `gl_Position` by W for you rather than letting you do it yourself?

Question

Note: I understand the basic math. I understand that the typical perspective function in various math libraries produces a matrix that converts z values from -zNear to -zFar into -1 to +1, but only if the result is divided by w.
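As a quick sanity check of that mapping, here is a minimal sketch that applies only the z and w rows of a typical OpenGL-style perspective matrix. The helper name `projectZ` is made up for illustration, and the near/far values 0.5 and 10 are just sample numbers, not anything a particular library requires.

```javascript
// Hypothetical helper: only the z and w rows of a typical OpenGL-style
// perspective matrix, applied to an eye-space point (x, y, zEye, 1).
function projectZ(zEye, zNear, zFar) {
  const zClip = -(zFar + zNear) / (zFar - zNear) * zEye
                - (2 * zFar * zNear) / (zFar - zNear);
  const wClip = -zEye; // w ends up holding the eye-space distance
  return { zClip, wClip, zNdc: zClip / wClip };
}

// Sample near/far planes (assumed values).
const zNear = 0.5;
const zFar = 10;

const atNear = projectZ(-zNear, zNear, zFar);
const atFar = projectZ(-zFar, zNear, zFar);

// zClip itself is not in -1..+1; only zClip / wClip is.
console.log(atNear, atFar);
```

At the near plane `zClip` comes out as roughly `-zNear` and at the far plane as roughly `zFar`, so it is only the division by `wClip` that lands them on -1 and +1.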

The specific question is what is gained by the GPU doing this for you rather than you having to do it yourself?

In other words, let's say the GPU did not magically divide gl_Position by gl_Position.w and that instead you had to do it manually, as in

    attribute vec4 position;
    uniform mat4 worldViewProjection;

    void main() {
      gl_Position = worldViewProjection * position;

      // imaginary version of GL where we must divide by W ourselves
      gl_Position /= gl_Position.w;
    }

What breaks in this imaginary GL because of this? Would it work or is there something about passing in the value before it's been divided by w that provides extra needed info to the GPU?

Note that if I actually do this, the perspective in the texture mapping breaks.

    "use strict";
    var m4 = twgl.m4;
    var gl = twgl.getWebGLContext(document.getElementById("c"));
    var programInfo = twgl.createProgramInfo(gl, ["vs", "fs"]);

    var bufferInfo = twgl.primitives.createCubeBufferInfo(gl, 2);

    var tex = twgl.createTexture(gl, {
      min: gl.NEAREST,
      mag: gl.NEAREST,
      src: [
        255, 255, 255, 255,
        192, 192, 192, 255,
        192, 192, 192, 255,
        255, 255, 255, 255,
      ],
    });

    var uniforms = {
      u_diffuse: tex,
    };

    function render(time) {
      time *= 0.001;
      twgl.resizeCanvasToDisplaySize(gl.canvas);
      gl.viewport(0, 0, gl.canvas.width, gl.canvas.height);

      gl.enable(gl.DEPTH_TEST);
      gl.enable(gl.CULL_FACE);
      gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);

      var projection = m4.perspective(
          30 * Math.PI / 180,
          gl.canvas.clientWidth / gl.canvas.clientHeight,
          0.5, 10);
      var eye = [1, 4, -6];
      var target = [0, 0, 0];
      var up = [0, 1, 0];

      var camera = m4.lookAt(eye, target, up);
      var view = m4.inverse(camera);
      var viewProjection = m4.multiply(projection, view);
      var world = m4.rotationY(time);

      uniforms.u_worldInverseTranspose = m4.transpose(m4.inverse(world));
      uniforms.u_worldViewProjection = m4.multiply(viewProjection, world);

      gl.useProgram(programInfo.program);
      twgl.setBuffersAndAttributes(gl, programInfo, bufferInfo);
      twgl.setUniforms(programInfo, uniforms);
      gl.drawElements(gl.TRIANGLES, bufferInfo.numElements, gl.UNSIGNED_SHORT, 0);

      requestAnimationFrame(render);
    }
    requestAnimationFrame(render);

    body {
      margin: 0;
    }
    canvas {
      display: block;
      width: 100vw;
      height: 100vh;
    }

    <script id="vs" type="notjs">
    uniform mat4 u_worldViewProjection;
    uniform mat4 u_worldInverseTranspose;

    attribute vec4 position;
    attribute vec3 normal;
    attribute vec2 texcoord;

    varying vec2 v_texcoord;
    varying vec3 v_normal;

    void main() {
      v_texcoord = texcoord;
      v_normal = (u_worldInverseTranspose * vec4(normal, 0)).xyz;
      gl_Position = u_worldViewProjection * position;
      gl_Position /= gl_Position.w;
    }
    </script>
    <script id="fs" type="notjs">
    precision mediump float;

    varying vec2 v_texcoord;
    varying vec3 v_normal;

    uniform sampler2D u_diffuse;

    void main() {
      vec4 diffuseColor = texture2D(u_diffuse, v_texcoord);
      vec3 a_normal = normalize(v_normal);
      float l = dot(a_normal, vec3(1, 0, 0));
      gl_FragColor.rgb = diffuseColor.rgb * (l * 0.5 + 0.5);
      gl_FragColor.a = diffuseColor.a;
    }
    </script>
    <script src="https://twgljs.org/dist/4.x/twgl-full.min.js"></script>
    <canvas id="c"></canvas>

But is that because the GPU actually needs z and w to be different, or is it just how GPUs happen to be designed, so that a different design could derive the info it needs even if we did the w divide ourselves?

Update:

After asking this question I ended up writing this article that illustrates the perspective interpolation.

derhass · Accepted Answer

I'd like to expand on BDL's answer. It is not only about the perspective interpolation. It is also about the clipping. The space in which the value of gl_Position is supposed to be provided is called clip space, and this is before the division by w.

The (default) clip volume of OpenGL is defined in clip space as

-w <= x,y,z <= w   (with w varying per vertex)

After the division by w we get

-1 <= x,y,z <= 1   (in NDC coordinates).

However, if you try to do the clipping after the division by w and check against that cube in NDC, you get a problem, because all clip space points fulfilling this (which is only possible for negative w):

 w <= x,y,z <= -w (in clip space)

will also fulfill the NDC constraint.
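A minimal sketch of that ambiguity, using an arbitrary made-up clip-space point with negative w (it is not the output of any particular projection matrix, and the function names are hypothetical):

```javascript
// Clip-space test: -w <= x, y, z <= w.
function insideClipVolume([x, y, z, w]) {
  return [x, y, z].every(c => -w <= c && c <= w);
}

// NDC test applied after dividing by w: -1 <= x, y, z <= 1.
function insideNdcCube([x, y, z, w]) {
  return [x / w, y / w, z / w].every(c => -1 <= c && c <= 1);
}

// Made-up point with negative w, satisfying w <= x, y, z <= -w.
const behind = [0.5, -0.25, 1.0, -2.0];

console.log(insideClipVolume(behind)); // false: rejected in clip space
console.log(insideNdcCube(behind));    // true: wrongly accepted in NDC
```

Once the division has happened, the sign of w is gone, so the NDC test alone cannot tell this point apart from a legitimately visible one.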

The thing here is that points behind the camera will be transformed to somewhere in front of the camera, mirrored (since x/-1 is the same as -x/1). This also happens to the z coordinate. One might argue that this is irrelevant, because any point behind the camera is projected behind (in the sense of farther away than) the far plane, as per the construction of the typical projection matrix, so it will lie outside of the viewing volume in either case.

But if you have a primitive where at least one point is inside the view volume and at least one point is behind the camera, you should have a primitive which also intersects the near plane. However, after the division by w, it will intersect the far plane instead! So clipping in NDC space, after the division, is much harder to get right. I tried to visualize this in this drawing:

[Image: top-down view of eye space and NDC with and without clipping] (The drawing is to scale; the depth range of the projection is much shorter than anyone would typically use, to better illustrate the issue.)

The clipping is done as a fixed-function stage in hardware and it has to be done before the division, hence you should provide the correct clip-space coordinates to work on.
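To illustrate why the undivided coordinates are enough for this, here is a rough sketch of clipping a single line segment against the near plane (z = -w) directly in clip space. It is only an illustration, not how any particular GPU implements it. The function names are made up, and the two endpoints are sample values that correspond to eye-space depths of -2 and +1 under an assumed projection with zNear = 1 and zFar = 3.

```javascript
// Signed distance to the near plane in clip space: inside when z + w >= 0.
function nearPlaneDistance([x, y, z, w]) {
  return z + w;
}

// Clip segment a-b against the near plane; returns null when fully outside.
function clipSegmentToNearPlane(a, b) {
  const da = nearPlaneDistance(a);
  const db = nearPlaneDistance(b);
  if (da < 0 && db < 0) return null;      // both endpoints outside
  if (da >= 0 && db >= 0) return [a, b];  // both endpoints inside
  // The endpoints straddle the plane, so da - db is non-zero.
  const t = da / (da - db);
  const hit = a.map((v, i) => v + t * (b[i] - v));
  return da >= 0 ? [a, hit] : [hit, b];
}

const inFront = [0, 0, 1, 2];         // w > 0: in front of the camera
const behindCamera = [1, 0, -5, -1];  // w < 0: behind the camera

const clipped = clipSegmentToNearPlane(inFront, behindCamera);
console.log(clipped);
```

The new endpoint has z + w close to 0 and a positive w, so the later division by w is safe and puts it on the near plane, which is where the primitive should be cut.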

(Note: actual GPUs might not use an extra clipping stage at all; they might instead use a clipless rasterizer, as speculated in Fabian Giesen's blog article. There are algorithms for this, such as Olano and Greer (1997). However, this all works by doing the rasterization directly in homogeneous coordinates, so we still need the w...)

BDL · Answer

The reason is that not only gl_Position gets divided by the homogeneous coordinate, but also all other interpolated varyings. This is called perspective-correct interpolation, and it requires the division to happen after the interpolation (and thus after the rasterization). So doing the division in the vertex shader would simply not work. See also this post.
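A small numeric sketch of the difference, assuming a varying `u` interpolated along one edge whose endpoints have clip-space w values `w0` and `w1`, with `s` being the interpolation parameter in screen space. The function names are hypothetical, and this follows the standard formula rather than any specific hardware implementation.

```javascript
// Plain linear interpolation in screen space.
function lerpScreenSpace(u0, u1, s) {
  return (1 - s) * u0 + s * u1;
}

// Perspective-correct interpolation: interpolate u/w and 1/w linearly
// in screen space, then divide the two results.
function lerpPerspective(u0, u1, w0, w1, s) {
  const numerator = (1 - s) * (u0 / w0) + s * (u1 / w1);
  const denominator = (1 - s) * (1 / w0) + s * (1 / w1);
  return numerator / denominator;
}

// Halfway across the edge on screen, far endpoint 3x farther away.
const corrected = lerpPerspective(0, 1, 1, 3, 0.5); // approximately 0.25
const naive = lerpScreenSpace(0, 1, 0.5);           // 0.5

// If the vertex shader has already divided by w, the rasterizer only
// ever sees w = 1, and the corrected formula collapses to the naive one.
const afterManualDivide = lerpPerspective(0, 1, 1, 1, 0.5); // 0.5

console.log(corrected, naive, afterManualDivide);
```

That collapse to plain linear interpolation is the distortion visible on the textured cube in the question.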
