I am trying to render a 2D triangle using user touches. So, I will let a user touch three points on the screen and those points will be used as vertices of a triangle.
You're already aware that you need to return clip-space coordinates (technically not normalized device coordinates) from your vertex shader. The question is how and where to go from UIKit coordinates to Metal's clip-space coordinates.
Let's start by defining these different spaces. Note that below, I actually am using NDC coordinates for the sake of simplicity, since in this particular case, we aren't introducing perspective by returning vertex positions with w != 1. (Here I'm referring to the w coordinate of the clip-space position; in the following discussion, w always refers to the view width).

We pass the vertices into our vertex shader in whatever space is convenient (this is often called model space). Since we're working in 2D, we don't need the usual series of transformations to world space, then eye space. Essentially, the coordinates of the UIKit view are our model space, world space, and eye space all in one.
We need some kind of orthographic projection matrix to move from this space into clip space. If we strip out the unnecessary parts related to the z axis and assume that our view bounds' origin is (0, 0), we come up with the following transformation:
We could pass this matrix into our vertex shader, or we could do the transformation prior to sending the vertices to the GPU. Considering how little data is involved, it really doesn't matter at this point. In fact, using a matrix at all is a little wasteful, since we can just transform each coordinate with a couple of multiplies and an add. Here's how that might look in a Metal vertex function:
float2 inverseViewSize(1.0f / width, 1.0f / height); // passed in a buffer
float clipX = (2.0f * in.position.x * inverseViewSize.x) - 1.0f;
float clipY = (2.0f * -in.position.y * inverseViewSize.y) + 1.0f;
float4 clipPosition(clipX, clipY, 0.0f, 1.0f);
Just to verify that we get the correct results from this transformation, let's plug in the upper-left and lower-right points of our view to ensure they wind up at the extremities of clip space (by linearity, if these points transform correctly, so will all others):

These points appear correct, so we're done. If you're concerned about the apparent distortion introduced by this transformation, note that it is exactly canceled by the viewport transformation that happens prior to rasterization.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With