Given a super-basic vertex shader such as:
output.position = mul(position, _gWorldViewProj);
I was having a great deal of trouble because I was setting _gWorldViewProj as follows; I tried both (a bit of flailing) to make sure it wasn't just backwards.
mWorldViewProj = world * view * proj;
mWorldViewProj = proj * view * world;
My solution turned out to be:
mWorldView = mWorld * mView;
mWorldViewProj = XMMatrixTranspose(worldView * proj);
Can someone explain why this XMMatrixTranspose was required? I know there were matrix differences between XNA and HLSL (I think) but not between vanilla C++ and HLSL, though I could be wrong.
Problem is I don't know if I'm wrong or what I'm wrong about! So if someone could tell me precisely why the transpose is required, I hopefully won't make the same mistake again.
On the CPU, 2D arrays are generally stored in row-major ordering, so the order in memory goes x[0][0], x[0][1], ... In HLSL, matrix declarations default to column-major ordering, so the order goes x[0][0], x[1][0], ...
In order to transform the memory from the format defined on the CPU to the order expected in HLSL, you need to transpose the CPU matrix before sending it to the GPU. Alternatively, you can row_major keyword in HLSL to declare the matrices as row major, eliminating the need for a transpose but leading to different codegen in HLSL (you'll often end up with mul-adds instead of dot-products).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With