When I saw the initial talks on DirectX 12, I thought it eliminated the need for texture atlasing. However, that conclusion doesn't seem as obvious now that I'm going through the documentation.
One feature I have seen that could replace it is dynamic non-uniform indexing of resource arrays in HLSL:
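Texture2D<float4> textures[128];
SamplerState texSampler; // "sampler" itself is a reserved type name in HLSL
// textureIndex may differ between threads of the same draw call
textures[NonUniformResourceIndex(textureIndex)].Sample(texSampler, uv);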
Another potential feature is ExecuteIndirect, which encodes what are still a bunch of separate draw and resource-change calls into a buffer and submits them to the GPU at once, in a single CPU call.
Both of these would address the limitations of texture atlases (inability to use border modes on atlas regions, problematic mipmapping), but I'm wondering whether their performance characteristics are expected to be similar to texture atlasing, or whether that technique is still justified.
I'm also curious to know if the answer generalizes to Mantle, Vulkan and Metal.
The short answer is yes. The long answer is that there may be cases where an atlas still has a slight performance advantage.
With DX12 and Vulkan around, you can forget about Mantle. The way textures are represented through descriptors is close to the metal: going bindless does not carry much of a performance penalty, and on current hardware a fetch from a bindless texture costs the same as a fetch from a regularly bound one. This is likely to get even better in the future, since bindless is the way to go.
On NVIDIA, there is absolutely no penalty, and NonUniformResourceIndex is not a requirement on their architecture; bindless just works.
On AMD, NonUniformResourceIndex has shader code generation implications that may have a cost if you use it in many places, so it is better to avoid it where you can. Ideally, a draw call does not use more than one index at a time (within one instance or across instances). This is because the GPU uses a combination of vector and scalar registers, and texture and sampler descriptors are loaded into scalar registers. If the texture index diverges between threads, a single scalar load can't serve all of them. What NonUniformResourceIndex does is generate a loop over the active threads: it takes one index, masks in the threads that use it, does the fetch, and repeats until all threads have been processed. Apart from these considerations, a bindless texture uses the same mechanism as a regular binding.
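To make that loop more concrete, here is a rough CPU-side model of it in C++. This is only a sketch of the idea described above, not what any driver or shader compiler actually emits: `waterfall_fetch`, `fetch_texture` and `WaterfallResult` are made-up names, and a real wave would do the masking in hardware rather than with a vector of flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a texture fetch: the result encodes which
// texture index and which lane produced it, so it can be inspected.
static uint32_t fetch_texture(uint32_t texture_index, std::size_t lane) {
    return texture_index * 1000u + static_cast<uint32_t>(lane);
}

struct WaterfallResult {
    std::vector<uint32_t> values;  // one fetched value per lane
    int iterations;                // loop passes, one per distinct index
};

// Models one wave: lane_index[i] is the texture index requested by lane i.
WaterfallResult waterfall_fetch(const std::vector<uint32_t>& lane_index) {
    const std::size_t lanes = lane_index.size();
    WaterfallResult out{std::vector<uint32_t>(lanes, 0u), 0};
    std::vector<bool> pending(lanes, true);  // lanes that still need a fetch
    std::size_t remaining = lanes;
    std::size_t first = 0;

    while (remaining > 0) {
        // Pick the index of the first lane that is still waiting. This plays
        // the role of the value held in a scalar register.
        while (!pending[first]) ++first;
        assert(first < lanes);
        const uint32_t scalar_index = lane_index[first];

        // Only lanes that asked for this exact index take part in the fetch;
        // every other lane is masked off for this pass.
        for (std::size_t lane = first; lane < lanes; ++lane) {
            if (pending[lane] && lane_index[lane] == scalar_index) {
                out.values[lane] = fetch_texture(scalar_index, lane);
                pending[lane] = false;
                --remaining;
            }
        }
        ++out.iterations;
    }
    return out;
}
```

Calling `waterfall_fetch({3, 3, 7, 3, 7, 1})` takes three passes because the wave holds three distinct indices, while `waterfall_fetch({5, 5, 5, 5})` takes a single pass. That is why keeping the index uniform within a draw is the cheap case.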
ExecuteIndirect is also a very good deal. It is not always perfectly optimized in the drivers right now, but it will improve as more games ship with DX12 engines. This API opens the door to GPU culling and to many other ways of reducing CPU work even further. On Xbox One, it is even possible to change the PipelineStateObject with it.
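As a rough illustration of the culling idea, the sketch below builds a compacted argument list on the CPU in C++. In a real engine this step would typically run in a compute shader that writes into the argument buffer and count buffer that ExecuteIndirect reads. `DrawArgs` repeats the field layout of D3D12_DRAW_ARGUMENTS so the example builds without the D3D12 headers; `SceneObject`, `build_indirect_args` and the near-plane test are assumptions made up for the example.

```cpp
#include <cstdint>
#include <vector>

// Same four fields, in the same order, as D3D12_DRAW_ARGUMENTS.
struct DrawArgs {
    uint32_t vertex_count_per_instance;
    uint32_t instance_count;
    uint32_t start_vertex_location;
    uint32_t start_instance_location;
};

// Hypothetical per-object record: a bounding sphere reduced to its depth
// and radius, plus the draw that renders the object.
struct SceneObject {
    float z;
    float radius;
    DrawArgs draw;
};

// Keeps only the draws whose bounding sphere reaches in front of near_z.
// The size of the returned vector is what a count buffer would hold.
std::vector<DrawArgs> build_indirect_args(const std::vector<SceneObject>& objects,
                                          float near_z) {
    std::vector<DrawArgs> args;
    args.reserve(objects.size());
    for (const SceneObject& obj : objects) {
        if (obj.z + obj.radius >= near_z) {
            args.push_back(obj.draw);  // visible: append to the compacted list
        }
    }
    return args;
}
```

The CPU then issues one ExecuteIndirect call with the maximum command count, and the GPU-written count decides how many of the draws actually run.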