
Metal GPU frame time behaves unintuitively

i've run into an interesting performance issue with Metal in my own app, which i've been able to reproduce by making only small adjustments to this example project. my view has a size of roughly 1600x900 and looks like this:

[image: screenshot of the view]

there are two draw calls per frame, one for the background and one for the line. the background is made up of 4 vertices and the line is around 2000 vertices. when the scene is drawn like above, Xcode's GPU frame capture tells me that the entire frame takes ~4 ms (!). some observations:

  • when the line is drawn first, the frame only takes ~30 µs
  • when only the four vertices (that make up the background) are drawn, the frame takes ~3 µs
  • when only the line is drawn, the frame takes ~40 µs
  • when drawn like above, but with only the first ~900 vertices of the line, the entire frame takes ~4 µs

this doesn't make sense to me. why do the changes described above have such a drastic effect on the frame time? it's a 100x difference.
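as a cross-check on the numbers reported by Xcode's frame capture, GPU time can also be read directly from a command buffer. the following is a minimal sketch, not taken from the demo project: it assumes macOS 10.15 or later (where `GPUStartTime` and `GPUEndTime` are available), and the blit fill is only placeholder GPU work standing in for the real render pass.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Build (assumed command line): clang gputime.m -framework Foundation -framework Metal
int main(void) {
    @autoreleasepool {
        id<MTLDevice> device = MTLCreateSystemDefaultDevice();
        if (device == nil) {
            NSLog(@"no Metal device available");
            return 1;
        }
        id<MTLCommandQueue> queue = [device newCommandQueue];

        // Placeholder workload (an assumption for this sketch):
        // fill a 16 MiB GPU-private buffer with zeros.
        NSUInteger length = 16 * 1024 * 1024;
        id<MTLBuffer> scratch = [device newBufferWithLength:length
                                                    options:MTLResourceStorageModePrivate];

        id<MTLCommandBuffer> commandBuffer = [queue commandBuffer];
        id<MTLBlitCommandEncoder> blit = [commandBuffer blitCommandEncoder];
        [blit fillBuffer:scratch range:NSMakeRange(0, length) value:0];
        [blit endEncoding];

        [commandBuffer commit];
        // Blocking is fine in a command-line test; a render loop would not do this.
        [commandBuffer waitUntilCompleted];

        // Both timestamps are in seconds and are valid once the buffer has completed.
        CFTimeInterval gpuSeconds = commandBuffer.GPUEndTime - commandBuffer.GPUStartTime;
        NSLog(@"GPU time: %.1f us", gpuSeconds * 1e6);
    }
    return 0;
}
```

inside drawInMTKView: the same two properties could be read in a block passed to addCompletedHandler: instead of blocking with waitUntilCompleted. the value covers the whole command buffer, so it may not match the per-encoder figures in the frame capture exactly.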

i'm running the code on a 2018 Mac mini (with Intel UHD Graphics 630 1536 MB), in case that is important.


here are the changes made to the demo project:

  1. create two MTLBuffers during initialisation:
AAPLVertex quadVertices[] = { ... 4 vertices omitted ... };
quadBuffer = [_device newBufferWithBytes:quadVertices length:4 * sizeof(AAPLVertex) options:MTLResourceStorageModeManaged];

AAPLVertex dataVertices[] = { ... ~2000 vertices omitted ... };
dataBuffer = [_device newBufferWithBytes:dataVertices length:2000 * sizeof(AAPLVertex) options:MTLResourceStorageModeManaged];
  2. draw both buffers in drawInMTKView:
[renderEncoder setVertexBuffer:quadBuffer offset:0 atIndex:AAPLVertexInputIndexVertices];
[renderEncoder drawPrimitives:MTLPrimitiveTypeTriangleStrip vertexStart:0 vertexCount:4];

[renderEncoder setVertexBuffer:dataBuffer offset:0 atIndex:AAPLVertexInputIndexVertices];
[renderEncoder drawPrimitives:MTLPrimitiveTypeTriangleStrip vertexStart:0 vertexCount:2000];
  3. turn on 8x MSAA: mtkView.sampleCount = 8; and pipelineStateDescriptor.sampleCount = 8;

  4. change the render pass's load action to MTLLoadActionLoad: renderPassDescriptor.colorAttachments[0].loadAction = MTLLoadActionLoad;

edit: the project is available on my GitHub.

edit 2: i ran the example project on a 2020 M1 MacBook and wasn't able to reproduce any of the bullet points there. the total frame time was around 100 µs for the base case. however, i had to use an MSAA factor of 4, since M1s apparently don't support 8.
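rather than hard-coding the MSAA factor per machine, the device can be asked which sample counts it supports. a minimal sketch using `-[MTLDevice supportsTextureSampleCount:]`; the candidate list and the fallback to 1 are assumptions made for this example, not part of the demo project.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Build (assumed command line): clang samples.m -framework Foundation -framework Metal
int main(void) {
    @autoreleasepool {
        id<MTLDevice> device = MTLCreateSystemDefaultDevice();
        if (device == nil) {
            NSLog(@"no Metal device available");
            return 1;
        }

        // Candidates in order of preference (an assumption for this sketch).
        const NSUInteger candidates[] = {8, 4, 2};
        const size_t candidateCount = sizeof(candidates) / sizeof(candidates[0]);

        // Fall back to 1 (no multisampling) if none of the candidates is supported.
        NSUInteger sampleCount = 1;
        for (size_t i = 0; i < candidateCount; i++) {
            if ([device supportsTextureSampleCount:candidates[i]]) {
                sampleCount = candidates[i];
                break;
            }
        }

        NSLog(@"%@: using sample count %lu", device.name, (unsigned long)sampleCount);
    }
    return 0;
}
```

the resulting value would then be assigned to both mtkView.sampleCount and pipelineStateDescriptor.sampleCount, which have to agree.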


to be transparent, i've also asked this question on the Apple Developer forums: https://developer.apple.com/forums/thread/695245 (i hope that's ok)

asked Nov 07 '22 00:11 by maxjvh


1 Answer

This looks like a bug. I ran your sample project on the following processors: M1, M1 Pro, M1 Max, and Radeon Pro 5500M.

I wasn't able to reproduce your issue. I suggest you file a bug report.

answered Nov 15 '22 12:11 by Hamid Yusifli