
glDrawElements massive cpu usage on iOS

Hardware: iPad 2. Software: OpenGL ES 2.0, C++.

glDrawElements seems to take up about 25% of the CPU time, which puts each frame at about 18 ms on the CPU and 10 ms on the GPU.

When I don't use an index buffer and call glDrawArrays instead, it speeds up and glDrawArrays barely shows up in the profiler. Everything else is the same, except that the glDrawArrays version has more verts because I have to duplicate verts in the VBO without the index buffer.
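To make the trade-off concrete, here is a minimal sketch of what "duplicating verts" amounts to, and how the upload sizes compare. The names `Vertex2f`, `expandIndexed`, `indexedBytes` and `unindexedBytes` are made up for illustration and are not from the original code; the two-float vertex and 16-bit indices follow the setup described in this question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Two floats per vertex (8 bytes), matching the layout described in the question.
struct Vertex2f {
    float x;
    float y;
};

// Expand an indexed mesh into a flat vertex list suitable for glDrawArrays.
// Every index becomes its own vertex, so shared vertices are duplicated.
std::vector<Vertex2f> expandIndexed(const std::vector<Vertex2f>& vertices,
                                    const std::vector<std::uint16_t>& indices) {
    std::vector<Vertex2f> flat;
    flat.reserve(indices.size());
    for (std::uint16_t i : indices) {
        assert(i < vertices.size());  // an out-of-range index would read garbage
        flat.push_back(vertices[i]);
    }
    return flat;
}

// Bytes stored for the indexed form: vertex buffer plus 16-bit index buffer.
std::size_t indexedBytes(std::size_t vertexCount, std::size_t indexCount) {
    return vertexCount * sizeof(Vertex2f) + indexCount * sizeof(std::uint16_t);
}

// Bytes stored for the unindexed form: one full vertex per index.
std::size_t unindexedBytes(std::size_t indexCount) {
    return indexCount * sizeof(Vertex2f);
}
```

For a quad drawn as two triangles (4 vertices, 6 indices) this gives 44 bytes indexed versus 48 bytes unindexed. With a vertex this small, each 16-bit index is already a quarter of the size of a vertex, so the memory saved by indexing is modest unless vertices are shared heavily.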

so far:

  • virtually the same amount of state changes between the two methods
  • vertex structure is two floats (8 bytes)
  • index buffer is 16-bit (tried 32-bit as well)
  • GL_STATIC_DRAW for both buffers
  • buffers don't change after load
  • the same VBO and the indexbuffer render multiple times per frame, with different offsets and sizes
  • no opengl errors
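Since the same index buffer is drawn several times per frame with different offsets and sizes, one thing that may be worth ruling out is the offset arithmetic: the last argument of glDrawElements is a byte offset into the bound element array buffer, not an index count. The sketch below is only an illustration. `DrawRange`, `indexByteOffset` and `rangeFits` are hypothetical names, and nothing here is claimed to be the cause of the slowdown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of one sub-range of the shared index buffer.
struct DrawRange {
    std::size_t firstIndex;  // position of the first index, counted in indices
    std::size_t indexCount;  // number of indices to draw
};

// Convert the starting position from "number of indices" to bytes,
// assuming 16-bit (GL_UNSIGNED_SHORT) indices.
std::size_t indexByteOffset(const DrawRange& range) {
    return range.firstIndex * sizeof(std::uint16_t);
}

// True when the range lies entirely inside a buffer of totalIndices indices.
bool rangeFits(const DrawRange& range, std::size_t totalIndices) {
    return range.firstIndex <= totalIndices &&
           range.indexCount <= totalIndices - range.firstIndex;
}

// With the VBO and index buffer bound, the draw call would then look like:
//   glDrawElements(GL_TRIANGLES,
//                  static_cast<GLsizei>(range.indexCount),
//                  GL_UNSIGNED_SHORT,
//                  reinterpret_cast<const void*>(indexByteOffset(range)));
```

A range starting at index 300 therefore begins at byte 600 of the index buffer. With 32-bit indices the multiplier would be `sizeof(std::uint32_t)` and the type `GL_UNSIGNED_INT`, which in OpenGL ES 2.0 requires the OES_element_index_uint extension.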

So it looks like it's doing a software fallback of some sort, but I can't figure out what would cause OpenGL to fall back.

asked Oct 31 '22 15:10 by myro


1 Answer

There are a few things that immediately jump to mind that might affect speed the way you describe.

For one, many commands are issued lazily to reduce the number of bus transfers. They are queued up and wait for the next batch transfer. State changes, texture changes, and similar commands all accumulate. It is possible that the draw commands trigger a larger transfer in one case but not in the other, or that you trigger transfers more frequently in one case than in the other.

For another, your specific models might be better organized for one draw call or the other. You need to look at how big they are, whether they reuse index values, and whether they are optimized or reordered for rendering. glDrawArrays may require more data to be transferred, but if your models are small the overhead may not be much of a concern.

Draw frequency matters too. You want to submit calls often enough to keep the GPU busy while your CPU does other work, rather than letting commands pile up in the command buffer waiting to be sent, but each transfer has a cost, so the two need to be balanced.

And to top it off, indexed vertices benefit from cache effects when they are frequently reused, while linearly accessed arrays benefit from cache effects because they are accessed linearly. You need to know your data, since different kinds of data benefit from different methods.
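One rough way to check how well an index list reuses vertices is to simulate a small vertex cache and count the misses per triangle. The sketch below assumes a simple FIFO cache and a GL_TRIANGLES index list. Both the FIFO model and whatever cache size you pass in are assumptions for illustration, not the documented behavior of the iPad 2 GPU, and the function names are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Count how many indices miss a simulated FIFO cache of the given size.
std::size_t countCacheMisses(const std::vector<std::uint16_t>& indices,
                             std::size_t cacheSize) {
    assert(cacheSize > 0);
    std::deque<std::uint16_t> cache;
    std::size_t misses = 0;
    for (std::uint16_t index : indices) {
        if (std::find(cache.begin(), cache.end(), index) != cache.end()) {
            continue;  // hit: the already processed vertex could be reused
        }
        ++misses;
        cache.push_back(index);
        if (cache.size() > cacheSize) {
            cache.pop_front();  // evict the oldest entry
        }
    }
    return misses;
}

// Average misses per triangle, assuming a GL_TRIANGLES list (3 indices each).
double averageCacheMissRatio(const std::vector<std::uint16_t>& indices,
                             std::size_t cacheSize) {
    assert(!indices.empty() && indices.size() % 3 == 0);
    return static_cast<double>(countCacheMisses(indices, cacheSize)) /
           static_cast<double>(indices.size() / 3);
}
```

An unindexed triangle list always processes 3 vertices per triangle, so a ratio close to 3 suggests the index buffer is buying very little reuse, while a well-ordered mesh with lots of shared vertices comes out much lower. Treat the number as a relative hint for comparing orderings of the same mesh, not as a prediction of frame time.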

Even Apple seems to be unsure which method to use.

Up through iOS 7, the OpenGL ES Programming Guide for iOS said:

For best performance, your models should be submitted as a single unindexed triangle strip using glDrawArrays with as few duplicated vertices as possible. If your models require many vertices to be duplicated (...), you may obtain better performance using a separate index buffer and calling glDrawElements instead. ... For best results, test your models using both indexed and unindexed triangle strips, and use the one that performs the fastest.

But the updated OpenGL ES Programming Guide for iOS, which applies to iOS 8, advises the opposite:

For best performance, your models should be submitted as a single indexed triangle strip. To avoid specifying data for the same vertex multiple times in the vertex buffer, use a separate index buffer and draw the triangle strip using the glDrawElements function

It looks like in your case you have just tried both, and found that one method is better suited for your data.

answered Nov 11 '22 14:11 by Bryan