I added a new GL renderer to my engine that uses the core profile. While it runs fine on Windows and on NVIDIA cards, it is about 10 times slower on OS X (3 fps instead of 30). The weird thing is that my compatibility-profile renderer runs fine.
I collected some traces with Instruments and the GL profiler:
https://www.dropbox.com/sh/311fg9wu0zrarzm/31CGvUcf2q
It shows that the application spends its time in glDrawRangeElements. I tried the following things:
What I didn't try is aligning my vertices to a 16-byte boundary and converting the indices to 4 bytes, but seriously, if that were the issue, why the hell would the standard allow it?
I'm creating the context like this:
NSOpenGLPixelFormatAttribute attributes[] =
{
    NSOpenGLPFAColorSize, 24,
    NSOpenGLPFAAlphaSize, 8,
    NSOpenGLPFADepthSize, 24,
    NSOpenGLPFAStencilSize, 8,
    NSOpenGLPFADoubleBuffer,
    NSOpenGLPFAAccelerated,
    NSOpenGLPFANoRecovery,
    NSOpenGLPFAOpenGLProfile, NSOpenGLProfileVersion3_2Core,
    0
};

NSOpenGLPixelFormat* format = [[NSOpenGLPixelFormat alloc] initWithAttributes:attributes];
NSOpenGLContext* context = [[NSOpenGLContext alloc] initWithFormat:format shareContext:nil];

[self.view setOpenGLContext:context];
[context makeCurrentContext];
Tried on the following specs:
Do you have any idea what I might be doing wrong? Again, it works fine with the compatibility profile (which doesn't use VAOs, though).
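One core-vs-compatibility difference worth noting for anyone with a similar setup: a 3.2 core context won't draw with the default vertex array object, so the core path needs an explicit VAO bound during setup. A minimal fragment (assumes a current core context; not runnable on its own):

```
GLuint vao = 0;
glGenVertexArrays(1, &vao); /* in the core profile, object 0 is not a valid VAO */
glBindVertexArray(vao);     /* bind before glVertexAttribPointer and any draw call */
```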
UPDATE: reported to Apple.
UPDATE: Apple doesn't give a damn about the problem... Anyway, I created a small test program, which actually performs fine. Comparing the call stacks in Instruments, I found that when using the engine, glDrawRangeElements makes two calls:
while the test program makes only the second. The first call does something like an immediate-mode render (gleFlushPrimitivesTCLFunc, gleRunVertexSubmitterImmediate), which obviously causes the slowdown.
OpenGL is officially deprecated by Apple starting with macOS Mojave 10.14.
If you look at Windows or Linux OpenGL implementations, AMD and NVIDIA support loads of extensions. Those don't exist on macOS, because Apple controls which extensions are available. It didn't have to do that; it could have let the driver expose whatever it wanted, as is done on other platforms.
Finally, I was able to reproduce the slowdown. This is just crazy... It is clearly caused by glBindAttribLocation being called on the "my_Position" attribute. I did some testing:
Obviously I relink the program afterwards (check the code). It is not a problem in my implementation; I tested it with "normal" values too.
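For clarity, the triggering pattern boils down to this fragment (program here stands for an already-compiled program object; not runnable on its own):

```
glBindAttribLocation(program, 0, "my_Position"); /* this name triggers the slow path */
glLinkProgram(program);                          /* bindings only take effect on the next link */
```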
Test program:
https://www.dropbox.com/s/dgg48g1fwgyc5h0/SLOWDOWN_REPRO.zip
How to repro:
I have managed to get the same problem in the following circumstances under OS X Mavericks:

- instanced rendering using array buffers to give each instance its own modelToWorld and inverseNormal matrices, with attribute locations specified through layout qualifiers rather than glGetAttribLocation;
- leaving one of these array buffers unused in the shader, where the location is declared but the attribute isn't actually used for anything in the GLSL code.
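A minimal GLSL illustration of the second point (identifier names and locations are made up): inverseNormal below has a declared location but is never read, which is enough to hit the slow path described here:

```glsl
#version 410

layout(location = 0) in vec3 position;
layout(location = 3) in mat4 inverseNormal; // declared, but never read below

uniform mat4 viewProj;

void main()
{
    // inverseNormal is never referenced, so the compiler may strip it,
    // leaving the instanced array buffer bound to locations 3-6 "dangling".
    gl_Position = viewProj * vec4(position, 1.0);
}
```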
In this case, a call to glDrawElementsInstanced takes up a LOT of CPU time (under normal circumstances this call uses nearly zero CPU, even when drawing several thousand instances). You can tell you're hitting this specific problem if almost all of the CPU time used within glDrawElementsInstanced is spent in gleDrawArraysOrElements_ExecCore. Making sure that all of the array buffers are actually referenced in your shader code brings the CPU time back down to (nearly) zero.
I suspect this is one of the situations where leaving a variable unused in your GLSL main() confuses the compiler into deleting all references to that variable, leaving you with a dangling reference to an attribute or uniform.