
glGetError and performance

Background

At work, we develop two products which both have OpenGL 3.x+ and GLES 2.0/3.0+ backends. Teams are independent, but do have some overlap, and we were recently discussing performance of glGetError.

In both products, the design is such that no GL call should generate an error code recorded by glGetError. To detect such errors, in debug builds we have a macro that adds a glGetError call after each GL call and asserts if any error is detected, since that indicates a bug. On my product this is enabled by default; in the other product, it must be enabled explicitly.
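
The exact macro isn't shown here, but a minimal sketch of such a wrapper might look like the following (GL_CHECK and CheckGLError are illustrative names, not from either codebase; it assumes the appropriate GL/GLES header is already included):

    #include <cassert>
    #include <cstdio>

    // Drain the GL error queue after a call and assert on the first error found.
    inline void CheckGLError(const char* call, const char* file, int line)
    {
        GLenum err;
        while ((err = glGetError()) != GL_NO_ERROR)
        {
            std::fprintf(stderr, "GL error 0x%04X after %s (%s:%d)\n",
                         err, call, file, line);
            assert(false && "Unexpected OpenGL error");
        }
    }

    #ifdef ENABLE_GL_ERROR_CHECKS
    // Debug: every wrapped call is followed by an error check.
    #define GL_CHECK(call) do { call; CheckGLError(#call, __FILE__, __LINE__); } while (0)
    #else
    // Release: the macro collapses to the bare call, adding no overhead.
    #define GL_CHECK(call) call
    #endif

    // Usage: GL_CHECK(glBindTexture(GL_TEXTURE_2D, tex));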

These checks have been present in the codebase of the product I work on for many years, and we see that they cause performance hits, generally in the neighbourhood of 25%, across many platforms. We have decided this is a reasonable price to pay for early detection of errors. The other team claims that, under some circumstances, adding these checks can slow their product from 60 FPS to < 1 FPS, making it unusable, which is why they are not enabled by default. Both products run on many OpenGL/GLES platforms (PC, OSX, Linux, iOS and Android).

Questions

I understand the reasoning behind glGetError reducing performance: it (may) require a CPU/GPU synchronization for the status of the previous operation to be correct. From my understanding, this should change the expected frame time from "MAX(CPU time, GPU time)" (assuming no other sync points and no queued frames) to "CPU time + GPU time + synchronization overhead" (assuming every glGetError call results in a sync point). Is this reasoning incorrect, or is there an additional reason why glGetError reduces performance?
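
To put purely illustrative numbers on that (these are made up, not measurements from either product): with 8 ms of CPU work and 12 ms of GPU work per frame,

    MAX(CPU time, GPU time)         = MAX(8 ms, 12 ms)        = 12 ms   (~83 FPS)
    CPU time + GPU time + sync cost = 8 ms + 12 ms + overhead > 20 ms   (< 50 FPS)

Under these assumptions the frame time roughly doubles; how much worse it gets depends on how expensive each individual sync point is.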

I was always under the impression that per-call glGetError in debug builds was a reasonable thing to do (at least after GL calls where no errors should be possible). Is that not the case, or is it not considered a 'best practice'? Are there circumstances that can cause extreme performance issues like the ones the other team described (e.g. a particular set of GL calls and/or a particular platform)?

MuertoExcobito (asked May 31 '15)


3 Answers

glGetError() does not really have to wait for anything from the GPU. All the errors it reports are from checking arguments of API calls, as well as internal state managed by the driver. So CPU/GPU synchronization does not come into play here.

The only error that may appear deferred is GL_OUT_OF_MEMORY, but the spec is fairly open about this one ("may be generated"), so it's not really a reason for synchronization either.

I can think of two reasons why calling glGetError() after each API call may significantly reduce performance:

  • You make twice as many OpenGL calls. There is overhead for the call itself, as well as for checking and returning the error state. While calling glGetError() once may not be very expensive, it adds up if you call it millions of times.
  • Some drivers use multithreading inside the driver. In this case, glGetError() will cause synchronization between threads in the driver, which could have a very substantial performance impact if it happens very frequently.

As to what you should be doing, you really have to find out what works. Some thoughts/suggestions:

  • I definitely would not call glGetError() in a release build. It's very useful during debugging, but unnecessary overhead once your testing/QA is completed.
  • Errors are sticky, so if you just want to know whether any error occurred, you don't have to call glGetError() after each call. For example, you can call it once at the end of each frame (a per-frame check is sketched after the list below). Of course, if you get an error and want to know which call caused it, you need the more frequent calls. So you could have multiple build types:

    • Release build with no glGetError() calls.
    • Testing/QA build with glGetError() call at the end of each frame.
    • Debug build with glGetError() call after each OpenGL call.
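
Because errors are sticky, the testing/QA variant only needs one drain loop per frame. A hedged sketch (CheckFrameErrors is a made-up name, and the appropriate GL/GLES header is assumed to be included):

    #include <cstdio>

    // Call once per frame, e.g. right before the buffer swap.
    // glGetError may track several error flags, so loop until GL_NO_ERROR to
    // clear them all; note that only the first error per flag is retained.
    inline bool CheckFrameErrors()
    {
        bool ok = true;
        GLenum err;
        while ((err = glGetError()) != GL_NO_ERROR)
        {
            std::fprintf(stderr, "GL error 0x%04X detected this frame\n", err);
            ok = false;
        }
        return ok;
    }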

Reto Koradi


There may be some sort of CPU/GPU sync required to query the error state, but I think that concern is overblown. It's nothing at all like reading back the result of a rendering operation that's still in flight or pending. The error state is something that's validated and set before commands are executed; it will usually alert you to invalid API usage or state setup, but not much else.

Modern OpenGL implementations have a much more sophisticated extension / core feature for tracking debug information, called simply "Debug Output". You have tagged this OpenGL as well as OpenGL ES, so it may not be available for all deployments of your software, but when using an OpenGL or ES implementation where this feature is available, it should really be your go-to solution for this. You will get error information of course, but additionally warnings for things like deprecation and performance (it's really down to how verbose the driver is; I've seen some drivers that give really excellent warnings and others that don't use the feature at all).

You can run debug output synchronously, which may introduce the performance penalties you discussed in your question, or asynchronously, which tends to be more performance friendly but slightly less useful when trying to track down the cause of a problem in real time. There's no one-size-fits-all solution, and that's why debug output is much more flexible and explicit than glGetError (...).
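
As a hedged sketch (not part of the original answer) of wiring this up on a GL 4.3+ or KHR_debug-capable desktop context, assuming a debug context and a function loader (e.g. GLAD) are already in place:

    #include <cstdio>

    // Called by the driver for each debug message; the body here is illustrative.
    static void APIENTRY OnGLDebugMessage(GLenum source, GLenum type, GLuint id,
                                          GLenum severity, GLsizei length,
                                          const GLchar* message, const void* userParam)
    {
        std::fprintf(stderr, "[GL debug] severity=0x%04X type=0x%04X: %s\n",
                     severity, type, message);
    }

    void EnableGLDebugOutput(bool synchronous)
    {
        glEnable(GL_DEBUG_OUTPUT);
        if (synchronous)
            glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS);   // messages arrive on the offending call
        else
            glDisable(GL_DEBUG_OUTPUT_SYNCHRONOUS);  // lower overhead, possibly delayed

        glDebugMessageCallback(OnGLDebugMessage, nullptr);
        // Receive every source/type/severity; narrow this down once it gets noisy.
        glDebugMessageControl(GL_DONT_CARE, GL_DONT_CARE, GL_DONT_CARE, 0, nullptr, GL_TRUE);
    }

The GL_DEBUG_OUTPUT_SYNCHRONOUS toggle is the synchronous/asynchronous trade-off described above: synchronous delivery pins each message to the call that caused it, at a cost similar to the sync concerns discussed in the question.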


Andon M. Coleman


Well, I would consider triggering a full CPU/GPU sync very unlikely in this case (but not impossible). The GPU knows nothing about GL client-side errors, and all the resources the GPU is going to use are managed by the CPU, so there is not much that can go wrong there that the GPU could report. Usually, if things can go "wrong" on the GPU side as a result of some user error, the results are just undefined, but will not trigger a GL error.

Having said that, I don't want to imply that the overhead of a glGetError call is low. Modern GL implementations are heavily multithreaded. Usually, the GL call itself will just forward the commands and data to some other worker thread in the background and try to return as early as possible so that the application can go on. Querying an error means you have to sync with all those worker threads, which may lag significantly behind.

Is there some circumstance(s) that can cause extreme performance issues such as the other team described

Well, the reported performance impact is definitely possible. But trying to find out what exactly triggers it will be very difficult. I'm not aware of any specific conditions under which error checks are extraordinarily bad, and I doubt that an easy set of rules of thumb can be derived for such things; the complexity is just too high.

When you ask for best practices, we enter the realm of opinions. It will always depend on the specific scenario. I have never had error checks after each GL call. I have some error checks at "strategic places" which are always enabled, usually at resource setup, but never in the "fast path". Furthermore, I used to have additional checks at "strategic" places enabled by default in debug builds. I also often had some extra macro to enable more checks, to narrow down occurring errors easily.

However, these checks became less and less useful over time. Nowadays, there are GL debugging tools which can help you identify the failing GL call.

Another very useful concept is debug contexts, as introduced by the ARB_debug_output or KHR_debug extensions (the latter is also defined as a GLES extension, but I don't know how widely available it is). This basically allows setting up a callback which the GL will call, so the "polling" for errors is replaced by a notification mechanism. I strongly recommend using debug contexts in debug builds (if available, of course). It might even be a good idea to be able to optionally enable them in release builds, since that might help debugging on a customer's system while introducing absolutely no overhead as long as they stay disabled.
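
A hedged GLES-side sketch of that optional enabling (the enum, typedef and entry-point names are taken from the Khronos gl2ext.h/EGL headers; verify them against your own headers and loader):

    #include <GLES2/gl2.h>
    #include <GLES2/gl2ext.h>
    #include <EGL/egl.h>
    #include <cstdio>
    #include <cstring>

    // Illustrative callback; the body just logs the message.
    static void GL_APIENTRY OnGLDebugMessageKHR(GLenum, GLenum, GLuint, GLenum severity,
                                                GLsizei, const GLchar* message, const void*)
    {
        std::fprintf(stderr, "[GLES debug] severity=0x%04X: %s\n", severity, message);
    }

    // Only wire up KHR_debug when the extension is present and the user asked
    // for it; when disabled, nothing is registered and nothing is paid.
    bool TryEnableDebugOutputES(bool userRequestedDebug)
    {
        if (!userRequestedDebug)
            return false;

        const char* ext = reinterpret_cast<const char*>(glGetString(GL_EXTENSIONS));
        if (!ext || !std::strstr(ext, "GL_KHR_debug"))
            return false;  // extension not exposed by this driver

        // On ES the entry point carries the KHR suffix; resolve it dynamically.
        auto setCallback = reinterpret_cast<PFNGLDEBUGMESSAGECALLBACKKHRPROC>(
            eglGetProcAddress("glDebugMessageCallbackKHR"));
        if (!setCallback)
            return false;

        glEnable(GL_DEBUG_OUTPUT_KHR);
        setCallback(OnGLDebugMessageKHR, nullptr);
        return true;
    }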


derhass