Under what conditions does Metal shader code "crash?"

Tags: ios, metal

I'm developing a Metal-based app, and in some cases properly compiled and linked shader code will cause the application to simply crash without throwing any errors.

A "crash" consists of a halt in visual output (in some cases preceded by a short stutter of a couple of alternating frames), while the rest of the application carries on normally. The Xcode performance monitoring utilities report 60fps but 0ms GPU latency, and CPU-side execution continues, with calls to the Metal API still completing successfully.

No errors are reported to the console.

This is extremely difficult to debug, as I have no indication of where in shader code the error is coming from. It would help if I knew under what conditions this is actually supposed to happen, so that I can have a good list of things to check. Otherwise I'm just shooting in the dark whenever this comes up.

lcmylin asked Oct 20 '22 05:10



2 Answers

The GPU can crash when you read or write off the end of a MTLBuffer, write off the end of a MTLTexture, or simply run too long. A watchdog timer resets the GPU if it doesn't complete its work within a few seconds. Work on the GPU is not preemptively scheduled, so long-running work can make the device seem locked up by preventing basic GUI tasks from executing. If you have a long-running workload, split it up into many smaller kernels. To keep the interface responsive, keep each workload under 100 ms. To avoid video stuttering, aim for a consistent frame rate.
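As a rough sketch of the "split it up into many smaller kernels" advice, the index arithmetic for slicing one large workload into fixed-size pieces might look like the following. It is plain C, so it also compiles in an Objective-C file. `WorkChunk`, `chunk_count` and `chunk_at` are hypothetical names used only for illustration. They are not part of Metal, and the chunk size that keeps each piece under 100 ms has to be found by measurement on the target device.

```c
#include <stddef.h>

/* Hypothetical helper type: one slice of a larger workload. */
typedef struct {
    size_t offset; /* index of the first item in this slice */
    size_t count;  /* number of items in this slice */
} WorkChunk;

/* Number of chunks needed to cover `total` items, `chunk_size` items at a time. */
static size_t chunk_count(size_t total, size_t chunk_size) {
    if (chunk_size == 0) {
        return 0; /* no valid split */
    }
    return total / chunk_size + (total % chunk_size != 0 ? 1 : 0);
}

/* The `index`-th chunk; the final chunk may be shorter than `chunk_size`.
   An out-of-range index yields an empty chunk (count == 0). */
static WorkChunk chunk_at(size_t total, size_t chunk_size, size_t index) {
    WorkChunk chunk = {0, 0};
    if (index >= chunk_count(total, chunk_size)) {
        return chunk;
    }
    chunk.offset = index * chunk_size;
    size_t remaining = total - chunk.offset;
    chunk.count = remaining < chunk_size ? remaining : chunk_size;
    return chunk;
}
```

Each chunk would then be submitted as its own, smaller piece of GPU work, with `offset` passed to the kernel so it knows which items to process. Checking `offset + count` against the buffer length on the CPU side is also a cheap guard against the off-the-end accesses mentioned above.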

Ian Ollmann answered Nov 15 '22 04:11



I was having frequent crashes due to heavy Metal shaders as well, and managed to fix it by throttling the dispatch rate. You can do this easily by measuring the runtime of the last "frame" and inserting a wait before every dispatch that is a fixed ratio of that runtime:

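// Sleep in proportion to how long the previous frame took.
[NSThread sleepForTimeInterval:_lastRunTime * RATIO];
NSDate *startTime = [NSDate date];
... [use Metal shaders] ...
// timeIntervalSinceNow is negative for a date in the past, so negate it.
_lastRunTime = -[startTime timeIntervalSinceNow];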

I set RATIO to 1.0, so each wait lasts as long as the previous frame took and the GPU is busy for at most about 50% of the time. This obviously impacts frame rate, but it beats random crashes. You can play with the ratio. The nice thing is that you don't have to worry about throttling too much or too little on different products, as it's a ratio of the runtime.
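To see how other ratios trade frame rate against GPU load, here is a small sketch of the arithmetic in plain C. `throttle_delay` and `max_busy_fraction` are hypothetical helper names, and the busy fraction assumes the measured runtime is roughly steady from frame to frame and approximates the time the GPU is actually working, which may not hold in every app.

```c
/* Wait to insert before the next dispatch, given the last measured runtime.
   Both values are in seconds; negative inputs are treated as "no wait". */
static double throttle_delay(double last_run_time, double ratio) {
    if (last_run_time < 0.0 || ratio < 0.0) {
        return 0.0;
    }
    return last_run_time * ratio;
}

/* Approximate upper bound on the share of wall-clock time spent on GPU work:
   each cycle is one runtime of work plus `ratio` runtimes of waiting. */
static double max_busy_fraction(double ratio) {
    if (ratio < 0.0) {
        ratio = 0.0;
    }
    return 1.0 / (1.0 + ratio);
}
```

Under those assumptions, a ratio of 1.0 gives a busy fraction of 0.5, and a ratio of 3.0 would lower it to 0.25 at the cost of a correspondingly lower frame rate.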

Hashman answered Nov 15 '22 04:11
