In my Android application, I use a SurfaceView to draw things. It has been working fine on thousands of devices, except that users have now started reporting ANRs on the following devices: LG G4, Sony Xperia Z4, Huawei Ascend Mate 7 and HTC M9.
So I got an LG G4 and was indeed able to verify the problem. It's directly related to the SurfaceView.
Now guess what fixed the issue after hours of debugging? Replacing ...
mSurfaceHolder.unlockCanvasAndPost(c);
... with ...
mSurfaceHolder.unlockCanvasAndPost(c);
System.out.println("123"); // THIS IS THE FIX
How can this be?
The following code is my render thread that has been working fine except for the mentioned devices:
import android.graphics.Canvas;
import android.view.SurfaceHolder;

public class MyThread extends Thread {

    private final SurfaceHolder mSurfaceHolder;
    private final MySurfaceView mSurface;
    private volatile boolean mRunning = false;

    public MyThread(SurfaceHolder surfaceHolder, MySurfaceView surface) {
        mSurfaceHolder = surfaceHolder;
        mSurface = surface;
    }

    public void setRunning(boolean run) {
        mRunning = run;
    }

    @Override
    public void run() {
        Canvas c;
        while (mRunning) {
            c = null;
            try {
                c = mSurfaceHolder.lockCanvas();
                if (c != null) {
                    mSurface.doDraw(c);
                }
            }
            finally { // when exception is thrown above we may not leave the surface in an inconsistent state
                if (c != null) {
                    try {
                        mSurfaceHolder.unlockCanvasAndPost(c);
                    }
                    catch (Exception e) { }
                }
            }
        }
    }

}
The code is, in parts, from the LunarLander example in the Android SDK, more specifically LunarView.java.
Updating the code to match the improved example from Android 6.0 (API level 23) yields the following:
import android.graphics.Canvas;
import android.view.SurfaceHolder;

public class MyThread extends Thread {

    /** Handle to the surface manager object that we interact with */
    private final SurfaceHolder mSurfaceHolder;
    private final MySurfaceView mSurface;
    /** Used to signal the thread whether it should be running or not */
    private boolean mRunning = false;
    /** Lock for `mRunning` member */
    private final Object mRunningLock = new Object();

    public MyThread(SurfaceHolder surfaceHolder, MySurfaceView surface) {
        mSurfaceHolder = surfaceHolder;
        mSurface = surface;
    }

    /**
     * Used to signal the thread whether it should be running or not
     *
     * @param running `true` to run or `false` to shut down
     */
    public void setRunning(final boolean running) {
        // do not allow modification while any canvas operations are still going on (see `run()`)
        synchronized (mRunningLock) {
            mRunning = running;
        }
    }

    @Override
    public void run() {
        while (mRunning) {
            Canvas c = null;
            try {
                c = mSurfaceHolder.lockCanvas(null);
                synchronized (mSurfaceHolder) {
                    // do not allow flag to be set to `false` until all canvas draw operations are complete
                    synchronized (mRunningLock) {
                        // stop canvas operations if flag has been set to `false`
                        if (mRunning) {
                            mSurface.doDraw(c);
                        }
                    }
                }
            }
            // if an exception is thrown during the above, don't leave the view in an inconsistent state
            finally {
                if (c != null) {
                    mSurfaceHolder.unlockCanvasAndPost(c);
                }
            }
        }
    }

}
But still, this class does not work on the mentioned devices. I get a black screen and the application stops responding.
The only thing (that I have found) that fixes the problem is adding the System.out.println("123") call. Adding a short sleep at the end of the loop turned out to have the same effect:
try {
    Thread.sleep(10);
}
catch (InterruptedException e) { }
But these are no real fixes, are they? Isn't that strange?
Depending on what changes I make to the code, I'm also able to see an exception in the error log. There are many developers with the same problem, but unfortunately none of them provides a solution for my (device-specific) case.
Can you help?
What is currently working for me, although it only fights the symptoms rather than fixing the cause of the problem:
Removing Canvas operations
My render thread calls the custom method doDraw(Canvas canvas) on the SurfaceView subclass.
In that method, if I remove all calls to Canvas.drawBitmap(...), Canvas.drawRect(...) and other operations on the Canvas, the app does not freeze anymore.
A single call to Canvas.drawColor(int color) may be left in the method. Even costly operations like BitmapFactory.decodeResource(Resources res, int id, Options opts) and reading from or writing to my internal Bitmap cache are fine. No freezes.
Obviously, without any drawing, the SurfaceView is not really helpful.
The method that my render thread executes:
@Override
public void run() {
    Canvas c;
    while (mRunning) {
        c = null;
        try {
            c = mSurfaceHolder.lockCanvas();
            if (c != null) {
                mSurface.doDraw(c);
            }
        }
        finally {
            if (c != null) {
                try {
                    mSurfaceHolder.unlockCanvasAndPost(c);
                }
                catch (Exception e) { }
            }
        }
    }
}
Simply adding a short sleep time within the loop (e.g. at the end) fixes all freezing on the LG G4:
while (mRunning) {
    ...
    try { Thread.sleep(10); } catch (Exception e) { }
}
But who knows why this works and if this really fixes the problem (on all devices).
Printing to System.out
Strangely, the same thing that worked with Thread.sleep(...) above also works with System.out.println("123").
This is how I start my render thread from within the SurfaceView:
@Override
public void surfaceCreated(SurfaceHolder surfaceHolder) {
    mRenderThread = new MyThread(getHolder(), this);
    mRenderThread.setRunning(true);
    mRenderThread.start();
}
When wrapping these three lines inside the following delayed execution, the app does not freeze anymore:
new Handler().postDelayed(new Runnable() {
    @Override
    public void run() {
        ...
    }
}, 10);
This seems to be because the presumed deadlock only occurs right at the beginning. Once that is avoided (with the delayed execution), no other deadlock occurs and the app runs just fine.
But when leaving the Activity, the app freezes again.
Apart from the LG G4, Sony Xperia Z4, Huawei Ascend Mate 7, HTC M9 (and probably a few other devices), the app is working fine on thousands of devices.
Could this be a device-specific glitch? One would surely have heard about this ...
All these "solutions" are hacky. I wish there was a better solution -- and I bet there is!
Look at the ANR trace. Where does it appear to be stuck? ANRs mean the main UI thread is failing to respond, so what you're doing on the renderer thread is irrelevant unless the two are fighting over a lock.
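The ANR trace itself is written by the system, so the following is only an in-process analogue of what to look for. It is a minimal plain-Java sketch (the names ThreadDumpSketch, dumpAllThreads and stateOfBlockedWorker are made up for illustration) that prints every thread's state and stack via Thread.getAllStackTraces(). A thread reported as BLOCKED is waiting for a monitor that some other thread holds, which is the pattern to look for on the main thread:

```java
import java.util.Map;

class ThreadDumpSketch {

    /** Formats the name, state and stack of every live thread in this process. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /**
     * Hypothetical demo: holds a lock, starts a worker that needs the same lock,
     * waits until the worker is stuck, prints a dump and returns the worker's state.
     */
    static Thread.State stateOfBlockedWorker() {
        final Object lock = new Object();
        Thread worker = new Thread(() -> {
            synchronized (lock) { /* nothing to do, acquiring the lock is the point */ }
        }, "worker");
        Thread.State seen;
        synchronized (lock) {
            worker.start();
            // Poll until the worker is blocked on the monitor we are holding.
            while (worker.getState() != Thread.State.BLOCKED) {
                Thread.yield();
            }
            seen = worker.getState();
            System.out.print(dumpAllThreads());
        }
        try {
            worker.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("worker was " + stateOfBlockedWorker());
    }
}
```

In a real ANR trace the equivalent information is the main thread's entry: its state and the topmost frames show which lock or call it is waiting on.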
The symptoms you're reporting sound like a race. If your main UI thread is stuck on, say, mRunningLock, it's conceivable that your renderer thread is only leaving it unlocked for a very short window. Adding the log message or sleep call gives the main thread an opportunity to wake up and do work before the renderer thread grabs it again.
(This doesn't actually make sense to me -- your code looks like it should be stalled waiting for lockCanvas() while awaiting the display refresh -- so you need to look at the thread trace in the ANR.)
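To make the suspected race concrete, here is a minimal plain-Java sketch. It uses no Android classes; LockHandoffSketch, mainThreadGetsLock and the 10 ms pause are assumptions chosen to mirror the question, not code from the app. A "renderer" thread takes a lock, does a little work, releases it and then pauses outside the lock, while the calling thread, standing in for the UI thread, tries to take the same lock with a timeout:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

class LockHandoffSketch {

    /**
     * Runs a renderer-like loop that holds the lock while "drawing" and then sleeps
     * for pauseMillis with the lock released. Returns whether the calling thread
     * managed to acquire the lock within timeoutMillis.
     */
    static boolean mainThreadGetsLock(long pauseMillis, long timeoutMillis) {
        final ReentrantLock lock = new ReentrantLock(); // non-fair by default
        final AtomicBoolean running = new AtomicBoolean(true);

        Thread renderer = new Thread(() -> {
            while (running.get()) {
                lock.lock();
                try {
                    Math.sqrt(System.nanoTime()); // stand-in for doDraw(c)
                } finally {
                    lock.unlock();
                }
                if (pauseMillis > 0) {
                    try {
                        Thread.sleep(pauseMillis); // window in which the lock is free
                    } catch (InterruptedException e) {
                        return; // interrupted means we were asked to stop
                    }
                }
            }
        }, "renderer");
        renderer.start();

        boolean acquired = false;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            if (acquired) {
                lock.unlock();
            }
            running.set(false);
            renderer.interrupt();
        }
        try {
            renderer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired;
    }

    public static void main(String[] args) {
        System.out.println("with 10 ms pause: " + mainThreadGetsLock(10, 2000));
    }
}
```

With the pause in place the lock is free most of the time, so the waiting thread gets in easily. With a pause of 0, whether and how quickly the waiter wins is left to the scheduler and the lock implementation, which is why a stray println or sleep can appear to "fix" things.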
FWIW, you don't need to synchronize on mSurfaceHolder. An early example did that, and every example since then has cloned it.
Once you get this sorted out, you may want to read about game loops.
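As a starting point for that reading, here is a minimal sketch of frame pacing in plain Java. The names FramePacingSketch, sleepMillisFor and runLoop, and the 60 fps target, are assumptions for illustration and are not taken from any Android sample. Instead of a fixed Thread.sleep(10), the loop sleeps only for whatever is left of the frame budget after drawing:

```java
class FramePacingSketch {

    static final long TARGET_FRAME_NANOS = 1_000_000_000L / 60; // assumed 60 fps budget

    /** Milliseconds left in the frame budget, or 0 if the frame already overran. */
    static long sleepMillisFor(long frameStartNanos, long nowNanos, long targetFrameNanos) {
        long remainingNanos = targetFrameNanos - (nowNanos - frameStartNanos);
        return remainingNanos > 0 ? remainingNanos / 1_000_000L : 0L;
    }

    /** Runs up to `frames` paced iterations; drawFrame stands in for lock/draw/unlock. */
    static int runLoop(int frames, Runnable drawFrame) {
        int drawn = 0;
        for (int i = 0; i < frames; i++) {
            long start = System.nanoTime();
            drawFrame.run();
            drawn++;
            long sleepMillis = sleepMillisFor(start, System.nanoTime(), TARGET_FRAME_NANOS);
            if (sleepMillis > 0) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break; // treat interruption as a request to stop
                }
            }
        }
        return drawn;
    }

    public static void main(String[] args) {
        System.out.println(runLoop(3, () -> { }) + " frames drawn");
    }
}
```

This is only one simple pacing scheme. On a real device lockCanvas() may already block until a buffer is available, so measure before adding any sleep of your own.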