
ANR in SurfaceView on specific devices only -- The only fix is a short sleep time

In my Android application, I use a SurfaceView to draw things. It has been working fine on thousands of devices -- except that users have now started reporting ANRs on the following devices:

  • LG G4
    • Android 5.1
    • 3 GB RAM
    • 5.5" display
    • 2560 x 1440 px resolution
  • Sony Xperia Z4
    • Android 5.0
    • 3 GB RAM
    • 5.2" display
    • 1920 x 1080 px resolution
  • Huawei Ascend Mate 7
    • Android 5.1
    • 3 GB RAM
    • 6.0" display
    • 1920 x 1080 px resolution
  • HTC M9
    • Android 5.1
    • 3 GB RAM
    • 5.0" display
    • 1920 x 1080 px resolution

So I got an LG G4 and was indeed able to verify the problem. It's directly related to the SurfaceView.

Now guess what fixed the issue after hours of debugging? It is replacing ...

mSurfaceHolder.unlockCanvasAndPost(c);

... with ...

mSurfaceHolder.unlockCanvasAndPost(c);
System.out.println("123"); // THIS IS THE FIX

How can this be?

The following code is my render thread that has been working fine except for the mentioned devices:

import android.graphics.Canvas;
import android.view.SurfaceHolder;

public class MyThread extends Thread {

    private final SurfaceHolder mSurfaceHolder;
    private final MySurfaceView mSurface;
    private volatile boolean mRunning = false;

    public MyThread(SurfaceHolder surfaceHolder, MySurfaceView surface) {
        mSurfaceHolder = surfaceHolder;
        mSurface = surface;
    }

    public void setRunning(boolean run) {
        mRunning = run;
    }

    @Override
    public void run() {
        Canvas c;
        while (mRunning) {
            c = null;
            try {
                c = mSurfaceHolder.lockCanvas();
                if (c != null) {
                    mSurface.doDraw(c);
                }
            }
            finally { // if an exception is thrown above, we must not leave the surface in an inconsistent state
                if (c != null) {
                    try {
                        mSurfaceHolder.unlockCanvasAndPost(c);
                    }
                    catch (Exception e) { }
                }
            }
        }
    }

}

The code is, in parts, from the LunarLander example in the Android SDK, more specifically LunarView.java.

Updating the code to match the improved example from Android 6.0 (API level 23) yields the following:

import android.graphics.Canvas;
import android.view.SurfaceHolder;

public class MyThread extends Thread {

    /** Handle to the surface manager object that we interact with */
    private final SurfaceHolder mSurfaceHolder;
    private final MySurfaceView mSurface;
    /** Used to signal the thread whether it should be running or not */
    private boolean mRunning = false;
    /** Lock for `mRunning` member */
    private final Object mRunningLock = new Object();

    public MyThread(SurfaceHolder surfaceHolder, MySurfaceView surface) {
        mSurfaceHolder = surfaceHolder;
        mSurface = surface;
    }

    /**
     * Used to signal the thread whether it should be running or not
     *
     * @param running `true` to run or `false` to shut down
     */
    public void setRunning(final boolean running) {
        // do not allow modification while any canvas operations are still going on (see `run()`)
        synchronized (mRunningLock) {
            mRunning = running;
        }
    }

    @Override
    public void run() {
        while (mRunning) {
            Canvas c = null;

            try {
                c = mSurfaceHolder.lockCanvas(null);
                synchronized (mSurfaceHolder) {
                    // do not allow flag to be set to `false` until all canvas draw operations are complete
                    synchronized (mRunningLock) {
                        // stop canvas operations if flag has been set to `false`
                        if (mRunning) {
                            mSurface.doDraw(c);
                        }
                    }
                }
            }
            // if an exception is thrown during the above, don't leave the view in an inconsistent state
            finally {
                if (c != null) {
                    mSurfaceHolder.unlockCanvasAndPost(c);
                }
            }
        }
    }

}

But still, this class does not work on the mentioned devices. I get a black screen and the application stops responding.

The only thing (that I have found) that fixes the problem is adding the System.out.println("123") call. And adding a short sleep time at the end of the loop turned out to provide the same results:

try {
    Thread.sleep(10);
}
catch (InterruptedException e) { }

But these are no real fixes, are they? Isn't that strange?

Depending on what changes I make to the code, I'm also able to see an exception in the error log. Many developers report the same problem, but unfortunately none of them provides a solution for my (device-specific) case.

Can you help?

asked Dec 16 '15 07:12 by caw


2 Answers

Here is what currently works for me, although none of it fixes the cause of the problem; it only fights the symptoms superficially:

1. Removing Canvas operations

My render thread calls the custom method doDraw(Canvas canvas) on the SurfaceView subclass.

In that method, if I remove all calls to Canvas.drawBitmap(...), Canvas.drawRect(...) and other operations on the Canvas, the app does not freeze anymore.

A single call to Canvas.drawColor(int color) may be left in the method. And even costly operations like BitmapFactory.decodeResource(Resources res, int id, Options opts) and reading from/writing to my internal Bitmap cache are fine. No freezes.

Obviously, without any drawing, the SurfaceView is not really helpful.

2. Sleeping 10ms in the render thread's run loop

The method that my render thread executes:

@Override
public void run() {
    Canvas c;
    while (mRunning) {
        c = null;
        try {
            c = mSurfaceHolder.lockCanvas();
            if (c != null) {
                mSurface.doDraw(c);
            }
        }
        finally {
            if (c != null) {
                try {
                    mSurfaceHolder.unlockCanvasAndPost(c);
                }
                catch (Exception e) { }
            }
        }
    }
}

Simply adding a short sleep time within the loop (e.g. at the end) fixes all freezing on the LG G4:

while (mRunning) {
    ...

    try { Thread.sleep(10); } catch (Exception e) { }
}

But who knows why this works, and whether it really fixes the problem (on all devices)?

3. Printing something to System.out

Strangely, System.out.println("123") works just as well as Thread.sleep(...) above.

4. Delaying the start of the render thread by 10ms

This is how I start my render thread from within the SurfaceView:

@Override
public void surfaceCreated(SurfaceHolder surfaceHolder) {
    mRenderThread = new MyThread(getHolder(), this);
    mRenderThread.setRunning(true);
    mRenderThread.start();
}

When wrapping these three lines inside the following delayed execution, the app does not freeze anymore:

new Handler().postDelayed(new Runnable() {

    @Override
    public void run() {
        ...
    }

}, 10);

This seems to be because the presumed deadlock only occurs right at the beginning. Once it is avoided (with the delayed execution), no other deadlock follows. The app runs just fine after that.

But when leaving the Activity, the app freezes again.
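The question does not show how the render thread is stopped when the surface goes away. If the UI thread waits for the render thread with an unbounded join() (an assumption, not something confirmed above), a stuck render thread would freeze the UI thread as well. A minimal sketch of a bounded wait in plain Java; the class BoundedShutdown and its parameters are hypothetical names for illustration:

```java
/** Hypothetical helper; not part of the Android SDK or of the code in the question. */
final class BoundedShutdown {

    private BoundedShutdown() { }

    /**
     * Signals the worker to stop, then waits at most timeoutMillis for it to exit.
     *
     * @return true if the worker has terminated, false if it is still alive
     */
    static boolean stopAndJoin(Thread worker, Runnable stopSignal, long timeoutMillis) {
        if (timeoutMillis <= 0) {
            // join(0) would wait forever, which is exactly what this helper avoids
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        stopSignal.run(); // e.g. clears the flag that the worker's loop checks
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return !worker.isAlive();
    }
}
```

A false return value would at least show that the render thread never left its loop, which narrows down where to look in the thread trace.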

5. Just use a different device

Apart from the LG G4, Sony Xperia Z4, Huawei Ascend Mate 7, HTC M9 (and probably a few other devices), the app is working fine on thousands of devices.

Could this be a device-specific glitch? One would surely have heard about this ...


All these "solutions" are hacky. I wish there was a better solution -- and I bet there is!

answered Nov 19 '22 07:11 by caw


Look at the ANR trace. Where does it appear to be stuck? ANRs mean the main UI thread is failing to respond, so what you're doing on the renderer thread is irrelevant unless the two are fighting over a lock.

The symptoms you're reporting sound like a race. If your main UI thread is stuck on, say, mRunningLock, it's conceivable that your renderer thread is only leaving it unlocked for a very short window. Adding the log message or sleep call gives the main thread an opportunity to wake up and do work before the renderer thread grabs it again.
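The hypothesized race can be sketched with plain Java threads. This is only an illustration under assumptions: LockWindowDemo, its fields, and the pauseMillis parameter are hypothetical stand-ins for the render thread, mRunningLock, and the sleep workaround. Whether this is what happens on the affected devices can only be confirmed from the ANR trace.

```java
/** Hypothetical stand-in for the render thread / UI thread pair; not Android code. */
final class LockWindowDemo {

    private final Object runningLock = new Object(); // plays the role of mRunningLock
    private boolean running = true;                  // guarded by runningLock
    private long frames = 0;                         // guarded by runningLock

    /** "Render" loop: re-acquires the lock on every iteration, optionally pausing outside it. */
    private void renderLoop(long pauseMillis) {
        try {
            while (true) {
                synchronized (runningLock) {
                    if (!running) {
                        return;
                    }
                    frames++; // stands in for doDraw(c)
                }
                if (pauseMillis > 0) {
                    Thread.sleep(pauseMillis); // the window in which another thread can take the lock
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Plays the UI thread: clears the flag under the lock and reports how long the lock took to get. */
    static long stopAndMeasureMillis(long pauseMillis) {
        LockWindowDemo demo = new LockWindowDemo();
        Thread renderer = new Thread(() -> demo.renderLoop(pauseMillis), "renderer");
        renderer.setDaemon(true);
        renderer.start();
        try {
            Thread.sleep(50); // let the renderer run for a while
            long start = System.nanoTime();
            synchronized (demo.runningLock) {
                demo.running = false;
            }
            long waitedMillis = (System.nanoTime() - start) / 1_000_000L;
            renderer.join();
            return waitedMillis;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while stopping the renderer", e);
        }
    }
}
```

Calling stopAndMeasureMillis(10) mimics the loop with the 10 ms sleep. Passing 0 mimics the original tight loop, where the time spent waiting for the lock depends entirely on the scheduler and the lock implementation.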

(This doesn't actually make sense to me -- your code looks like it should be stalled waiting for lockCanvas() while awaiting the display refresh -- so you need to look at the thread trace in the ANR.)
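While reproducing the freeze, it can also help to dump every thread's state and stack from inside the process, to see which thread is blocked and where. A minimal sketch using only java.lang.Thread APIs; the class name ThreadDump is hypothetical, and the system-generated ANR trace remains the authoritative source:

```java
import java.util.Map;

/** Hypothetical debugging helper; formats the state and stack of every live thread. */
final class ThreadDump {

    private ThreadDump() { }

    static String dumpAll() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            // Header line: thread name and state (RUNNABLE, BLOCKED, WAITING, ...)
            sb.append('"').append(thread.getName()).append('"')
              .append(" state=").append(thread.getState()).append('\n');
            // One line per stack frame, innermost frame first
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

If the main thread shows up as BLOCKED with setRunning(...) near the top of its stack, that would support the lock-contention theory. If it is stuck somewhere else, the theory is wrong.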

FWIW, you don't need to synchronize on mSurfaceHolder. An early example did that, and every example since then has cloned it.

Once you get this sorted out, you may want to read about game loops.
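One common ingredient of a game loop is to pace each frame to a target period instead of sleeping for a fixed amount. A minimal sketch of that idea in plain Java; FramePacer and its methods are hypothetical names, and the target frame rate is whatever the caller chooses:

```java
/** Hypothetical frame pacer: sleeps only for whatever is left of the current frame's time budget. */
final class FramePacer {

    private final long framePeriodNanos;

    FramePacer(int targetFps) {
        if (targetFps <= 0) {
            throw new IllegalArgumentException("targetFps must be positive");
        }
        this.framePeriodNanos = 1_000_000_000L / targetFps;
    }

    /** Nanoseconds left in the current frame, or 0 if drawing already took longer than one period. */
    long remainingNanos(long frameStartNanos, long nowNanos) {
        long elapsed = nowNanos - frameStartNanos;
        return Math.max(0L, framePeriodNanos - elapsed);
    }

    /** Sleeps for the rest of the frame that started at frameStartNanos (a System.nanoTime() value). */
    void sleepUntilNextFrame(long frameStartNanos) throws InterruptedException {
        long remaining = remainingNanos(frameStartNanos, System.nanoTime());
        if (remaining > 0) {
            Thread.sleep(remaining / 1_000_000L, (int) (remaining % 1_000_000L));
        }
    }
}
```

In a render loop, the frame start would be taken with System.nanoTime() before lockCanvas(), and sleepUntilNextFrame(...) would be called after unlockCanvasAndPost(c). Unlike a fixed Thread.sleep(10), a slow frame then gets no extra delay.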

answered Nov 19 '22 07:11 by fadden