App engine consistent latency spikes under low load

Tags:

I've noticed periodic, but consistent latency spikes from my app running on app engine. At first I thought the network may be slow but app stats confirmed that not to be the case.

I've been able to reproduce the latency spikes using older and newer versions of the SDKs, currently I'm using the following:

App engine SDK: 1.9.42
Google cloud endpoints: 1.9.42
Objectify: 5.1.13
Appstats (to debug network latency)

So usage on the app is pretty low, over the last 30 days I'm generally under 0.04 requests a second:

requests per second

Most work is done with one instance as well: enter image description here

Most operations' latency is well under a second, but an alarming number of requests take 10-30x longer.

Latency density distribution

5% of requests take 23 seconds or longer...

So I figured it must just be network latency, but every appstat of a slower operation disproved this. Datastore and the network have always been incredibly reliable. Here is the anatomy of a slow request taking over 30 seconds:

app stats of operation taking 31 seconds

Here is the anatomy of a normal request: enter image description here

At a high level my code is pretty uninteresting: it's a simple api that makes a few network calls and saves / reads data from the cloud datastore. The entire source can be found on github here. The app runs on a single auto scaling app engine instance and is warmed up.

CPU usage over the last month doesnt seem to show anything exciting either: enter image description here

It's really strange to see that even for quick operations, a huge percentage of time is spent on the CPU, even though the code simply creates a few objects, persists them, and returns JSON. I'm wondering if the CPU is getting pegged on my app engine instance by another app which could be causing performance to periodically degrade.

My appengine.xml config looks like this:

<?xml version="1.0" encoding="utf-8"?>
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
    <application>sauce-sync</application>
    <version>1</version>
    <threadsafe>true</threadsafe>
    <automatic-scaling>
        <!-- always keep an instance up in order to keep startup time low-->
        <min-idle-instances>1</min-idle-instances>
    </automatic-scaling>
</appengine-web-app>

And my web.xml looks like this:

<web-app xmlns="http://java.sun.com/xml/ns/javaee" version="2.5">
    <servlet>
        <servlet-name>SystemServiceServlet</servlet-name>
        <servlet-class>com.google.api.server.spi.SystemServiceServlet</servlet-class>
        <init-param>
            <param-name>services</param-name>
            <param-value>com.sauce.sync.SauceSyncEndpoint</param-value>
        </init-param>
    </servlet>
    <servlet-mapping>
        <servlet-name>SystemServiceServlet</servlet-name>
        <url-pattern>/_ah/spi/*</url-pattern>
    </servlet-mapping>

    <!--reaper-->
    <servlet>
        <servlet-name>reapercron</servlet-name>
        <servlet-class>com.sauce.sync.reaper.ReaperCronServlet</servlet-class>
    </servlet>
    <servlet-mapping>
        <servlet-name>reapercron</servlet-name>
        <url-pattern>/reapercron</url-pattern>
    </servlet-mapping>

    <servlet>
        <servlet-name>reaper</servlet-name>
        <servlet-class>com.sauce.sync.reaper.ReaperServlet</servlet-class>
    </servlet>
    <servlet-mapping>
        <servlet-name>reaper</servlet-name>
        <url-pattern>/reaper</url-pattern>
    </servlet-mapping>


    <welcome-file-list>
        <welcome-file>index.html</welcome-file>
    </welcome-file-list>

    <filter>
        <filter-name>ObjectifyFilter</filter-name>
        <filter-class>com.googlecode.objectify.ObjectifyFilter</filter-class>
    </filter>
    <filter-mapping>
        <filter-name>ObjectifyFilter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>
</web-app>

TLDR I'm completely stuck and I'm not sure how to debug or fix this issue and I'm starting to think this is business as usual for smaller apps on app engine.

I'm thinking about turning off the resident instance for a while hoping that my app has just been running some bunk hardware or along side an app that is consuming alot of resources. Has anyone run into similar performance issues or knows of additional ways to profile your app?

EDIT:

I've tried running on 1 resident instance, I've also tried setting concurrent requests at 2-4 per this question with no results. Logs and appstats both confirm that an inordinate amount of time is spent waiting for my code to initially run. Here is a request that takes 25 seconds before my first line of code is run, not sure what endpoints / app engine is doing in this time.

25 seconds before my code is run

Again load is still low and this is request is on a warmed up instance.

EDIT 2:

Seems like for whatever reason app engine + endpoints doesnt play well with min-idle-instances set. Reverting to the default app engine config fixed my problem.

enter image description here

861

asked Oct 21 '16 14:10

sauce

1 Answers

I don't have an answer, but I can offer you some debugging tips.

Appstats may or may not be reporting correctly. However, log messages get timestamped. Log before & after each of the RPC operations. That should give you some insight.

30s delays sound a lot like warmup requests, which should be marked clearly in the logs. One thing that I've found in the past is that setting any resident instances for low-traffic apps (unintuitively) tended to route a lot of requests to cold instances. Use the default setting and set up a cron task to ping and endpoint once per minute.

173

answered Oct 12 '22 08:10

stickfigure

Related questions
                            
                                Java Manual Validation after Argument Resolver
                            
                                Drawing a rounded rectangle with opacity on a BufferedImage
                            
                                Java geotools check if point is contained in polygon with tolerance
                            
                                SOAPExceptionImpl Bad response: 404Not Found if i don't do soapMessage.writeTo(System.out);
                            
                                How to iterate through two generic lists with different types one item after another in java?
                            
                                Getting sequential logs while executing tests in parallel
                            
                                Redis MSET with TTL in Spring RedisTemplate
                            
                                how to manage spring cache in cluster environment
                            
                                Many small modules, or few large modules for best build performance?
                            
                                Java8 slow compiling for interfaces with thousands of default methods with the same name
                            
                                RealmList serialization issues (Realm/Gson/Intent)
                            
                                How to avoid adding of inject method for each view?
                            
                                Elasticsearch NoNodeAvailableException None of the configured nodes are available
                            
                                PowerMock access private static members
                            
                                Limit simultaneously logged on user devices count
                            
                                Retrofit for android @Multipart remove default headers
                            
                                How to Set Default Gateway,Ip Address and Subnet mask from Java?
                            
                                Facing FileNotFoundException while accessing JSON File in classpath using java in docker containers(SprintBootApplication)
                            
                                Using DateTimeFormatter with ObjectMapper
                            
                                Why do longs and doubles take up two entries in Java classes' constant pools?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

App engine consistent latency spikes under low load

Tags:

java

google-app-engine

google-cloud-endpoints

objectify

sauce

People also ask

1 Answers

stickfigure

Recent Activity

Donate For Us