
Optimizing Application Architecture and Implementation for Google App Engine

It's my understanding that billing on GAE all boils down to instance-hours ("IH"), that is, how many server instances are running and for how long. However, it is obviously not that simple, because in addition to IH you have quotas and resource limits that you must be leery of throughout the day (quotas replenish every 24 hours).

I am in the process of designing my first GWT/GAE app, and have come across many articles (some of which are cited below) where the authors talk about major refactorings they had to make to their code - post release - in order to help minimize billing and operational costs with Google.

In one instance, a developer implemented a set of optimizations to his GAE app which caused the same app to go from $7/day (~$220/month) down to $0 because it was finally under the "free" quotas and billing thresholds for resources.

Being so new to GAE, I'm wondering whether there is a set of principles or practices I can incorporate into the architecture and design of my app upfront so that, once they trickle down into implemented, functional code deployed to GAE, the app runs as cost-efficiently as possible.

Here are some deductions I've made so far:

  • Maximize caching and minimize datastore hits
  • Try to push as much asynchronous request handling to backend instances as possible
  • Enable concurrent HTTP request handling so that the same instance can handle multiple requests at the same time
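
To make the first point concrete, here is a minimal cache-aside sketch. It assumes plain Python dictionaries standing in for memcache and the datastore; `CachedStore` and its methods are hypothetical names for illustration, not GAE APIs. The idea is only that repeated reads of the same key should reach the datastore once.

```python
class CachedStore:
    """Cache-aside wrapper. Both backing stores are plain dicts (an assumption),
    standing in for memcache and the datastore."""

    def __init__(self, datastore):
        self.datastore = datastore    # stand-in for the billed datastore
        self.cache = {}               # stand-in for memcache
        self.datastore_reads = 0      # counts reads that would cost quota

    def get(self, key):
        # Serve from the cache when possible; no datastore read is made.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read the datastore once and remember the result.
        self.datastore_reads += 1
        value = self.datastore.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        # Write through to the datastore and drop the stale cache entry.
        self.datastore[key] = value
        self.cache.pop(key, None)
```

Reading the same key twice through `get` increments `datastore_reads` only once; a `put` invalidates the entry so the next read goes back to the datastore.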

So my question: are any of these generalizations incorrect, and if so, why (or are they conditional, holding true in some cases but not in others)? Am I missing anything critical here? For instance, how do I determine what code belongs on a backend instance (where resource constraints are a little more lax), and which GAE-specific profiling tools (AppStats, SpeedTracer, etc.) should I use to find bottlenecks?

Also, some cited articles:

  • Configuring Max Idle and Minimum Latency
  • GAE's own scaling best practices
  • An example of CPU optimization
IAmYourFaja asked Aug 22 '12 18:08



1 Answer

Based on experience, there is a huge laundry list of strategies for App Engine optimization, and which ones apply depends on the nature of your app. Here are some more tips that I know of:

  • For apps that serve a large amount of relatively static content, enabling the (as yet undocumented) edge caching could be a blessing to your weekly bills.

  • Even with concurrent requests/threadsafe enabled, each frontend instance can only process 8 (for Python) to 10 (Java, Go) simultaneous incoming requests before the scheduler decides to spin up a new instance for you.

  • To counter the above restriction, I think there's a Google I/O video that recommends keeping the response time of any user-facing request handled by the frontend instances at around 100 ms.

  • In line with the above strategy, if you have any task that requires a large amount of processing or datastore I/O, offload it to a push task queue and respond to the request immediately. You can specify the target of the task queue, but for this purpose it does not need to be a backend; frontend instances are good enough and scale out automatically.

  • If you use the above strategy but still need to give the result of the processing or I/O to the user, use the Channel API or any other messaging service to send the result back asynchronously.

  • Task queues are a great way to distribute the workload of your app. You can customize their behavior in detail, and they are invaluable in making sure your app scales nicely. You can even set up two-way communication between instances using both push and pull queues.
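
To illustrate the offloading tips above, here is a rough sketch, assuming an in-process `queue.Queue` as a stand-in for a push task queue. `OffloadingApp`, `expensive_work`, and `max_requests_per_second` are hypothetical names, not part of the App Engine SDK. It shows the handler returning right away while the heavy work runs later, plus the simple arithmetic behind the ~100 ms advice: with a fixed number of concurrent slots per instance, shorter requests mean more requests per second before a new instance is needed.

```python
import queue


def expensive_work(payload):
    """Hypothetical heavy computation that should not run in a user-facing request."""
    return sum(x * x for x in payload)


class OffloadingApp:
    def __init__(self):
        self.tasks = queue.Queue()   # stand-in for a push task queue (assumption)
        self.results = {}            # where finished work is stored

    def handle_request(self, job_id, payload):
        # User-facing handler: enqueue the work and respond immediately.
        self.tasks.put((job_id, payload))
        return {"status": "accepted", "job_id": job_id}

    def run_worker(self):
        # Task handler: drain the queue. On GAE each task would instead
        # arrive as its own HTTP request to the task's target.
        while not self.tasks.empty():
            job_id, payload = self.tasks.get()
            self.results[job_id] = expensive_work(payload)
            self.tasks.task_done()


def max_requests_per_second(concurrent_slots, latency_seconds):
    """Rough per-instance throughput: slots divided by time each request holds a slot."""
    return concurrent_slots / latency_seconds
```

With 10 concurrent slots, a 100 ms handler allows roughly 100 requests per second per instance, while a 1 s handler allows roughly 10. That is why moving slow work out of the user-facing request tends to cut the number of instances the scheduler starts.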

I'll add more points later on.

Ibrahim Arief answered Oct 21 '22 04:10