
How do Singletons in Google App Engine (or more generally in a distributed server environment) work?

I am intrigued as to how singletons work in Google App Engine (or any distributed server environment). Given that your application can be running in multiple processes (on multiple machines) at once, and requests can get routed all over the place, what actually happens under the hood when an app does something like `CacheManager.getInstance()`?

I'm just using the (GAE) CacheManager as an example, but my point is, there is a single global application instance of a singleton somewhere, so where does it live? Is an RPC invoked? In fact, how is global application state (like sessions) actually handled generally?

Regards, Shane

Shane asked Jul 26 '09 01:07




2 Answers

The singletons in App Engine Java are per-runtime, not per-webapp. Their purpose is simply to provide a single point of access to the underlying service (which, in the case of both the Memcache and Users APIs, is accessed via an RPC), but that's purely a design pattern for the library: there's no per-app singleton anywhere that these methods access.
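To make that concrete, here is a minimal sketch of the idea, not App Engine's actual implementation. The names `RemoteCacheService`, `InMemoryRemoteService` and `CacheClient` are hypothetical, and the "remote" service is just an in-memory map standing in for something reached over RPC. Two client handles are built explicitly to simulate two separate runtimes, each of which would normally return its own handle from `getInstance()`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SingletonFacadeSketch {

    /** Hypothetical stand-in for the shared backend that a real client would reach via RPC. */
    interface RemoteCacheService {
        void put(String key, String value);
        String get(String key);
    }

    /** In-memory fake of the backend, used only so the sketch runs locally. */
    static final class InMemoryRemoteService implements RemoteCacheService {
        private final Map<String, String> store = new ConcurrentHashMap<>();

        @Override
        public void put(String key, String value) {
            store.put(key, value);
        }

        @Override
        public String get(String key) {
            return store.get(key);
        }
    }

    /** The per-runtime "singleton": a handle that holds no cached data itself. */
    static final class CacheClient {
        private final RemoteCacheService service;

        CacheClient(RemoteCacheService service) {
            this.service = service;
        }

        void put(String key, String value) {
            service.put(key, value); // would be an RPC in a real deployment
        }

        String get(String key) {
            return service.get(key); // would be an RPC in a real deployment
        }
    }

    public static void main(String[] args) {
        RemoteCacheService shared = new InMemoryRemoteService();

        // Two handles simulate the singletons living in two different runtimes.
        CacheClient runtimeA = new CacheClient(shared);
        CacheClient runtimeB = new CacheClient(shared);

        runtimeA.put("greeting", "hello");

        System.out.println(runtimeB.get("greeting")); // prints "hello"
        System.out.println(runtimeA == runtimeB);     // prints "false"
    }
}
```

The two handles are distinct objects, yet a value written through one is visible through the other, because the state lives in the shared service rather than in either handle.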

Nick Johnson answered Oct 09 '22 23:10


Caches are generally backed by some sort of distributed, replicated cache. For example, GAE uses a custom version of memcached to maintain a shared cache of objects across a cluster while keeping the stored state consistent. In general there are many solutions to this problem, with different tradeoffs between performance and cache coherence (e.g., is it critical that all caches match 100% of the time? Must the cache be written to disk to protect against loss?).

Here are some sample products with distributed caching features (most have documentation describing the tradeoffs of the various approaches in great detail):

  • memcached - C with lots of client APIs and language ports
  • Ehcache - OSS Java cache, with widespread adoption
  • JBoss Cache - Another popular Java OSS solution
  • Oracle Coherence (formerly Tangosol Coherence) - Probably the best known Java commercial cache.
  • Indexus Cache - A popular .Net OSS solution
  • NCache - Likely the most popular .Net commercial caching solution

As you can see, many projects have tackled this problem. One possible solution is simply to share a single cache on a single machine; however, most projects make some sort of replication and distributed failover possible.
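As a rough illustration of replication with failover, here is a toy sketch. It does not reflect how any of the products listed above work internally. `Node` and `ReplicatedCache` are hypothetical names, the "nodes" are plain in-process maps, and every write is copied synchronously to each live node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplicatedCacheSketch {

    /** Hypothetical cache node; a real one would be a separate process or machine. */
    static final class Node {
        private final Map<String, String> data = new HashMap<>();
        private boolean up = true;

        void setUp(boolean up) {
            this.up = up;
        }

        boolean isUp() {
            return up;
        }
    }

    /** Writes go to every live node; reads are served by the first live node. */
    static final class ReplicatedCache {
        private final List<Node> nodes = new ArrayList<>();

        ReplicatedCache(int nodeCount) {
            for (int i = 0; i < nodeCount; i++) {
                nodes.add(new Node());
            }
        }

        Node node(int index) {
            return nodes.get(index);
        }

        void put(String key, String value) {
            for (Node n : nodes) {
                if (n.isUp()) {
                    n.data.put(key, value); // full replication: simple, but every write costs N copies
                }
            }
        }

        String get(String key) {
            for (Node n : nodes) {
                if (n.isUp()) {
                    // A node that was down during a write may return stale or missing data;
                    // that is the coherence tradeoff this sketch does not try to solve.
                    return n.data.get(key);
                }
            }
            return null; // no live node
        }
    }

    public static void main(String[] args) {
        ReplicatedCache cache = new ReplicatedCache(2);
        cache.put("user:1", "Shane");

        cache.node(0).setUp(false); // simulate losing the first node

        System.out.println(cache.get("user:1")); // prints "Shane", served by the surviving replica
    }
}
```

With a single shared cache, losing that one machine loses everything; here the read survives the failure, at the cost of copying each write and of possible staleness on nodes that missed an update.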

jsight answered Oct 09 '22 23:10