
Scaling and Clustering JPA

I am putting together a regular Java EE application on JBoss 7 that will use JPA in the data tier. I would like this application to scale with load. It is pretty clear how to scale the web tier (create more machines and put them behind a load balancer), but scaling the data tier is less obvious.

I can probably cluster my database (MySQL). Still, that leaves the JPA layer unclustered. Ideally, JPA would scale by using clustered in-memory caching backed by MySQL.

When I look around, all the information on JPA scaling seems to be 3-4 years old. People talk about Ehcache, memcached and Infinispan. I am not sure if this is still current.

Can someone tell me the state of the art in Java EE clustering and scaling, especially in the data tier?

Raj asked Apr 26 '12 06:04


2 Answers

Various caching strategies are still the way to scale JPA/Hibernate (you basically named the most popular options in your question). As far as I know, nothing extraordinary has happened in this field in the last 4-5 years. One more option you haven't mentioned is JBoss Cache. So the second-level cache for JPA/Hibernate still rules in this area.
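To make the caching idea concrete, here is a minimal read-through cache sketch in plain Java. It is not the Hibernate or JPA second-level cache API; the class name `ReadThroughCache`, the `loader` function standing in for a database query, and the `loadCount` counter are all hypothetical names chosen for illustration. It only shows the principle a second-level cache relies on: repeated reads of the same key are served from memory, and an entry is evicted when the underlying row changes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration only: not the Hibernate or JPA second-level cache API.
final class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;                  // stands in for a database query
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Return the cached value; call the loader only when the key is missing.
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Drop an entry (for example after an update) so the next read reloads it.
    void evict(K key) {
        entries.remove(key);
    }

    // Number of times the loader (the "database") was actually hit.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        ReadThroughCache<Long, String> cache =
                new ReadThroughCache<>(id -> "customer-" + id);
        cache.get(42L);
        cache.get(42L);                                   // served from memory
        System.out.println(cache.loadCount());            // prints 1
        cache.evict(42L);
        cache.get(42L);                                   // reloaded after eviction
        System.out.println(cache.loadCount());            // prints 2
    }
}
```

In a cluster, the hard part is the `evict` step: every node holding a copy has to learn about the change, which is the problem products such as Ehcache, Infinispan and JBoss Cache address.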

Why no progress here? My wild guess is that, first of all, people who need scalable applications tend to avoid JPA and Hibernate in areas where high performance is needed. Usually they go with plain SQL wrapped in Spring Framework JdbcTemplate helpers and transaction management. Scalability then becomes a matter of what the database itself can do.

The other trend is to use NoSQL databases. There are plenty of solutions: MongoDB, CouchDB, Cassandra, Redis, to name a few. These are usually Google BigTable-like key-value stores (this is an oversimplification, but it is more or less the idea behind the approach), and they scale very well if you accept their limitations (relations are no longer easily managed, etc.).

Piotr Kochański answered Oct 23 '22 08:10


There are many solutions. The two main categories are:

  • scaling the database
  • using a clustered cache to reduce database load

EclipseLink supports data partitioning for sharding data across a set of database instances,

see: http://java-persistence-performance.blogspot.com/2011/05/data-partitioning-scaling-database.html
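As a rough illustration of what sharding by key means, the sketch below routes a partition key to one of several database instances by hashing it. This is not EclipseLink's partitioning API; `ShardRouter`, its methods, and the JDBC URLs are hypothetical, and hash partitioning is just one possible policy assumed here for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of hash partitioning; not EclipseLink's partitioning API.
final class ShardRouter {
    private final List<String> shardUrls;   // assumed: one JDBC URL per database instance

    ShardRouter(List<String> shardUrls) {
        if (shardUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shardUrls = Collections.unmodifiableList(new ArrayList<>(shardUrls));
    }

    // Map a key to a shard index; floorMod keeps the result non-negative
    // even when hashCode() is negative.
    int shardIndexFor(Object partitionKey) {
        return Math.floorMod(partitionKey.hashCode(), shardUrls.size());
    }

    // The same key always resolves to the same database instance.
    String shardUrlFor(Object partitionKey) {
        return shardUrls.get(shardIndexFor(partitionKey));
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(Arrays.asList(
                "jdbc:mysql://db0/app",      // placeholder URLs
                "jdbc:mysql://db1/app"));
        System.out.println(router.shardUrlFor(42L));   // prints jdbc:mysql://db0/app
    }
}
```

Note that with plain modulo hashing, changing the number of shards remaps most keys, so the shard count is usually fixed up front or a different policy is used.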

You can also use MySQL Cluster,

see: http://www.mysql.com/products/cluster/

Oracle TopLink Grid provides EclipseLink JPA support for integration with Oracle Coherence as a distributed cache,

see: http://www.oracle.com/technetwork/middleware/ias/tl-grid-097210.html

EclipseLink's cache supports clustering through cache coordination,

see: http://wiki.eclipse.org/EclipseLink/Examples/JPA/CacheCoordination

James answered Oct 23 '22 08:10