Say you have a 4-node J2EE application server cluster, all running instances of a Hibernate application. How does caching work in this situation? Does it do any good at all? Should it simply be turned off?
It seems to me that data on one particular node would quickly become stale, as other users hitting other nodes make changes to database data. In such a situation, how could Hibernate ever trust that its cache is up to date?
First of all, you should clarify which cache you're talking about: Hibernate has three of them (the first-level cache, aka the session cache; the second-level cache, aka the global cache; and the query cache, which relies on the second-level cache). I assume the question is about the second-level cache, so that is what I'm going to cover.
How does caching work in this situation?
If you want to cache read only data, there is no particular problem. If you want to cache read/write data, you need a cluster-safe cache implementation (via invalidation or replication).
Does it do any good at all?
It depends on a lot of things: the cache implementation, the frequency of updates, the granularity of cache regions, etc.
Should it simply be turned off?
Second-level caching is actually disabled by default. Turn it on if you want to use it.
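As a rough illustration of "turning it on", the sketch below collects the relevant settings in a plain `java.util.Properties` object. The property keys are the ones I know from Hibernate 3.3 and later; the region factory class name is an assumption that depends on your Hibernate version and cache provider (older releases used `hibernate.cache.provider_class` instead), so check it against the documentation for your setup. The class name `SecondLevelCacheSettings` is made up for this example.

```java
import java.util.Properties;

// Hypothetical helper: gathers the settings that switch on the second-level cache.
final class SecondLevelCacheSettings {

    static Properties build() {
        Properties p = new Properties();
        // Enable the second-level cache.
        p.setProperty("hibernate.cache.use_second_level_cache", "true");
        // Assumption: Ehcache via the hibernate-ehcache module (Hibernate 4/5).
        // Other versions and providers use a different class name.
        p.setProperty("hibernate.cache.region.factory_class",
                "org.hibernate.cache.ehcache.EhCacheRegionFactory");
        // The query cache is a separate switch; it is left off here.
        p.setProperty("hibernate.cache.use_query_cache", "false");
        return p;
    }

    public static void main(String[] args) {
        // Print the settings so they can be copied into hibernate.properties.
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The same keys can go into `hibernate.properties` or `hibernate.cfg.xml`. Note that these settings alone do not make the cache cluster-safe; that has to be configured in the cache provider itself.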
It seems to me that data on one particular node would become stale quickly as other users hitting other nodes make changes to database data.
Which is why you need a cluster-safe cache implementation.
In such a situation, how could Hibernate ever trust that its cache is up to date?
Simple: Hibernate trusts the cache implementation, which has to offer a mechanism to guarantee that the cache of a given node is not out of date. The most common mechanism is synchronous invalidation: when an entity is updated, the cache on the node that performed the update sends a notification to the other members of the cluster telling them that the entity has been modified. On receipt of this message, the other nodes remove this data from their local cache, if it is stored there.
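To make the invalidation idea concrete, here is a toy, single-process simulation in plain Java. It is not how Hibernate or any real cache provider is implemented: `Database`, `Node` and every method name are invented for this sketch, and the "notification" is a direct method call rather than a network message.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of synchronous invalidation; all names here are hypothetical.
final class InvalidationSketch {

    /** Stand-in for the shared database that every node reads and writes. */
    static final class Database {
        private final Map<Long, String> rows = new HashMap<>();
        int reads = 0; // counts how often a node had to go to the database

        String load(long id) { reads++; return rows.get(id); }
        void store(long id, String value) { rows.put(id, value); }
    }

    /** One application server with its own local cache. */
    static final class Node {
        private final Database db;
        private final Map<Long, String> localCache = new HashMap<>();
        private final List<Node> peers = new ArrayList<>();

        Node(Database db) { this.db = db; }

        /** Registers two nodes as members of the same cluster. */
        void join(Node other) { peers.add(other); other.peers.add(this); }

        /** Serves from the local cache, falling back to the database on a miss. */
        String read(long id) {
            String cached = localCache.get(id);
            if (cached != null) return cached;
            String value = db.load(id);
            if (value != null) localCache.put(id, value);
            return value;
        }

        /** Writes to the database, then tells every peer to drop its copy. */
        void update(long id, String value) {
            db.store(id, value);
            localCache.put(id, value);
            for (Node peer : peers) peer.invalidate(id);
        }

        /** Called on receipt of an invalidation message: drop the entry if present. */
        void invalidate(long id) { localCache.remove(id); }

        boolean isCached(long id) { return localCache.containsKey(id); }
    }

    public static void main(String[] args) {
        Database db = new Database();
        Node a = new Node(db);
        Node b = new Node(db);
        a.join(b);

        a.update(1L, "v1");
        System.out.println(b.read(1L));     // miss on b, loaded from the database
        a.update(1L, "v2");                 // b's cached copy is invalidated
        System.out.println(b.isCached(1L)); // b no longer holds the entry
        System.out.println(b.read(1L));     // b reloads the current value
    }
}
```

The peers only discard their entry; they reload it from the database on the next read. A replicating cache would instead push the new value to the other nodes.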
First of all, there are 2 caches in Hibernate. There is the first-level cache, which you cannot remove and which is the Hibernate session itself. Then there is the second-level cache, which is optional and pluggable (e.g. Ehcache). It works across many requests and is most probably the cache you are referring to.
If you work in a clustered environment, you need a second-level cache that can replicate changes across the members of the cluster. Ehcache can do that. Caching is a hard topic, and you need a deep understanding of it to use it without introducing other problems. Caching in a clustered environment is slightly more difficult.