I want to add a cache to a distributed application that uses a Galera cluster (mysql) as DB. Each application node (java) has a local cache for reads and comunicates with a DB node.
The problem is I don't know how to invalidate entries in cache when these are modified by another node in the cluster.
My first attempt was to use DB triggers but I soon realised that triggers aren't thrown by replication updates.
Other ideas are monitoring network traffic searching for modifications or watching binlogs of DB, but both of them seems very difficult to implement.
The question is: Is there any practical way to detect changes in one node coming from another in replication?
Other ideas in implementing cache invalidation?
I think I could also use a distributed cache that communicates changes through the application nodes but I'd prefer use an isolated cache in each app node and delegate data synchronization between nodes to the DB cluster. I see caches communication as redundant network traffic...
Thanks in advance.
To put it bluntly, you have over-engineered the system.
You have a Cluster Node 'near' each Java client, correct? That is, each datacenter has a Node plus one or more Java clients?
At that point there is very little need for an extra cache between Java and the data; simply re-request the data and let MySQL do the caching.
If you have sluggish queries, then let's discuss them and see if they can be sped up.
The setup I describe should be able to handles hundreds, possibly thousands of queries per second. Is your requirement stiffer than that?
See wsrep_sync_wait for dealing with causality checks.
The only way I can think of to detect changes using only MySQL is to parse the binary logs.
Since you asked for alternatives to that, you could use a messaging system (e.g. Apache ActiveMQ) to send cache invalidation messages between nodes so that each node can clear its own cache.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With