Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hazelcast: Questions regarding multi-node consistency

(I could not find a good source explaining this, so if it is available elsewhere, you could just point me to it)

  1. Hazelcast replicates data across all nodes in clusters. So, if data is changed in one of the nodes, does the node update its own copy and then propagate it to other nodes?

  2. I read somewhere that each data is owned by a node, how does Hazelcast determine the owner? Is the owner determined per datastructure or per key in the datastructure?

  3. Does Hazelcast follow "eventually consistent" principle? (When the data is being propagated across the nodes, there could be a small window during which the data might be inconsistent between the nodes)

  4. How are conflicts handled? (Two nodes update the same key-value simultaneously)

like image 680
gammay Avatar asked Jun 03 '15 14:06

gammay


People also ask

Is Hazelcast consistent?

Hazelcast now uniquely provides both a Consistency (CP) Subsystem for sensitive concurrency structures which favors consistency over availability and an Availability (AP) Subsystem for data storage structures which favor availability over consistency.

What is split brain in Hazelcast?

Split-brain protection mechanism provided in Hazelcast protects your cluster in case the number of cluster members drops below the specified one. How to respond to a split-brain scenario depends on whether consistency of data or availability of your application is of primary concern.

How does Hazelcast store data?

Data in Hazelcast is usually stored in-memory (RAM) so that it's faster to access. However, data in RAM is volatile, meaning that when one or more members shut down, their data is lost. When you persist data on disk, members can load it upon a restart and continue to operate as usual.

How does Hazelcast cluster work?

Hazelcast is designed to scale up to hundreds and thousands of members. Simply add new members; they automatically discover the cluster and linearly increase both the memory and processing capacity. The members maintain a TCP connection between each other and all communication is performed through this layer.


1 Answers

  1. Hazelcast does not replicate (with exception of the ReplicatedMap, obviously ;-)) but partitions data. That means you have one node that owns a given key. All updates to that key will go to the owner and he notifies possible updates.

  2. The owner is determined by consistent hashing using the following formula:

partitionId = hash(serialize(key)) % partitionCount

  1. Since there is only one owner per key it is not eventually consistent but consistent whenever the mutating operations is returned. All following read operations will see the new value. Under normal operational circumstances. When any kind of failure happens (network, host, ...) we choose availability over consistency and it might happen that a not yet updated backup is reactivated (especially if you use async backups).

  2. Conflicts can happen after split-brain when the split cluster re-merge. For this case you have to configure (or use the default one) MergePolicy to define the behavior on how conflicting elements are merged together or which one of both wins.

like image 162
noctarius Avatar answered Oct 18 '22 02:10

noctarius