
Cluster vs replication

I have a use case where I am looking to replicate a single database across multiple servers (for HA and scalability purposes).

Would there be any disadvantage to running a 3-node replica setup instead of a 3-node cluster?

romainrbr asked Apr 03 '17 20:04



2 Answers

Section 11.2 of the CouchDB docs provides an example cluster configuration:

[cluster]
  q=8
  r=2
  w=2
  n=3

q - The number of shards.

r - The number of copies of a document with the same revision that have to be read before CouchDB returns a 200 and the document. If only one copy of the document is accessible, that copy is returned with a 200.

w - The number of nodes that need to save a document before a write returns a 201. If fewer than w nodes, but more than 0, save the document, a 202 is returned.

n - The number of copies of every document, i.e. the number of replicas.
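As a rough illustration of the write rule for w, here is a minimal Python sketch. The function name `write_status` is hypothetical and not part of CouchDB. It assumes a 202 covers the case where some, but fewer than w, nodes saved the document, and it deliberately does not model what happens when no node saves it.

```python
def write_status(nodes_saved: int, w: int) -> int:
    """Map the number of nodes that saved a document to an HTTP status.

    Hypothetical helper for illustration only; CouchDB does this internally.
    """
    if nodes_saved <= 0:
        # Not modelled in this sketch: the case where no node saved the write.
        raise ValueError("nodes_saved must be at least 1 in this sketch")
    if nodes_saved >= w:
        return 201  # write quorum met
    return 202      # accepted, but fewer than w nodes saved the document


# With w=2, one saved copy is only "accepted"; two or more meet the quorum.
statuses = [write_status(saved, w=2) for saved in (1, 2, 3)]
print(statuses)
```

With w=1, as in a plain replica setup, any single successful save is enough for a 201.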

The behavior of your 3-node replica setup should be equivalent to:

[cluster]
  q=1
  r=1
  w=1
  n=3

when replicating correctly. This is a possible clustering configuration, but not an optimal one, as it lacks:

  • confirmation that multiple nodes, and a majority of nodes, have saved a document before the write is acknowledged.

  • confirmation that multiple nodes, and a majority of nodes, agree a revision is correct before it is returned.

  • the ability to grow the database beyond a single node's storage via sharding.

  • the ability to move later to a configuration with q, r, or w > 1 without first having to switch to a cluster.

Indirectly, the limit on acknowledgements creates more potential conflicts to resolve between the replicas if they are actually used for network scalability. It also raises the odds of a real inconsistency, in the form of lost records, if a node fails between acknowledging a save and passing it on to the other replicas.
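To make the majority point concrete, here is a small Python sketch comparing the two configurations above. The helper names `majority` and `describe` are hypothetical, and "majority" is assumed to mean more than half of the n copies.

```python
def majority(n: int) -> int:
    """Smallest number of copies that is more than half of n."""
    return n // 2 + 1


def describe(n: int, r: int, w: int) -> dict:
    """Report whether reads and writes are confirmed by a majority of copies."""
    m = majority(n)
    return {
        "majority": m,
        "read_majority": r >= m,
        "write_majority": w >= m,
    }


cluster = describe(n=3, r=2, w=2)  # the example cluster configuration
replica = describe(n=3, r=1, w=1)  # the replica-equivalent configuration

print("cluster:", cluster)
print("replica:", replica)
```

Under these assumptions, with n=3 a majority is 2 copies, so r=2 and w=2 reach it while r=1 and w=1 do not.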

lossleader answered Oct 19 '22 15:10


Which version of CouchDB will you be using? If 2.0.0+, there's probably no reason not to use true clustering.

The only reasons I can think of to use replicas instead of clustering would be ease of configuration, or that your database version (i.e. CouchDB < 2.0.0) doesn't support clustering.

But if you use clustering, even on just 3 nodes now, you're already set up for greater expansion later, just by adding more nodes.

Is there a reason you might not want to use a cluster?

Flimzy answered Oct 19 '22 16:10