
When to prefer master-slave and when to cluster?

I know many articles have been written about database replication. Trust me, I spent some time reading them, including this SO one that explains the pros and cons of replication. That SO article goes into depth about replication and clustering individually, but it doesn't answer these simple questions that I have:

  1. When do you replicate your database, and when do you cluster?
  2. Can both be performed at the same time? If yes, what are the motivations for each?

Thanks in advance.

th3an0maly asked Jun 26 '16 09:06




1 Answer

MySQL currently supports two different solutions for creating a high availability environment and achieving multi-server scalability.

MySQL Replication

The first form is replication, which MySQL has supported since version 3.23. Replication in MySQL is currently implemented as an asynchronous master-slave setup that uses a logical log-shipping backend.

A master-slave setup means that one server is designated as the master and must receive all of the write queries. The master executes and logs the queries, and the log is then shipped to the slave, which executes the same queries so that all of the replication members hold the same data.

Replication is asynchronous, which means that the slave server is not guaranteed to have the data at the moment the master performs the change. Normally, replication will be as close to real time as possible. However, there is no guarantee about the time required for a change to propagate to the slave.
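The asynchronous behaviour described above can be sketched with a toy model. This is not how MySQL implements replication; `Master`, `Replica`, `binlog`, and `pull` are hypothetical names chosen for illustration, and the "log" is just an in-memory list of key/value changes.

```python
class Master:
    """Toy master: applies each write locally and appends it to an ordered log."""

    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))


class Replica:
    """Toy replica: replays the master's log some time after the write happened."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries replayed so far

    def pull(self, master, max_events=None):
        # Fetch the entries not yet applied, optionally only a few of them,
        # to imitate a slow or interrupted connection.
        pending = master.binlog[self.applied:]
        if max_events is not None:
            pending = pending[:max_events]
        for key, value in pending:
            self.data[key] = value
        self.applied += len(pending)

    def lag(self, master):
        # How many changes the replica has not seen yet.
        return len(master.binlog) - self.applied


master = Master()
replica = Replica()
master.write("a", 1)
master.write("b", 2)
replica.pull(master, max_events=1)  # only the first change arrives for now
print(replica.data, "lag:", replica.lag(master))
replica.pull(master)                # the replica eventually catches up
print(replica.data, "lag:", replica.lag(master))
```

Between the two `pull` calls the replica answers reads from stale data, which is exactly the window the paragraph above warns about.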

Replication can be used for many reasons. Some of the more common ones include scalability, server failover, and backups.

Scalability is achieved because SELECT queries can now be run on any of the slaves. Write statements, however, generally do not get faster, because every write has to occur on each replication member.
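A minimal sketch of this read/write split at the application level follows. `ReadWriteRouter` is a hypothetical helper, not part of MySQL or any connector, and the server names are placeholders standing in for real connections.

```python
import itertools


class ReadWriteRouter:
    """Send SELECT statements to the slaves in turn and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # Round-robin over the replicas; fall back to the master if there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.master


router = ReadWriteRouter("master", ["replica1", "replica2"])
for statement in ("SELECT * FROM users", "UPDATE users SET name = 'x'", "SELECT 1"):
    print(statement, "->", router.route(statement))
```

A real router would need more care than this: reads inside a transaction, `SELECT ... FOR UPDATE`, and reads that must see a write made a moment ago should go to the master, because a slave may not have received the change yet.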

Failover can be implemented fairly easily using an external monitoring utility that uses a heartbeat or similar mechanism to detect the failure of a master server. MySQL does not currently do automatic failover, as the logic is generally very application dependent. Keep in mind that because replication is asynchronous, it is possible that not all of the changes made on the master will have propagated to the slave when the master fails.
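The heartbeat idea can be sketched as follows. This is an illustration under assumptions, not a production failover tool: `FailoverMonitor`, `choose_new_master`, the threshold of three missed heartbeats, and the replica log positions are all hypothetical.

```python
class FailoverMonitor:
    """Declare the master failed after several consecutive missed heartbeats."""

    def __init__(self, max_missed=3):
        self.max_missed = max_missed
        self.missed = 0

    def record_heartbeat(self, ok):
        # A successful heartbeat resets the counter; a failed one increments it.
        self.missed = 0 if ok else self.missed + 1
        return self.missed >= self.max_missed


def choose_new_master(replica_positions):
    """Pick the replica that has applied the most of the master's log.

    replica_positions maps a replica name to the log position it has applied.
    """
    return max(replica_positions, key=replica_positions.get)


monitor = FailoverMonitor(max_missed=3)
for ok in (True, False, False, False):
    failed = monitor.record_heartbeat(ok)
if failed:
    print("promote:", choose_new_master({"replica1": 120, "replica2": 150}))
```

Promoting the most advanced slave limits data loss but does not rule it out: any change the old master logged and never shipped is missing from every slave.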

MySQL replication works very well even across slower connections and across connections that aren't continuous. It can also be used across different hardware and software platforms. It is possible to use replication with most storage engines, including MyISAM and InnoDB.

MySQL Cluster

MySQL Cluster is a shared-nothing, distributed, partitioning system that uses synchronous replication in order to maintain high availability and performance.

MySQL Cluster is implemented through a separate storage engine called NDB Cluster. This storage engine will automatically partition data across a number of data nodes. The automatic partitioning of data allows for parallelization of queries that are executed. Both reads and writes can be scaled in this fashion since the writes can be distributed across many nodes.
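The partitioning idea can be illustrated with a small sketch that hashes each row's primary key to pick a partition. The function names and the choice of MD5 here are assumptions made for the example and are not taken from NDB's source; the point is only that rows, and therefore writes, end up spread across data nodes.

```python
import hashlib
from collections import defaultdict


def partition_for(primary_key, num_partitions):
    """Map a primary key to a partition number in a stable, deterministic way."""
    digest = hashlib.md5(str(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def spread_rows(primary_keys, num_partitions):
    """Group primary keys by the partition that would store them."""
    partitions = defaultdict(list)
    for key in primary_keys:
        partitions[partition_for(key, num_partitions)].append(key)
    return dict(partitions)


layout = spread_rows(range(1000), num_partitions=4)
for partition, keys in sorted(layout.items()):
    print("partition", partition, "holds", len(keys), "rows")
```

Because each partition lives on a different set of data nodes, a batch of writes touching many keys is handled by several nodes at once rather than by a single master.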

Internally, MySQL Cluster also uses synchronous replication in order to remove any single point of failure from the system. Since two or more nodes are always guaranteed to have each data fragment, at least one node can fail without any impact on running transactions. Failure detection is handled automatically, with the dead node being removed transparently to the application. When the node restarts, it is automatically re-integrated into the cluster and begins handling requests as soon as possible.
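The availability rule implied above can be written down as a short check: the data stays available as long as every group of nodes holding copies of the same fragments still has at least one live member. This is a simplified sketch; `cluster_is_available` and the node names are hypothetical, and it ignores other concerns such as network partitions.

```python
def cluster_is_available(node_groups, failed_nodes):
    """Return True if every node group still has at least one surviving node.

    node_groups is a list of lists; the nodes in one inner list hold
    copies of the same data fragments.
    """
    failed = set(failed_nodes)
    return all(
        any(node not in failed for node in group)
        for group in node_groups
    )


groups = [["node1", "node2"], ["node3", "node4"]]
print(cluster_is_available(groups, ["node1"]))           # one copy lost, data still served
print(cluster_is_available(groups, ["node1", "node3"]))  # one node lost in each group
print(cluster_is_available(groups, ["node1", "node2"]))  # a whole group is gone
```

With two copies of every fragment, any single node can fail safely, and more than one can fail as long as the failures land in different groups.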

There are a number of limitations that currently exist and have to be kept in mind while deciding if MySQL Cluster is the correct solution for your situation.

Currently, all of the data and indexes stored in MySQL Cluster are held in main memory across the cluster. This restricts the size of the database to the memory available on the systems used in the cluster.

MySQL Cluster is designed to be used on an internal network as latency is very important for response time.

As a result, it is not possible to run a single cluster across a wide geographic distance. In addition, while MySQL Cluster will work over commodity network setups, special clustering interconnects can be used to attain the highest possible performance.

Andrews B Anthony answered Dec 28 '22 06:12