
MySQL Cluster (NDB) vs MySQL Replication (InnoDB) for Rails 3 apps: pros/cons?

We are doing an overview of our current systems, trying to figure out if we can improve performance & reliability.

Currently we run a bunch of internal Rails apps and our Rails-based website. Some are already on Rails 3, and some are being converted to Rails 3. They all connect to the following MySQL setup:

mysql01 (master server) => mysql02 (slave) => (daily DB backups to a drive that is backed up on a daily, weekly, monthly & semi-annual basis).

All writes happen on mysql01 and most short reads go to it as well. Some "more resource consuming reads" (like monthly/weekly reports that take 3-10 minutes to run and dump data into CSV, or backups) go to the mysql02 server. We get about 3-5K visits per day to our site, and have about 20-30 internal users who use various apps daily for inventory, order processing, etc. So these servers are not under particularly heavy load other than those reports, which run off the slave anyway.

All servers run in a virtualized XEN pool on Debian Lenny VMs.

So we are doing a review of the systems, and somebody suggested switching to a MySQL Cluster (NDB) setup. I know of it in theory, but have never actually run it. Does anyone who has experience with it know of any pros/cons vs. our current setup, and of any particular caveats when it involves Ruby/Rails applications?

konung asked Mar 14 '11 15:03



1 Answer

There is a good comparison of InnoDB and MySQL Cluster (NDB) that was recently posted to the docs and is worth taking a look at: http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-compared.html

The Cluster architecture consists of a pool of MySQL Servers that are accessed by the application(s). These MySQL Servers don't actually store the Cluster data; the data is partitioned over a separate pool of data nodes. Every MySQL Server has access to the data in all of the data nodes. If one MySQL Server changes a piece of data, the change is instantly visible to all of the other MySQL Servers.
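As a rough illustration of what "partitioned over the data nodes" means, here is a toy Ruby model. The node names and the CRC32 hash are assumptions made up for this sketch; Cluster handles the real distribution internally and this is not its actual algorithm.

```ruby
require "zlib"

# Toy model only: hypothetical data node names, and CRC32 as a stand-in hash.
DATA_NODES = %w[ndb1 ndb2 ndb3 ndb4].freeze

# Maps a primary key to the data node that would hold the row in this model.
def node_for(primary_key)
  DATA_NODES[Zlib.crc32(primary_key.to_s) % DATA_NODES.size]
end

# The application only talks to a MySQL Server; it never calls anything like
# node_for itself. This loop just shows rows landing on different nodes.
(1..6).each { |id| puts "order #{id} -> #{node_for(id)}" }
```

The point of the sketch is that a lookup by primary key can be resolved to a single node, which is why that kind of access tends to be cheap.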

This architecture makes it extremely easy to scale out the database. Unlike sharding, the application doesn't need to know where the data is held; it can just load balance across all available MySQL Servers. Unlike scaling out with MySQL replication, Cluster allows you to scale writes just as well as reads. New data nodes or MySQL Servers can be added to an existing Cluster with no loss of service to the application.
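Load balancing across the MySQL Servers can be as simple as round-robin. A minimal sketch in plain Ruby follows; the host names and the RoundRobin class are hypothetical, and in practice you might do this in a proxy or load balancer rather than in the app.

```ruby
# Hypothetical MySQL Server (SQL node) host names; any can serve any query.
SQL_NODES = %w[sqlnode1 sqlnode2 sqlnode3].freeze

# Hands out hosts in rotation. The Mutex keeps the counter safe when
# several threads ask for a host at the same time.
class RoundRobin
  def initialize(hosts)
    @hosts = hosts
    @next = 0
    @lock = Mutex.new
  end

  def pick
    @lock.synchronize do
      host = @hosts[@next % @hosts.size]
      @next += 1
      host
    end
  end
end

balancer = RoundRobin.new(SQL_NODES)
4.times { puts balancer.pick }
```

Because every MySQL Server sees all of the data, the balancer does not need to inspect the query at all.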

MySQL Cluster's shared-nothing architecture means that it can deliver extremely high availability (99.999%+). Every time you change data, it is synchronously replicated to a second data node; if one data node fails, then the application's read and write requests are automatically handled by the backup data node.
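To put the 99.999% figure in perspective, it works out to roughly five minutes of downtime per year. A quick Ruby calculation, assuming a 365-day year:

```ruby
MINUTES_PER_YEAR = 365 * 24 * 60 # 525,600 minutes, ignoring leap years

# Converts an availability percentage into allowed downtime per year.
def downtime_minutes_per_year(availability_percent)
  MINUTES_PER_YEAR * (1 - availability_percent / 100.0)
end

[99.9, 99.99, 99.999].each do |pct|
  puts format("%.3f%% -> %.1f minutes/year", pct, downtime_minutes_per_year(pct))
end
```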

Due to the distributed nature of MySQL Cluster, some operations can be slower (for example, JOINs that have thousands of interim results, though there is a prototype solution available which addresses this), but others can be very fast and can scale extremely well (e.g. primary key reads and writes). You have the option of storing tables (or even columns) in memory or on disk, and by choosing the memory option (with changes checkpointed to disk in the background) transactions can be very quick.

MySQL Cluster can be more complex to set up than a single MySQL server but it can prevent you having to implement sharding or read/write splitting in your application. Swings and roundabouts.
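For comparison, this is roughly the kind of read/write splitting logic an application needs with a master/slave replication setup like the one in the question. It is only a sketch: the `heavy:` flag and the `host_for` helper are hypothetical, and the host names are taken from the question.

```ruby
# Host names from the question's setup.
HOSTS = { master: "mysql01", slave: "mysql02" }.freeze

# Picks a server for a statement. Only read-only statements that the caller
# has marked as heavy (long reports, dumps) go to the slave; everything
# else, including all writes, goes to the master.
def host_for(sql, heavy: false)
  read_only = sql.lstrip.match?(/\A(select|show)\b/i)
  read_only && heavy ? HOSTS[:slave] : HOSTS[:master]
end

puts host_for("SELECT * FROM orders WHERE id = 42")
puts host_for("SELECT sku, SUM(qty) FROM order_items GROUP BY sku", heavy: true)
puts host_for("UPDATE inventory SET qty = qty - 1 WHERE sku = 'A1'")
```

With Cluster, this routing decision goes away because every MySQL Server can take both reads and writes.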

To get the best performance and scalability out of MySQL Cluster you may need to tweak your application (see the Cluster performance tuning white paper: http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster_perfomance.php). If you own the application this isn't normally a big deal, but if you're using someone else's application that you can't modify then it could be a problem.

A final note is that it doesn't need to be all or nothing. You can choose to store some of your tables in Cluster and some using other storage engines; this is a per-table option. You can also replicate between Cluster and other storage engines (for example, use Cluster for your run-time database and then replicate to InnoDB to generate complex reports).
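The per-table choice is made with the ENGINE clause of CREATE TABLE. The small Ruby script below just prints example DDL; the table names and the single-column schema are made up for illustration.

```ruby
# Hypothetical tables: run-time data in Cluster, reporting data in InnoDB.
ENGINES = {
  "sessions"       => "NDBCLUSTER",
  "orders"         => "NDBCLUSTER",
  "monthly_report" => "InnoDB"
}.freeze

# Builds a CREATE TABLE statement with a placeholder one-column schema.
def create_table_sql(table, engine)
  "CREATE TABLE #{table} (id INT NOT NULL PRIMARY KEY) ENGINE=#{engine};"
end

ENGINES.each { |table, engine| puts create_table_sql(table, engine) }
```

In a Rails migration, the same clause can usually be passed through the `:options` argument of `create_table`, e.g. `options: "ENGINE=NDBCLUSTER"`.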

Mat Keep answered Sep 28 '22 11:09