Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL cluster questions

I've read much of the MySQL Cluster docs and some tutorials yet I still have some things not clear, and the major of them right now are:

  1. When a data node restarts (crash and goes up again) will it's data still be available? Updates/Additions will work as usual? Will it "sync"?
  2. Does a cluster work faster than standalone? In particular, I update rows many times, but one at a time, meaning network latency might have impact on the performance. Is there any pattern I can follow to make things faster, such as adding more SQL nodes or adding mode data nodes?

Regarding question #2, an update of a row is in the following syntax:

UPDATE db_accounts.tbl_items SET items=items+%lld WHERE id_account=%u

"id_account" is an index (unique).

like image 228
Poni Avatar asked Mar 08 '10 13:03

Poni


2 Answers

MySQL Cluster is an in-memory database (although some columns can be stored on disk, indexed columns cannot). If the plug gets pulled, your data goes away. The recovery process for a node re-joining the cluster is that it pulls the data it lost from a surviving node (requiring good fast links between nodes) and then carefully applies replication events until it catches up and can actively participate. If a recent backup is available, it can be rebuilt from that instead of another node, but the principle is the same: the node has to be repopulated with data from scratch.

MySQL Cluster is basically a distributed hash table. The NDB node that holds a particular row of data is determined by a hash algorithm applied to the primary key. Performance increases by adding nodes, assuming that your data spreads nicely across the nodes. Performance can be badly affected if queries have to touch multiple nodes - ie. complex joins - but is lightning fast for retrieving a specific row given its primary key.

Obviously, given that the nodes are distributed, a slow or congested network will badly affect performance.

like image 129
Martin Avatar answered Nov 02 '22 11:11

Martin


Even if your MySQL Cluster table is in-memory, by default any writes are asynchronously checkpointed to disk (can turn this off on a per-table basis).

If the entire Cluster failed (power to the data center lost) then when you bring it back up the data will be retrieved from those disk checkpoints. Downside is that as they were created asynchronously you may be missing a handful of updates.

If a single data node fails then as well as recovering from its local disk copy, it catches up by applying the latest updates from its peer data node.

To add further high availability of your data you can use MySQL asynchronous replication to a second site (on the other side of the world if required).

like image 5
Andrew Morgan Avatar answered Nov 02 '22 09:11

Andrew Morgan