I have been recently looking at nosql solutions for our quite big upcoming database and found that cassandra is good but there are very less resources available online about new releases of cassandra and most of the blogs and articles are related to 0.6 version while now it has also implemented support for hadoop and hive. While on the other hand mysql cluster version is also specifically made to run on horizontal scaled setup using commodity servers.
As we are used to relational model for years and moving to cassandra will need decompiling of brain while the product is still not very mature and community is not also that big to respond quickly to any particular problem I have checked datastax(on of the professional support providers) website and their forums are pretty much dead.
So, how to compare mysql cluster vs cassandra while putting relational and non-relational comparison put aside?
Though cassandra is schema less but still it provies pretty much tabular features like super colum and sub column too so record can be searched from multiple column values.
I have also tried my best to find out how cassandra physically stores updated queries like for a row when a sub column is edited and added quite a big chunk of data then how it physically stores that record and how it accesses that record fast? Because in mysql columns have fixed length allocated so its not a big issue.
Cassandra is designed to remain available if one of it's nodes is down or unreachable. However, when a node is down or unreachable, it needs to eventually discover the writes it missed.
As we said earlier, each instance of Cassandra has evolved to contain 256 virtual nodes. The Cassandra server runs core processes. For example, processes like spreading replicas around nodes or routing requests.
It is designed for mission critical applications. MySQL Cluster has replication between clusters across multiple geographical sites built-in. A shared nothing architecture with data locality awareness make it the perfect choice for running on commodity hardware and in globally distributed cloud infrastructure.
In Cassandra, the data itself is automatically distributed, with (positive) performance consequences. It accomplishes this using partitions. Each node owns a particular set of tokens, and Cassandra distributes data based on the ranges of these tokens across the cluster.
Here are some areas where I suspect Cassandra has an advantage:
To elaborate on the last a little, most people who haven't actually run Cassandra on a multi-node cluster, don't realize just how well Cassandra has been designed for this. For a two minute taste, see Jake Luciani's demo.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With