Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cassandra is much slower than Mysql for simple operations?

I see a lot of statements like: "Cassandra very fast on writes", "Cassandra has reads really slower than writes, but much faster than Mysql"

On my windows7 system: I installed Mysql of default configuration. I installed PHP5 of default configuration. I installed Casssandra of default configuration.

Making simple write test on mysql: "INSERT INTO wp_test (id,title) VALUES ('id01','test')" gives me result: 0.0002(s) For 1000 inserts: 0.1106(s)

Making simple same write test on Cassandra: $column_faily->insert('id01',array('title'=>'test')) gives me result of: 0.005(s) For 1000 inserts: 1.047(s)

For reads tests i also got that Cassandra is much slower than mysql.

So the question, does this sounds correct that i have 5ms for one write operation on Cassadra? Or something is wrong and should be at least 0.5ms.

like image 216
Ivan Gusev Avatar asked Jan 13 '12 12:01

Ivan Gusev


People also ask

Why is Cassandra faster than MySQL?

Both horizontal and vertical scalability is an option, as Cassandra uses a linear model for faster responses. Along with scalability, the data storage is flexible. Because it is a NoSQL database, it can deal with structured, unstructured, or semi-structured data. In the same way, the data distribution is flexible.

Why is Cassandra fast?

The write is also replicated to multiple other nodes, so if one node loses its Memtable data, there are mechanisms in place for eventual consistency. Writing to in-memory data structure is much faster than writing to disk. Because of this, Cassandra writes are extremely fast!

Is Cassandra faster than RDBMS?

Major reason behind Cassandra's extremely faster writes is its storage engine. Cassandra uses Log-structured merge trees, whereas traditional RDBMS uses B+ Trees as underlying data structure. If you notice "B", you will find that Oracle just like MySQL has to read before write.

Is Cassandra better than SQL?

As Cassandra facilitates automatic distribution of data, it allows relatively fast data transfer to or from the storage. As SQL allows manual distribution of data, the speed of data transfer, in this case, is relatively slow than of Cassandra.


1 Answers

When people say "Cassandra is faster than MySQL", they mean when you are dealing with terabytes of data and many simultaneous users. Cassandra (and many distributed NoSQL databases) is optimized for hundreds of simultaneous readers and writers on many nodes, as opposed to MySQL (and other relational DBs) which are optimized to be really fast on a single node, but tend to fall to pieces when you try to scale them across multiple nodes. There is a generalization of this trade-off by the way- the absolute fastest disk I/O is plain old UNIX flat files, and many latency-sensitive financial applications use them for that reason.

If you are building the next Facebook, you want something like Cassandra because a single MySQL box is never going to stand up to the punishment of thousands of simultaneous reads and writes, whereas with Cassandra you can scale out to hundreds of data nodes and handle that load easily. See scaling up vs. scaling out.

Another use case is when you need to apply a lot of batch processing power to terabytes or petabytes of data. Cassandra or HBase are great because they are integrated with MapReduce, allowing you to run your processing on the data nodes. With MySQL, you'd need to extract the data and spray it out across a grid of processing nodes, which would consume a lot of network bandwidth and entail a lot of unneeded complication.

like image 114
Chris Shain Avatar answered Nov 02 '22 22:11

Chris Shain