Is Cassandra suitable to use as a primary data store?

Question

I'm evaluating a storage platform for an upcoming project and keep coming back to Cassandra. For this project, losing any amount of data is unacceptable. So far we've used a relational database (Microsoft SQL Server), but the data is so varied and large that it has become difficult to store and query.

Is Cassandra robust enough to use as a primary data store? Or should it only be used to mirror existing data to speed up access?

jbellis · Accepted Answer

Anecdotally: yes. Twitter, Digg, Ooyala, SimpleGeo, Mahalo, and others are using or moving to Cassandra as a primary data store (http://n2.nabble.com/Cassandra-users-survey-td4040068.html).

Technically: yes. Besides supporting replication (including to multiple datacenters), each Cassandra node has an fsync'd commit log to make sure writes are durable. From there, writes are turned into SSTables, which are immutable until compaction (which combines multiple SSTables and garbage-collects old versions). Snapshotting is supported at any time, including automatic snapshot-before-compaction.
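To make the write path described above more concrete, here is a minimal toy sketch in Python. It is not Cassandra's code or API: the class `ToyStore`, its method names, the JSON file layout, and the logical clock are all made up for illustration. It also includes an in-memory buffer (a memtable) between the commit log and the on-disk tables, which the answer does not mention explicitly. The sketch only shows the general idea: fsync the log before acknowledging a write, flush to immutable files, and let compaction merge those files while dropping superseded versions.

```python
import json
import os
import tempfile


class ToyStore:
    """Toy log-structured store (hypothetical; not Cassandra's implementation)."""

    def __init__(self, directory):
        self.directory = directory
        self.commit_log = open(os.path.join(directory, "commit.log"), "a")
        self.memtable = {}   # in-memory buffer: key -> (timestamp, value)
        self.sstables = []   # paths of immutable on-disk tables, oldest first
        self.clock = 0       # logical timestamp used to order versions
        self.next_id = 0     # counter so table file names never collide

    def write(self, key, value):
        self.clock += 1
        record = {"key": key, "ts": self.clock, "value": value}
        # Durability step: append to the log and force it to disk
        # before the write is considered accepted.
        self.commit_log.write(json.dumps(record) + "\n")
        self.commit_log.flush()
        os.fsync(self.commit_log.fileno())
        self.memtable[key] = (self.clock, value)

    def _write_table(self, table):
        # Write a key-sorted table to a new file; it is never modified afterwards.
        path = os.path.join(self.directory, f"sstable-{self.next_id}.json")
        self.next_id += 1
        with open(path, "w") as f:
            json.dump(dict(sorted(table.items())), f)
            f.flush()
            os.fsync(f.fileno())
        self.sstables.append(path)

    def flush(self):
        """Turn the current memtable into a new immutable table."""
        if self.memtable:
            self._write_table(self.memtable)
            self.memtable = {}

    def compact(self):
        """Merge all tables into one, keeping only the newest version per key."""
        if not self.sstables:
            return
        merged = {}
        for path in self.sstables:
            with open(path) as f:
                for key, (ts, value) in json.load(f).items():
                    if key not in merged or ts > merged[key][0]:
                        merged[key] = (ts, value)
        old_tables = self.sstables
        self.sstables = []
        self._write_table(merged)
        for path in old_tables:
            os.remove(path)  # old versions are garbage-collected here

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key][1]
        # Newer tables shadow older ones, so search newest first.
        for path in reversed(self.sstables):
            with open(path) as f:
                data = json.load(f)
            if key in data:
                return data[key][1]
        return None

    def close(self):
        self.commit_log.close()


if __name__ == "__main__":
    store = ToyStore(tempfile.mkdtemp())
    store.write("user:1", "alice")
    store.flush()
    store.write("user:1", "alice-updated")
    store.flush()
    store.compact()
    print(store.read("user:1"))
    store.close()
```

The toy leaves out everything that makes the real system robust in practice (replaying the commit log after a crash, replication, tombstones for deletes, snapshots), so treat it only as an illustration of why an fsync'd log plus immutable files gives durable writes.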

Irfan · Answer

Whether to use Cassandra for your application depends on your data workloads. Cassandra is optimised for write-intensive workloads, so it is suitable for applications where a large amount of data needs to be inserted (such as infrastructure logging information at Facebook).

If, however, you require fast retrievals and insertion speed is not an issue, then perhaps you should have a look at, say, HBase (which is optimised for read-intensive workloads).

Tags:

nosql

cassandra

John Clayton
