Which is the better NoSQL database for most applications?
Both Cassandra (0.7x) and Membase:
Cassandra has selectable per query durability/consistency guarantees
Cassandra has BigTable column support
Membase has asynchronous (immediate return) writes
Beyond the consistency guarantees, why would you choose one over the other?
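For readers unfamiliar with what "selectable per query consistency" buys you, here is a rough sketch of the usual quorum arithmetic. It assumes the common rule that a read is guaranteed to see the latest acknowledged write when the read and write replica counts together exceed the replication factor (R + W > N). The level names ONE, QUORUM and ALL are real Cassandra consistency levels; the helper functions are made up for illustration and are not part of any driver.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge a request at the given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def read_sees_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when any read replica set must overlap any write replica set (R + W > N)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ALL", "ONE")]:
        print(write_level, read_level, read_sees_latest_write(write_level, read_level, 3))
```

The point is that the trade-off is made per request: a cheap ONE/ONE pair for data that can be briefly stale, QUORUM/QUORUM where a read must reflect the last write.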
Though Facebook has all but abandoned Cassandra, the technology has gone on to power critical web infrastructure at companies like Twitter, Netflix, and even Apple. And DataStax has built a version of the tool for all sorts of other businesses.
Cassandra doesn't support a relational schema with foreign keys and join tables. So if you want to write a lot of complex join queries, then Cassandra might not be the right database for you.
Cassandra's largest production deployments include Apple, with over 160,000 instances and 100 petabytes of data across 1,000+ clusters; Huawei, with over 30,000 instances across 300+ clusters; and Netflix, with over 10,000 instances and 6 petabytes across 100+ clusters, serving over 1 trillion requests per day.
Cassandra uses a synthesis of well-known techniques to achieve scalability and availability. Cassandra was designed to fulfill the storage needs of the Inbox Search problem. Inbox Search is a feature that enables users to search through their Facebook Inbox.
Cassandra offers rows broken up into columns that can be indexed, efficiently updated independently (instead of having to rewrite the whole row/object), and used as materialized views (unlike relational rows, Cassandra column names can be determined dynamically at runtime).
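To make the column model a bit more concrete, here is a toy in-memory sketch. It is not how Cassandra stores data; it only illustrates a row whose column names are chosen at runtime, kept sorted, and updated one at a time. The class name WideRow and its methods are invented for this example.

```python
from bisect import insort


class WideRow:
    """Toy model of one row: a sorted map from column name to value."""

    def __init__(self):
        self._columns = {}
        self._names = []  # column names, kept in sorted order

    def put(self, name, value):
        """Insert or overwrite a single column without touching the others."""
        if name not in self._columns:
            insort(self._names, name)
        self._columns[name] = value

    def slice(self, start, end):
        """Return (name, value) pairs with start <= name <= end, in name order."""
        return [(n, self._columns[n]) for n in self._names if start <= n <= end]


if __name__ == "__main__":
    row = WideRow()
    # Column names are data (here, dates), not a schema fixed in advance.
    row.put("2011-01-03", "c")
    row.put("2011-01-01", "a")
    row.put("2011-01-02", "b")
    row.put("2011-01-02", "B")  # updates one column only
    print(row.slice("2011-01-02", "2011-01-03"))
```

Because the columns stay sorted by name, a range read over one row behaves like a small precomputed view, which is the "materialized view" use mentioned above.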
Cassandra offers fully multi-master replication across multiple datacenters, configurable per keyspace. (E.g., I want 3 copies of data set X in the North America datacenter and 1 copy in Europe, but for data set Y I want just 2 copies in North America.)
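As a sketch of what that per-keyspace setting can look like, the snippet below builds CQL statements using NetworkTopologyStrategy. This is the syntax of later CQL-based releases, not necessarily how a 0.7.x cluster was configured. The keyspace names and the datacenter names north_america and europe are placeholders; real datacenter names have to match whatever the cluster's snitch reports. The helper function is hypothetical.

```python
def create_keyspace_cql(keyspace: str, replicas_per_dc: dict) -> str:
    """Build a CREATE KEYSPACE statement with a replica count per datacenter."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {count}" for dc, count in replicas_per_dc.items()]
    options = ", ".join(parts)
    return f"CREATE KEYSPACE {keyspace} WITH replication = {{{options}}};"


if __name__ == "__main__":
    # Data set X: 3 copies in North America, 1 in Europe.
    print(create_keyspace_cql("dataset_x", {"north_america": 3, "europe": 1}))
    # Data set Y: 2 copies in North America only.
    print(create_keyspace_cql("dataset_y", {"north_america": 2}))
```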
It's incorrect to say that "Cassandra is geared more towards writes than reads." The difference is that both are very fast with Cassandra, unlike most systems that are only fast at reads.
FWIW, Cassandra used to offer asynchronous writes, but we took it out because when you get to the limit of your capacity your choices are (1) running the server into the ground or (2) dropping requests with no feedback to the client that this is what happened. This isn't worth the very small performance increase.
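The trade-off described here can be shown with a small simulation. This is a toy model built on Python's standard queue module, not Cassandra or Membase code: a bounded queue stands in for server capacity, and both writer classes are invented for the example. The fire-and-forget writer returns immediately and silently drops work once the queue is full; the acknowledged writer tells the caller when it could not accept a write.

```python
import queue


class FireAndForgetWriter:
    """Returns immediately; the client never learns whether the write was kept."""

    def __init__(self, capacity: int):
        self._pending = queue.Queue(maxsize=capacity)
        self.dropped = 0  # visible to the server only, not to the client

    def write(self, item) -> None:
        try:
            self._pending.put_nowait(item)
        except queue.Full:
            self.dropped += 1  # silently dropped at capacity


class AcknowledgedWriter:
    """Reports back to the client whether the write was accepted."""

    def __init__(self, capacity: int):
        self._pending = queue.Queue(maxsize=capacity)

    def write(self, item) -> bool:
        try:
            self._pending.put_nowait(item)
        except queue.Full:
            return False  # client can back off or retry
        return True


if __name__ == "__main__":
    fast = FireAndForgetWriter(capacity=2)
    print([fast.write(i) for i in range(3)], "dropped:", fast.dropped)
    safe = AcknowledgedWriter(capacity=2)
    print([safe.write(i) for i in range(3)])
```

In the first case every call looks identical to the client even though one write was lost; in the second the overload is visible, which is the feedback the answer argues is worth more than the small latency win.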