
Membase vs. Cassandra?

Which is the better NoSQL database for most applications?


Both Cassandra (0.7.x) and Membase:

  • Are key-value databases
  • Are fast
  • Are horizontally scalable
  • Can be coupled with Hadoop for MapReduce processing
  • Support increment and decrement

Cassandra has selectable per-query durability/consistency guarantees

Cassandra has BigTable column support

Membase has asynchronous (immediate return) writes


Beyond the consistency guarantees, why would you choose one over the other?

MartysMind asked Jan 10 '11 05:01





1 Answer

Cassandra offers rows broken up into columns that can be indexed, updated efficiently and independently (instead of having to rewrite the whole row/object), and used as materialized views (unlike relational rows, Cassandra column names can be determined dynamically at runtime).
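To make the column model a bit more concrete, here is a toy Python sketch of a single wide row. It is not Cassandra's API or storage engine; the `WideRow` class and its method names are made up for illustration. It only shows the two ideas above: one column can be written without touching the rest of the row, and column names (timestamps here) can be chosen at runtime and read back as a sorted slice.

```python
from bisect import insort


class WideRow:
    """Toy model of one row: a sorted map of column name -> value.

    Hypothetical illustration only, not how Cassandra stores data.
    """

    def __init__(self):
        self._names = []    # column names, kept in sorted order
        self._values = {}   # column name -> value

    def put(self, name, value):
        # Only this one column is written; other columns are left alone.
        if name not in self._values:
            insort(self._names, name)
        self._values[name] = value

    def slice(self, start, end):
        # Range read over column names, inclusive on both ends.
        return [(n, self._values[n]) for n in self._names if start <= n <= end]


# Column names are decided at runtime (timestamps), so the row acts
# like a small precomputed, time-ordered view.
timeline = WideRow()
timeline.put("2011-01-10T05:01", "question posted")
timeline.put("2011-01-09T12:00", "draft saved")
timeline.put("2011-01-10T05:01", "question edited")  # updates one column
print(timeline.slice("2011-01-09T00:00", "2011-01-10T23:59"))
```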

Cassandra offers fully multi-master replication across multiple datacenters, configurable per-keyspace. (E.g., I want 3 copies of data set X in the North America datacenter and 1 copy in Europe, but just 2 copies of data set Y in North America.)
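As a rough sketch of what that per-keyspace setting looks like, the Python helper below builds a keyspace definition using `NetworkTopologyStrategy`. Note the assumptions: the statement uses the current CQL syntax rather than the 0.7-era configuration, the helper `create_keyspace_cql` is hypothetical, and the keyspace and datacenter names are placeholders that would have to match whatever your cluster's snitch reports.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement with per-datacenter replica counts.

    Hypothetical helper: it only formats a string and does not talk to a cluster.
    """
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += ["'%s': %d" % (dc, count) for dc, count in replicas_per_dc.items()]
    return "CREATE KEYSPACE %s WITH replication = {%s};" % (keyspace, ", ".join(parts))


# Data set X: 3 copies in North America, 1 in Europe.
print(create_keyspace_cql("data_set_x", {"north_america": 3, "europe": 1}))
# Data set Y: 2 copies in North America only.
print(create_keyspace_cql("data_set_y", {"north_america": 2}))
```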

It's incorrect to say that "Cassandra is geared more towards writes than reads." The difference is that both are very fast with Cassandra, unlike most systems that are only fast at reads.

FWIW, Cassandra used to offer asynchronous writes, but we took it out because when you get to the limit of your capacity your choices are (1) running the server into the ground or (2) dropping requests with no feedback to the client that this is what happened. This isn't worth the very small performance increase.
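The trade-off can be illustrated with a toy model in Python. This is not Cassandra's or Membase's actual write path; the bounded queue stands in for server capacity, and `write_async`, `write_acknowledged`, and `Overloaded` are hypothetical names. The point is only that a fire-and-forget write has no way to tell the client it was dropped, while an acknowledged write can push the failure back.

```python
import queue


class Overloaded(Exception):
    """Raised so the client learns that the write was rejected."""


def write_async(pending, item):
    # Fire-and-forget: returns immediately. At capacity the write is
    # dropped and the client never finds out.
    try:
        pending.put_nowait(item)
    except queue.Full:
        pass


def write_acknowledged(pending, item):
    # The client waits for an answer, so overload becomes a visible error
    # that it can retry or back off from.
    try:
        pending.put_nowait(item)
    except queue.Full:
        raise Overloaded("server at capacity, write rejected")


pending = queue.Queue(maxsize=1)   # pretend the server can hold one write
write_async(pending, "a")          # accepted
write_async(pending, "b")          # silently lost
try:
    write_acknowledged(pending, "c")
except Overloaded as err:
    print("client sees:", err)
```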

jbellis answered Jan 01 '23 23:01
