
What does "Document-oriented" vs. Key-Value mean when talking about MongoDB vs Cassandra?


A key-value store provides the simplest possible data model and is exactly what the name suggests: a storage system that stores values indexed by a key. You can only query by key, and the values are opaque: the store knows nothing about their contents. This allows very fast reads and writes (in the simplest case, a single disk access), and I see this model as a kind of non-volatile cache (i.e. well suited if you need fast access by key to long-lived data).
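To make "opaque values" concrete, here is a minimal in-memory sketch. The `KeyValueStore` class and its method names are made up for illustration and are not the API of any real product; the point is only that the store hands back bytes and the application has to decode them.

```python
import json
from typing import Dict, Optional


class KeyValueStore:
    """Toy key-value store (hypothetical): values are opaque bytes."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # The store never inspects `value`; it only files it under `key`.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Lookup by key is the only query the store supports.
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


store = KeyValueStore()
post = {"title": "Hello", "tags": ["nosql", "intro"]}

# The application serializes the value before storing it...
store.put("post:1", json.dumps(post).encode("utf-8"))

# ...and deserializes it after reading it back by key.
raw = store.get("post:1")
decoded = json.loads(raw) if raw is not None else None
```

Note that there is no way to ask this store for "all posts tagged nosql": it would have to understand the value to answer that.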

A document-oriented database extends the previous model: values are stored in a structured format (a document, hence the name) that the database can understand. For example, a document could be a blog post together with its comments and tags, stored in a denormalized way. Since the data is transparent to the store, it can do more work (like indexing fields of the document) and you're not limited to querying by key. As I hinted, such databases let you fetch an entire page's data with a single query and are well suited for content-oriented applications (which is why big sites like Facebook or Amazon like them).
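As a rough sketch of what "the store can do more work" means, the toy class below keeps documents as dictionaries and maintains a secondary index on a `tags` field. `DocumentStore`, `find_by_tag`, and the field names are assumptions for illustration only; real document databases expose this very differently.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Set


class DocumentStore:
    """Toy document store (hypothetical) with a secondary index on 'tags'."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}
        self._tag_index: Dict[str, Set[str]] = defaultdict(set)

    def insert(self, doc_id: str, doc: Dict[str, Any]) -> None:
        # Drop index entries of a previous version of this document, if any.
        old = self._docs.get(doc_id)
        if old is not None:
            for tag in old.get("tags", []):
                self._tag_index[tag].discard(doc_id)
        self._docs[doc_id] = doc
        # Because the store understands the document, it can index a field.
        for tag in doc.get("tags", []):
            self._tag_index[tag].add(doc_id)

    def get(self, doc_id: str) -> Optional[Dict[str, Any]]:
        return self._docs.get(doc_id)

    def find_by_tag(self, tag: str) -> List[Dict[str, Any]]:
        # A query that is not a lookup by primary key.
        return [self._docs[i] for i in sorted(self._tag_index.get(tag, set()))]


posts = DocumentStore()
posts.insert("post:1", {
    "title": "Document stores",
    "tags": ["nosql", "mongodb"],
    "comments": [{"author": "alice", "text": "Nice overview"}],
})
posts.insert("post:2", {"title": "Key-value stores", "tags": ["nosql"], "comments": []})

# One lookup returns the post with its comments and tags (denormalized).
page = posts.get("post:1")
nosql_titles = [d["title"] for d in posts.find_by_tag("nosql")]
mongodb_titles = [d["title"] for d in posts.find_by_tag("mongodb")]
```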

Other kinds of NoSQL databases include column-oriented stores, graph databases and even object databases. But this goes beyond the question.

See also

  • Comparing Document Databases to Key-Value Stores
  • Analysis of the NoSQL Landscape

Well, I've been investigating NoSQL myself for the past month or so. I think it could generally be stated something like this:

  • KV stores don't know anything about the value content actually stored for a key
  • Document-based stores let you define secondary indexes within the value content, as the db knows the document structure (e.g. tags of a blog post).
  • NoSQL solutions each have specific features which should be taken into consideration, such as
    • Special datatypes in a KV store (e.g. lists with left/right push/pop, and sets, as in Redis)
    • easy scaling of a cluster up/down, as Riak says it has (I haven't tried it ... yet)
    • pluggable data store as in Voldemort
    • built-in web configuration and web app support like in CouchDB / couchapp
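
To illustrate the "special datatypes" point from the list above, here is a small sketch of a key-value store whose values are lists that support push and pop at both ends. It only imitates the idea with Python's `collections.deque`; the `ListValueStore` class is hypothetical and the method names are borrowed loosely from the Redis list commands, not a client for Redis itself.

```python
from collections import deque
from typing import Deque, Dict, Optional


class ListValueStore:
    """Toy store (hypothetical) where each key maps to a double-ended list."""

    def __init__(self) -> None:
        self._lists: Dict[str, Deque[str]] = {}

    def lpush(self, key: str, value: str) -> int:
        # Push onto the head (left); returns the new list length.
        items = self._lists.setdefault(key, deque())
        items.appendleft(value)
        return len(items)

    def rpush(self, key: str, value: str) -> int:
        # Push onto the tail (right); returns the new list length.
        items = self._lists.setdefault(key, deque())
        items.append(value)
        return len(items)

    def lpop(self, key: str) -> Optional[str]:
        # Pop from the head; None if the key is missing or the list is empty.
        items = self._lists.get(key)
        return items.popleft() if items else None

    def rpop(self, key: str) -> Optional[str]:
        # Pop from the tail; None if the key is missing or the list is empty.
        items = self._lists.get(key)
        return items.pop() if items else None


jobs = ListValueStore()
jobs.rpush("jobs", "a")
jobs.rpush("jobs", "b")
jobs.lpush("jobs", "urgent")

first = jobs.lpop("jobs")  # taken from the head
last = jobs.rpop("jobs")   # taken from the tail
```

Here the store still looks values up by key only, but it understands just enough about the value's type to offer operations on it.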