
What do you call the data model of DynamoDB and Cassandra?

The DynamoDB Wikipedia article says that DynamoDB is a "key-value" database. However, calling it a "key-value" database completely misses a fundamental feature of DynamoDB, the sort key: keys can have two parts (a partition key and a sort key), and items with the same partition key can be efficiently retrieved together, sorted by the sort key.

Cassandra also has exactly the same sorting-items-inside-a-partition feature (which it calls "clustering key"), and the Cassandra Wikipedia article uses the term wide column store to describe it. However, while this term "wide column" is better than "key-value", it is still somewhat inappropriate because it describes the more general situation where an item can have a very large number of unrelated columns - not necessarily a sorted list of separate items.

So my question is whether there is a more appropriate term for the data model of databases like DynamoDB and Cassandra: databases which, like a key-value store, can efficiently retrieve items for individual keys, but can also efficiently retrieve items sorted by the key or by just a part of it (DynamoDB's sort key or Cassandra's clustering key).
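To make the model in question concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the access pattern, not how either database is implemented; the class name `PartitionedSortedStore` and its methods are hypothetical.

```python
import bisect
from collections import defaultdict


class PartitionedSortedStore:
    """Toy model: hash lookup by partition key, sorted order by sort key."""

    def __init__(self):
        # partition key -> sorted list of sort keys in that partition
        self._keys = defaultdict(list)
        # partition key -> {sort key: value}
        self._values = defaultdict(dict)

    def put(self, partition_key, sort_key, value):
        # Insert the sort key in order only the first time it is seen;
        # a repeated put overwrites the value in place.
        if sort_key not in self._values[partition_key]:
            bisect.insort(self._keys[partition_key], sort_key)
        self._values[partition_key][sort_key] = value

    def get(self, partition_key, sort_key):
        # Plain key-value access: the full (partition, sort) key names one item.
        return self._values.get(partition_key, {}).get(sort_key)

    def query(self, partition_key, lo=None, hi=None):
        # Range access: all items of one partition, ordered by sort key,
        # optionally restricted to lo <= sort_key <= hi.
        keys = self._keys.get(partition_key, [])
        start = 0 if lo is None else bisect.bisect_left(keys, lo)
        end = len(keys) if hi is None else bisect.bisect_right(keys, hi)
        return [(k, self._values[partition_key][k]) for k in keys[start:end]]
```

The `get` method is all a pure key-value store offers; `query` is the extra capability that the term "key-value" leaves out.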

Nadav Har'El asked Mar 22 '20 10:03



1 Answer

Before CQL was introduced, Cassandra adhered more strictly to the wide column store data model, where you only had rows identified by a row key and containing sorted key/value columns. With the introduction of CQL, rows became known as partitions, and columns could optionally be grouped into logical rows via clustering keys.

Up until Cassandra 3.0, CQL was simply an abstraction on top of the original Thrift data model, and there was no concept of CQL rows within the storage engine. A partition was just a sorted set of columns, each with a compound name built from the concatenated values of the clustering keys. More details are given in this article. Now there is native support for CQL in the storage engine, which allows CQL data models to be stored more efficiently.
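As a rough illustration of the pre-3.0 layout described above, the following Python sketch flattens the logical rows of one partition into a sorted list of cells whose names are composites of the clustering values plus the column name, then regroups them. It is a simplification under my own assumptions, not the actual on-disk format; `flatten_rows` and `regroup_rows` are hypothetical helper names.

```python
from itertools import groupby


def flatten_rows(rows, clustering_columns):
    """Turn logical rows into sorted (composite_name, value) cells."""
    cells = []
    for row in rows:
        # The clustering values form the prefix of every cell name in the row.
        prefix = tuple(row[c] for c in clustering_columns)
        for name, value in row.items():
            if name not in clustering_columns:
                cells.append((prefix + (name,), value))
    # The partition is one sorted run of cells, ordered by composite name.
    cells.sort(key=lambda cell: cell[0])
    return cells


def regroup_rows(cells, clustering_columns):
    """Rebuild logical rows by grouping cells that share a clustering prefix."""
    n = len(clustering_columns)
    rows = []
    for prefix, group in groupby(cells, key=lambda cell: cell[0][:n]):
        row = dict(zip(clustering_columns, prefix))
        for name, value in group:
            row[name[n]] = value  # last component is the column name
        rows.append(row)
    return rows
```

Because the cells sort by clustering prefix first, the cells of each logical row end up adjacent, which is why a range scan over the clustering key is cheap in this layout.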

However, if you think of a CQL row as a logical grouping of columns within the same partition, Cassandra could still be considered a wide column store. In any case, there isn't, to my knowledge, another well-established term to describe this kind of database.

J.B. Langston answered Oct 05 '22 19:10