I have a requirement where i need my database to store the following data:
- For each build, store the results of 3 performance runs. The result includes tps and latency.
Reading up on cassandra data model, this directly maps to a super column family of the following format:
BenchmarkSuperColumnFamily= {
build_1: {
Run1: {1000K, 0.5ms}
Run2: {1000K, 0.5ms}
Run3: {1000K, 0.5ms}
}
build_2: {
Run1: {1000K, 0.5ms}
Run2: {1000K, 0.5ms}
Run3: {1000K, 0.5ms}
}
...
}
But, i read in the following answer that the use of Super Column family is discouraged. I wanted to know if there is a better way of creating a model for my requirement.
PS, I borrowed the JSONish notation from the following article
A column family can contain super columns or columns. A super column is a dictionary, it is a column that contains other columns (but not other super columns). A column is a tuple of name, value and timestamp (I'll ignore the timestamp and treat it as a key/value pair from now on).
Cassandra is a NoSQL database, which is a key-value store. Some of the features of Cassandra data model are as follows: Data in Cassandra is stored as a set of rows that are organized into tables. Tables are also called column families.
In analogy with relational databases, a column family is as a "table", each key-value pair being a "row". Each column is a tuple (triplet) consisting of a column name, a value, and a timestamp. In a relational database table, this data would be grouped together within a table with other non-related data.
create column family User with comparator = UTF8Type and default_validation_class = UTF8Type and column_metadata = [ {column_name : name, validation_class : UTF8Type} {column_name : username, validation_class : UTF8Type} {column_name : City, validation_class : UTF8Type} ];
The StackOverflow answer that you linked to is correct. You shouldn't be using SuperColumns in new applications. They exist for backwards compatibility however.
In general, composite columns can be used to mimic any model provided by super columns. Basically, they allow you to separate your column names into multiple parts. So if you were to specify a comparator of 'CompositeType(UTF8Type, UTF8Type)', your data model wound end up looking like this:
BenchmarkColumnFamily= {
build_1: {
(Run1, TPS) : 1000K
(Run1, Latency) : 0.5ms
(Run2, TPS) : 1000K
(Run2, Latency) : 0.5ms
(Run3, TPS) : 1000K
(Run3, Latency) : 0.5ms
}
build_2: {
...
}
...
}
With the above model you could use a single query to get a single data point for a single run, all data points for a single run, or all data points for multiple runs.
More info on composite columns: http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With