I'm new to NoSQL, so I'm trying to understand some Cassandra concepts that I couldn't really get from the dozens of sources I have studied. <ol> <li>Should I consider wide rows and dynamic columns synonyms, or are they two different concepts?</li> <li>Am I correct in thinking of columns of collection types as wide rows?</li> <li>It seems to me that the wide row is a concept from earlier versions of Cassandra that can be created only through the Thrift API, whereas collection types are the modern version of wide rows. Is that right?</li> <li>Are collection types still limited to 64k elements, or was that limitation removed after CQL 3?</li> </ol>


Cassandra Wide Row/Dynamic Columns

1 Answer

A common misunderstanding is that CQL does not support dynamic columns or wide rows. On the contrary, CQL was designed to support everything you can do with the Thrift model, but make it easier and more accessible.

Let's take a look at the CQL table below.

CREATE TABLE data (
  sensor_id int,
  collected_at timestamp,
  volts float,
  PRIMARY KEY (sensor_id, collected_at)
);

After inserting some data, the table looks like this:

sensor_id | collected_at             | volts
----------+--------------------------+-------
   1      | 2013-06-05 15:11:00-0500 |   3.1
   1      | 2013-06-05 15:11:10-0500 |   4.3
   1      | 2013-06-05 15:11:20-0500 |   5.7
   2      | 2013-06-05 15:11:00-0500 |   3.2
   3      | 2013-06-05 15:11:00-0500 |   3.3
   3      | 2013-06-05 15:11:10-0500 |   4.3

Here the clustering column collected_at plays the same role as a Thrift dynamic column (Q.1).

If we look at the internal structure of this table:

RowKey: 1
=> (cell=2013-06-05 15:11:00-0500, value=3.1, timestamp=1370463146717000)
=> (cell=2013-06-05 15:11:10-0500, value=4.3, timestamp=1370463282090000)
=> (cell=2013-06-05 15:11:20-0500, value=5.7, timestamp=1370463282093000)
-------------------
RowKey: 2
=> (cell=2013-06-05 15:11:00-0500, value=3.2, timestamp=1370463332361000)
-------------------
RowKey: 3
=> (cell=2013-06-05 15:11:00-0500, value=3.3, timestamp=1370463332365000)
=> (cell=2013-06-05 15:11:10-0500, value=4.3, timestamp=1370463332368000)
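
The mapping from the CQL rows to the internal layout above can be sketched in a few lines of Python. This is only a toy model, not Cassandra's storage code: the names `CQL_ROWS` and `to_wide_rows` are made up for illustration, timestamps are kept as plain strings, and the write timestamps from the dump are omitted.

```python
from collections import OrderedDict

# Rows as CQL presents them: (sensor_id, collected_at, volts).
CQL_ROWS = [
    (1, "2013-06-05 15:11:00-0500", 3.1),
    (1, "2013-06-05 15:11:10-0500", 4.3),
    (1, "2013-06-05 15:11:20-0500", 5.7),
    (2, "2013-06-05 15:11:00-0500", 3.2),
    (3, "2013-06-05 15:11:00-0500", 3.3),
    (3, "2013-06-05 15:11:10-0500", 4.3),
]


def to_wide_rows(rows):
    """Group CQL rows by partition key; each clustering value becomes a cell name."""
    partitions = OrderedDict()
    for sensor_id, collected_at, volts in rows:
        cells = partitions.setdefault(sensor_id, {})
        cells[collected_at] = volts  # cell name -> cell value
    # Cells inside one partition are ordered by the clustering column.
    # Sorting the strings works here only because they share one fixed format.
    return OrderedDict(
        (key, sorted(cells.items())) for key, cells in partitions.items()
    )


if __name__ == "__main__":
    for row_key, cells in to_wide_rows(CQL_ROWS).items():
        print(f"RowKey: {row_key}")
        for name, value in cells:
            print(f"=> (cell={name}, value={value})")
```

Each partition key ends up with one internal row whose width grows with the number of distinct collected_at values, which is what "wide row" refers to.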

You can see that the clustering column collected_at makes this table a wide-row table (Q.1): every distinct collected_at value adds another cell to the internal row of its sensor_id.

So we can say that if a table has one or more clustering columns, we can call it a wide-row table.

Let's take another example:

CREATE TABLE example (
    key1 text PRIMARY KEY,
    map1 map<text,text>,
    list1 list<text>,
    set1 set<text>
);

Insert a row:

 key1 | list1             | map1                                         | set1
------+-------------------+----------------------------------------------+-----------------------
 john | ['doug', 'scott'] | {'doug': '555-1579', 'patricia': '555-4326'} | {'patricia', 'scott'}

Now take a look at the internal structure:

RowKey: john
=> (column=, value=, timestamp=1374683971220000)
=> (column=map1:doug, value='555-1579', timestamp=1374683971220000)
=> (column=map1:patricia, value='555-4326', timestamp=1374683971220000)
=> (column=list1:26017c10f48711e2801fdf9895e5d0f8, value='doug', timestamp=1374683971220000)
=> (column=list1:26017c12f48711e2801fdf9895e5d0f8, value='scott', timestamp=1374683971220000)
=> (column=set1:'patricia', value=, timestamp=1374683971220000)
=> (column=set1:'scott', value=, timestamp=1374683971220000)

You can see that each map key and each set element is stored as part of a dynamic column name, while each map value and each list element is stored as the value of its column. So a collection column behaves much like a wide row (Q.2).
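
As a rough illustration of that layout, the following Python sketch flattens one row's collections into (cell name, value) pairs. It is a toy model under stated assumptions, not Cassandra's implementation: `flatten_row` is a hypothetical helper, `uuid.uuid1()` merely stands in for the time-based identifiers seen in the list cell names, and write timestamps are left out.

```python
import uuid


def flatten_row(columns):
    """Turn one row's collection columns into a list of (cell_name, value) pairs."""
    cells = [("", "")]  # the empty cell that appears first in the dump above
    for name, value in columns.items():
        if isinstance(value, dict):
            # map: the key goes into the cell name, the map value is the cell value
            for k, v in sorted(value.items()):
                cells.append((f"{name}:{k}", v))
        elif isinstance(value, list):
            # list: a generated id goes into the cell name, the element is the value
            for item in value:
                cells.append((f"{name}:{uuid.uuid1().hex}", item))
        elif isinstance(value, (set, frozenset)):
            # set: the element itself is the cell name, the cell value stays empty
            for item in sorted(value):
                cells.append((f"{name}:{item}", ""))
        else:
            cells.append((name, value))
    return cells


if __name__ == "__main__":
    row = {
        "map1": {"doug": "555-1579", "patricia": "555-4326"},
        "list1": ["doug", "scott"],
        "set1": {"patricia", "scott"},
    }
    for cell_name, cell_value in flatten_row(row):
        print(f"=> (column={cell_name}, value={cell_value})")
```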

And the last one (Q.4): the size of a map key or a set element is limited to 64K. The documented limits are:

Collection (List): collection limit: ~2 billion (2^31); values size: 65535 (2^16-1)
Collection (Set): collection limit: ~2 billion (2^31); values size: 65535 (2^16-1)
Collection (Map): collection limit: ~2 billion (2^31); number of keys: 65535 (2^16-1); values size: 65535 (2^16-1)
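
If you want to guard against these limits on the client side, a check along the following lines is one option. It is a minimal sketch: `check_map` is a hypothetical helper, it assumes a map<text,text> column, and it assumes the 65535 figure is measured in UTF-8 encoded bytes.

```python
MAX_VALUE_BYTES = 2**16 - 1  # 65535, "values size" from the list above
MAX_MAP_KEYS = 2**16 - 1     # 65535, "number of keys" for a map


def check_map(mapping):
    """Return a list of limit violations for a map<text,text> value (empty if none)."""
    problems = []
    if len(mapping) > MAX_MAP_KEYS:
        problems.append(f"too many keys: {len(mapping)}")
    for key, value in mapping.items():
        for label, text in (("key", key), ("value", value)):
            size = len(text.encode("utf-8"))  # assumption: limit counts encoded bytes
            if size > MAX_VALUE_BYTES:
                problems.append(f"{label} of entry {key[:20]!r} is {size} bytes")
    return problems


if __name__ == "__main__":
    print(check_map({"doug": "555-1579", "patricia": "555-4326"}))
    print(check_map({"notes": "x" * 70000}))
```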

Sources:
https://www.datastax.com/blog/2013/06/does-cql-support-dynamic-columns-wide-rows
https://teddyma.gitbooks.io/learncassandra/content/model/cql_and_data_structure.html
http://docs.datastax.com/en/cql/3.3/cql/cql_reference/refLimits.html


answered Oct 30 '22 03:10

Ashraful Islam


Tags:

cassandra

cql

cql3

cassandra-3.0

dynamic-columns

user1888243
