
Misunderstanding on Composite Key for Cassandra

I have to test different data models for Cassandra. I'm thinking about using a composite row key of the form key1:key2. With this configuration I can, for example, query for all rows that have a specific key1 value and any key2 value, but the reverse is impossible (getting all rows that have a specific key2 value and any key1 value). Is that right?

Thanks in advance,

Cesare

asked Mar 26 '12 13:03 by cesare



2 Answers

If you use the Order Preserving Partitioner (OPP), then yes: row keys are stored in sorted order, so you can get slices over a range of keys, e.g. A:A to A:Z, but not any:A to any:Z, because those keys are not contiguous in the sort order.
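
To make the asymmetry concrete, here is a minimal Python sketch. It is only a toy model of a sorted key space, not Cassandra client code; the sample keys and the helper names `slice_by_key1` and `scan_by_key2` are made up for illustration.

```python
from bisect import bisect_left

# Toy model of an order-preserving key space: row keys kept in sorted order.
# The sample keys are invented for this illustration.
row_keys = sorted(["A:A", "A:M", "A:Z", "B:A", "B:M", "C:A"])

def slice_by_key1(keys, key1):
    """Contiguous range scan: every key that starts with 'key1:'."""
    start = bisect_left(keys, key1 + ":")
    end = bisect_left(keys, key1 + ";")  # ';' is the character right after ':'
    return keys[start:end]

def scan_by_key2(keys, key2):
    """Keys sharing a key2 are scattered, so every key has to be inspected."""
    return [k for k in keys if k.split(":", 1)[1] == key2]

by_key1 = slice_by_key1(row_keys, "A")  # one contiguous slice
by_key2 = scan_by_key2(row_keys, "A")   # full scan over all keys
```

The key1 lookup touches a single contiguous range, while the key2 lookup has to walk the whole key space, which is why the second kind of query needs a separate index.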

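However, OPP is not guaranteed to distribute keys evenly across the nodes, so you could end up with "hot spots" where some nodes hold too many keys and others too few. You probably want to use the Random Partitioner (RP), which distributes keys across all nodes by their hash. With RP, row keys are no longer stored in sorted order, so range slices over row keys are not meaningful.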
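
A rough sketch of why hashing spreads keys out but loses their order. This is written in the spirit of a hash-based partitioner and is not Cassandra's actual code; `token` and `node_for` are hypothetical helpers, and the modulo placement is a simplification (real clusters assign token ranges to nodes).

```python
import hashlib

def token(row_key: str) -> int:
    """Hash-based token: the MD5 digest of the key read as a big integer."""
    return int.from_bytes(hashlib.md5(row_key.encode("utf-8")).digest(), "big")

def node_for(row_key: str, node_count: int) -> int:
    """Hypothetical placement rule (assumption): token modulo node count."""
    return token(row_key) % node_count

keys = ["A:A", "A:B", "A:C", "A:D"]

# Ordering by token generally has nothing to do with ordering by key,
# so neighbouring keys can land on different nodes.
by_token = sorted(keys, key=token)
placement = {k: node_for(k, 3) for k in keys}
```

Because placement depends only on the hash, keys that share a key1 prefix are not kept together, and a range such as A:A to A:Z no longer maps to a contiguous region.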

However, since columns within a row are stored sorted regardless of the partitioner, using Composite column names can be pretty powerful for accessing ranges of data.

See this question for details on querying Composite columns using Hector.
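
As an illustration of the idea, the sketch below models one wide row whose column names are two-part composites kept in sorted order. It is plain Python, not Hector or any Cassandra API; the sample data and the helper `slice_columns` are assumptions made for the example.

```python
from bisect import bisect_left

# One wide row: composite column names modelled as (first, second) tuples.
# The names and values are invented sample data.
columns = {
    ("eu", "paris"): 1,
    ("eu", "rome"): 2,
    ("us", "austin"): 3,
    ("us", "boston"): 4,
}
names = sorted(columns)  # columns are kept sorted by name

def slice_columns(first):
    """Return all columns whose first component equals `first`, as one range."""
    lo = bisect_left(names, (first,))           # sorts before any (first, x)
    hi = bisect_left(names, (first + "\x00",))  # sorts after every (first, x)
    return [(n, columns[n]) for n in names[lo:hi]]

eu_columns = slice_columns("eu")
```

Here the row key can be placed by hash while the range access happens on the sorted column names inside the row.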

If necessary, the column names could then be used as keys to do Multiget queries for additional lookups.

answered Oct 20 '22 04:10 by libjack


I hope these articles help you :)

http://pkghosh.wordpress.com/2011/03/02/cassandra-secondary-index-patterns/

http://www.datastax.com/docs/0.7/data_model/cfs_as_indexes

http://www.anuff.com/2011/02/indexing-in-cassandra.html

Also check out this question: Storing a list of values in Cassandra

answered Oct 20 '22 04:10 by Karthik S