
Cassandra column key auto increment

I am trying to understand Cassandra and how to structure my column families (CF), but it's quite hard since I am used to relational databases.

For example, if I create a simple users CF and try to insert a new row, how can I make an incremental key like in MySQL?

I saw a lot of examples where you would just use the username instead of a unique ID, and that makes some sense, but what if I want to allow duplicate usernames?

Also, how can I run searches? From what I understand, Cassandra does not support the > operator, so something like select * from users where something > something2 would not work.

And probably the most important question: what about grouping? Would I need to retrieve all the data and then filter it in whatever language I am using? I think that would slow down my system a lot.

So basically I need a brief explanation of how to get started with Cassandra.

Linas asked Oct 03 '12 13:10



2 Answers

Your questions are quite general, but let me take a stab at it. First, you need to model your data in terms of your queries. With an RDBMS, you model your data in some normalized form, then optimize later for your specific queries. You cannot do this with Cassandra; you must write your data the way you intend to read it. Often this means writing it more than one way. In general, it helps to completely shed your RDBMS thinking if you want to work effectively with Cassandra.
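
To make "write your data the way you intend to read it" concrete, here is a minimal in-memory sketch in Python. The dictionaries only stand in for two column families; the names `users_by_id`, `users_by_city`, and `save_user` are hypothetical, and nothing here calls a real Cassandra driver.

```python
from collections import defaultdict

# Two hypothetical "column families", modeled as plain dictionaries.
users_by_id = {}                      # row key: user id -> user record
users_by_city = defaultdict(dict)     # row key: city -> {user id: username}


def save_user(user_id, username, city):
    """Write the same user twice, once per query we expect to run."""
    users_by_id[user_id] = {"username": username, "city": city}
    users_by_city[city][user_id] = username


save_user("u1", "linas", "Vilnius")
save_user("u2", "linas", "Atlanta")   # duplicate usernames are fine
save_user("u3", "maria", "Vilnius")

# Query 1: look up a user by id (one row read).
by_id = users_by_id["u2"]

# Query 2: list everyone in a city (one row read, no scan of all users).
in_vilnius = sorted(users_by_city["Vilnius"].values())
```

The cost of the second write is paid once per insert, so that neither read has to scan or join anything.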

Regarding keys:

  • They are used in Cassandra as the unit of distribution across the ring: each key is hashed and assigned an "owner" in the ring. Use the RandomPartitioner to get an even distribution.

  • Presuming you use RandomPartitioner (you should), keys are not sorted. This means you cannot ask for a range of keys. You can, however, ask for a list of keys in a single query.

  • Keys are relevant in some models and not in others. If your model requires query-by-key, you can use any unique value that your application is aware of (such as a UUID). Sometimes keys are sentinel values, such as a Unix epoch representing the start of the day. This allows you to hand Cassandra a bunch of known keys, then get a range of data sorted by column (see below).
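
The points about keys can be sketched roughly as follows, in Python. This is a simplified illustration, not Cassandra's actual token computation: the three-node ring, the node names, and the helpers `token_for`, `owner_of`, and `day_bucket_key` are all assumptions made for the example. It hashes a key with MD5 to pick an owner and builds a start-of-day sentinel key.

```python
import hashlib
from bisect import bisect_left
from datetime import datetime, timezone

RING_SIZE = 2 ** 127  # token space assumed for this sketch

# Hypothetical three-node ring: each node owns tokens up to its own token.
NODE_TOKENS = [
    (RING_SIZE // 3, "node-a"),
    (2 * RING_SIZE // 3, "node-b"),
    (RING_SIZE - 1, "node-c"),
]


def token_for(key):
    """Hash a row key to a position on the ring (simplified MD5 token)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner_of(key):
    """Return the first node whose token is >= the key's token."""
    tokens = [t for t, _ in NODE_TOKENS]
    index = bisect_left(tokens, token_for(key))
    return NODE_TOKENS[index % len(NODE_TOKENS)][1]


def day_bucket_key(ts):
    """Sentinel row key: Unix epoch of the start of the UTC day containing ts."""
    start = ts.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return str(int(start.timestamp()))


event_time = datetime(2012, 10, 3, 13, 10, tzinfo=timezone.utc)
row_key = day_bucket_key(event_time)
owner = owner_of(row_key)
```

Because the token is a hash, neighbouring keys such as consecutive days land on unrelated nodes, which is why a range of keys cannot be requested but a list of known keys can.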

Regarding query predicates:

  • You can get ranges of data presuming you model it correctly to answer your queries.

  • Since columns are written in sorted order, you can query a range from column A to column N with a slice query (which is very fast). You can also use composite columns to abstract this mechanism a bit.

  • You can use secondary indexes on columns where you have low cardinality, which gives you query-by-value functionality.

  • You can create your own indexes where the data is sorted the way you need it.
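
As a rough illustration of slice queries and hand-built indexes, the Python sketch below models one wide row whose columns stay sorted by name. The `WideRow` class and the `users_by_age` index are hypothetical stand-ins, not a Cassandra client API. The column names are (age, user id) pairs, in the spirit of composite columns.

```python
from bisect import bisect_left, bisect_right, insort


class WideRow:
    """Toy model of one row whose columns are kept sorted by column name."""

    def __init__(self):
        self._names = []    # sorted column names
        self._values = {}   # column name -> value

    def insert(self, name, value):
        if name not in self._values:
            insort(self._names, name)
        self._values[name] = value

    def slice(self, start, finish):
        """Return (name, value) pairs with start <= name <= finish, in order."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, finish)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


# Hypothetical hand-built index: one row, column name = (age, user id).
users_by_age = WideRow()
users_by_age.insert((31, "u2"), "bob")
users_by_age.insert((25, "u1"), "alice")
users_by_age.insert((42, "u3"), "carol")
users_by_age.insert((31, "u4"), "dave")

# "age > 30": slice from the first possible column with age 31 onward.
older_than_30 = users_by_age.slice((31, ""), (200, ""))
```

The range predicate becomes a contiguous read over already-sorted columns, so nothing has to be filtered on the client side.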

Regarding grouping:

I presume you're referring to creating aggregates. If you need your data in real-time, you'll want to use some external mechanism (like Storm) to track data and constantly update your relevant aggregates into a CF. If you are creating aggregates as part of a batch process, Cassandra has excellent integration with Hadoop, allowing you to write map/reduce jobs in Pig, Hive, or directly in your language of choice.
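
The real-time option amounts to updating the aggregate when the data is written rather than grouping when it is read. Here is a minimal Python sketch of that idea. The `signups_per_day` structure and `record_signup` function are hypothetical, and a plain loop stands in for whatever external stream processor would feed the updates.

```python
from collections import defaultdict

# Hypothetical aggregate "column family": row key = day, column = country.
signups_per_day = defaultdict(lambda: defaultdict(int))


def record_signup(day, country):
    """Update the aggregate at write time instead of grouping at read time."""
    signups_per_day[day][country] += 1


for day, country in [
    ("2012-10-03", "LT"),
    ("2012-10-03", "US"),
    ("2012-10-03", "LT"),
    ("2012-10-04", "US"),
]:
    record_signup(day, country)

# Reading the "GROUP BY country" result for one day is a single row lookup.
counts_for_day = dict(signups_per_day["2012-10-03"])
```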

rs_atl answered Sep 30 '22 00:09



To your first question:

"Can I make an incremental key like in MySQL?"

No, not really; auto-increment keys are not native to Cassandra. See "How to create auto increment IDs in Cassandra", or check here for more information: http://srinathsview.blogspot.ch/2012/04/generating-distributed-sequence-number.html
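
If the ID only needs to be unique rather than sequential, one common substitute (an assumption here, not something the linked post prescribes) is to generate a UUID in the application. The Python sketch below uses a dictionary as a hypothetical users column family and shows that two users can then share a username.

```python
import uuid

users = {}  # hypothetical users column family: row key -> columns


def create_user(username):
    """Store a user under a freshly generated random UUID row key."""
    user_id = str(uuid.uuid4())
    users[user_id] = {"username": username}
    return user_id


first = create_user("linas")
second = create_user("linas")   # same username, different row key
```

No coordination between nodes or clients is needed to pick the key, which is the property a central auto-increment counter cannot offer in a distributed store.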

Your second question is more about how you store and model your Cassandra data.

Check out Stack Overflow's search option. There are lots of interesting questions:

  1. Switching from MySQL to Cassandra - Pros/Cons?
  2. Cassandra Data Model
  3. Cassandra/NoSQL newbie: the right way to model?
  4. Apache Cassandra schema design
  5. Knowledge sources for Apache Cassandra

Most importantly: When NOT to use Cassandra?

sdolgy answered Sep 30 '22 01:09
