Cassandra - advantages of custom type

I am planning to use a Java object as a custom type and store it in Cassandra. I am taking two data members out of the class to form the primary key and keeping the rest of the data members in the custom type.

Data members of my class: name, date_of_birth, occupation, last_visit, family_members, total_income
Primary key: name, date_of_birth
Cassandra custom type members: occupation, last_visit, family_members, total_income

Will the custom data type have any performance benefits while writing or reading when compared to storing the individual data members in terms of Cassandra data types?

summer asked Apr 15 '15 11:04



1 Answer

Will the custom data type have any performance benefits while writing or reading when compared to storing the individual data members in terms of Cassandra data types?

Not really. The data for a user-defined type (UDT) is stored in a single column of the row, so in principle it should be a faster read than several individual columns. In practice, whatever gain you achieve there is quickly erased when the data is serialized for the result set. And although CQL lets you select individual fields of a UDT, Cassandra still has to read the entire contents of that column regardless.
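To make the comparison concrete, here is a minimal CQL sketch of the two layouts from the question. The type and table names (person_details, person_udt, person_flat), the column types, and the choice of name as the partition key with date_of_birth as a clustering column are all assumptions for illustration; the question does not specify them.

```cql
-- Hypothetical names and column types; only the member names come from the question.

-- Layout A: non-key members grouped into a user-defined type.
CREATE TYPE person_details (
    occupation text,
    last_visit timestamp,
    family_members text,      -- assumed type; the question does not say
    total_income decimal
);

CREATE TABLE person_udt (
    name text,
    date_of_birth timestamp,
    details frozen<person_details>,   -- the whole UDT lives in this one column
    PRIMARY KEY (name, date_of_birth)
);

-- Layout B: the same members as individual columns.
CREATE TABLE person_flat (
    name text,
    date_of_birth timestamp,
    occupation text,
    last_visit timestamp,
    family_members text,
    total_income decimal,
    PRIMARY KEY (name, date_of_birth)
);

-- CQL lets you select a single UDT field with dot notation,
-- but the full details column is still read to produce it.
SELECT details.occupation FROM person_udt WHERE name = 'Jane Doe';
```

With a frozen UDT, changing one field means rewriting the whole details value, whereas in the flat layout each column can be updated on its own.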

It is important to note that user-defined types are not about improving performance. They are about giving you the flexibility to achieve small amounts of denormalization.

And just a suggestion, but perhaps it makes more sense to model family_members as a collection, with each item containing the data for one family member?
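One way that suggestion might look in CQL is sketched below. The family_member type, its fields, the table name, and the sample values are hypothetical; the question does not say what data is kept for each family member.

```cql
-- Hypothetical type: one entry per family member (fields are assumptions).
CREATE TYPE family_member (
    member_name text,
    relationship text,
    date_of_birth timestamp
);

CREATE TABLE person_with_family (
    name text,
    date_of_birth timestamp,
    occupation text,
    last_visit timestamp,
    total_income decimal,
    -- Each list element is one frozen UDT value.
    family_members list<frozen<family_member>>,
    PRIMARY KEY (name, date_of_birth)
);

-- Illustrative row; the values are made up.
INSERT INTO person_with_family (name, date_of_birth, occupation, family_members)
VALUES ('Jane Doe', '1985-04-15', 'engineer',
        [{member_name: 'Alex', relationship: 'sibling', date_of_birth: '1988-01-02'}]);
```

This keeps a small, bounded set of related records inside the owning row, which is the kind of modest denormalization UDTs are meant for.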

Aaron answered Sep 30 '22 04:09