Two simple questions:

- Is a UUID a good choice as a partition key? Will this distribute data evenly among all nodes in the cluster?
- Is a (unique) integer a good choice?

Will any of these options create "hot" partitions? Thanks!

Is UUID or Integer a good choice as partition key?

1 Answer

A UUID is a good choice for a partition key: random UUIDs hash to tokens that are well distributed across the nodes in the cluster. A "unique" integer is trickier: some node needs to be the authority that generates the number, and that is hard to do in a distributed environment.
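A rough way to check the distribution claim is to simulate it. The Python sketch below is not Cassandra itself: the node count, the `owner_node` and `distribution` helpers, and the use of SHA-256 modulo the node count as a stand-in for the partitioner's token ring are all assumptions made for illustration.

```python
import hashlib
import uuid
from collections import Counter

NODES = 6  # hypothetical cluster size


def owner_node(partition_key: str, nodes: int = NODES) -> int:
    """Map a partition key to a node index by hashing it.

    Simplification: a real cluster hashes the key to a token and looks the
    token up in a ring of token ranges; hash-modulo is only a stand-in.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % nodes


def distribution(keys) -> Counter:
    """Count how many partition keys land on each node."""
    return Counter(owner_node(k) for k in keys)


# Random UUID keys versus sequential integer keys.
uuid_counts = distribution(str(uuid.uuid4()) for _ in range(60_000))
int_counts = distribution(str(i) for i in range(60_000))

for node in range(NODES):
    print(node, uuid_counts[node], int_counts[node])
```

In this simulation both columns come out close to 10,000 keys per node. That is consistent with the point above: the difficulty with integer keys is generating them uniquely across nodes, not spreading them once they are hashed.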

Whether you get hot partitions depends on your data model. If the primary key has other components besides the partition key (clustering columns), then yes, you may have this problem. For example, you generate a random UUID for a sensor and then start writing a lot of data into that single partition.
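The sensor example can be sketched as a small simulation. Everything here is hypothetical: the sensor count, the write volumes, and the `day` bucket are invented for illustration. The code counts rows per partition when the partition key is the sensor UUID alone, then again when a day bucket is added to the partition key (one common mitigation, which the answer itself does not mention).

```python
import uuid
from collections import Counter

# Hypothetical workload: 99 quiet sensors and one very chatty one.
sensors = [uuid.uuid4() for _ in range(100)]
busy = sensors[0]

writes = []  # one (sensor_id, day) pair per row written
for day in range(30):
    for sensor in sensors:
        rows = 10_000 if sensor == busy else 10
        writes.extend((sensor, day) for _ in range(rows))

# Partition key = sensor_id: all rows of a sensor share one partition.
by_sensor = Counter(sensor for sensor, _ in writes)

# Partition key = (sensor_id, day): the same rows split into daily partitions.
by_sensor_day = Counter(writes)

print(max(by_sensor.values()))      # rows in the largest partition: 300000
print(max(by_sensor_day.values()))  # rows in the largest partition: 10000
```

The UUID key spreads the 100 sensors evenly, but it cannot help with skew inside one key: the busy sensor's partition holds 300,000 rows while the others hold 300 each. Adding the bucket caps how large any one partition grows, although the busy sensor's current bucket still receives all of that sensor's writes.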

answered Sep 21 '22 21:09

Alex Ott

Tags:

data-modeling

cassandra

cql

Alex Tbk
