I'm working on a distributed database and trying to generate a unique ID that will serve as the primary key of a column family in Cassandra. I read some articles about doing this in Java with <code>UUID</code>, but it seems there is a probability of collision (even if it's very low). Is there a way to generate a unique ID, perhaps based on time?

Use Cassandra's <code>now()</code> function to generate a timeuuid value, and the <code>uuid()</code> function to generate a random (Type 4) uuid value.

Cassandra: Generate a unique ID?

4 Answers

You can use the TimeUUID type in Cassandra, which stores a Type 1 UUID. A Type 1 UUID combines the current time, the creator's MAC address, and a sequence number. If the TimeUUID is generated correctly, this can be done with zero collisions (you can use the CQL now() function or insert your own value; the Java SDKs provide some thread-safe implementations). The main advantage of TimeUUIDs is that the IDs can be time ordered. See http://wiki.apache.org/cassandra/TimeBaseUUIDNotes for more info.
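To make the Type 1 layout concrete, here is a minimal Java sketch that assembles a time-based UUID by hand. The class name `TimeUuidSketch` and its methods are hypothetical, and it assumes a random 48-bit node in place of a real MAC address; it is an illustration of the bit layout, not the generator Cassandra or any driver actually uses.

```java
import java.security.SecureRandom;
import java.util.UUID;

/** Hypothetical helper: builds version 1 (time-based) UUIDs by hand. */
final class TimeUuidSketch {
    // Number of 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch.
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    private static final SecureRandom RANDOM = new SecureRandom();
    // Assumption: a random 48-bit node with the multicast bit set stands in for a MAC address.
    private static final long NODE = (RANDOM.nextLong() & 0xFFFFFFFFFFFFL) | 0x010000000000L;
    // 14-bit clock sequence, chosen once per JVM.
    private static final int CLOCK_SEQ = RANDOM.nextInt(1 << 14);

    private static long lastTicks = 0L;

    private TimeUuidSketch() {}

    /** Returns a new version 1 UUID; the tick value never repeats within this JVM. */
    static synchronized UUID next() {
        long ticks = System.currentTimeMillis() * 10_000L + UUID_EPOCH_OFFSET;
        if (ticks <= lastTicks) {
            ticks = lastTicks + 1; // same millisecond or clock stepped back: bump by one tick
        }
        lastTicks = ticks;
        return fromParts(ticks, CLOCK_SEQ, NODE);
    }

    /** Packs a 60-bit timestamp, 14-bit clock sequence, and 48-bit node into a UUID. */
    static UUID fromParts(long ticks, int clockSeq, long node) {
        long timeLow = ticks & 0xFFFFFFFFL;
        long timeMid = (ticks >>> 32) & 0xFFFFL;
        long timeHi = (ticks >>> 48) & 0x0FFFL;
        long msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi; // 0x1000 = version 1
        long lsb = 0x8000000000000000L                                  // RFC 4122 variant bits
                | ((long) (clockSeq & 0x3FFF) << 48)
                | (node & 0xFFFFFFFFFFFFL);
        return new UUID(msb, lsb);
    }

    /** Recovers the creation time in Unix milliseconds from a version 1 UUID. */
    static long unixMillis(UUID timeUuid) {
        return (timeUuid.timestamp() - UUID_EPOCH_OFFSET) / 10_000L;
    }
}
```

Calling `TimeUuidSketch.next()` repeatedly in one JVM yields UUIDs whose `timestamp()` values strictly increase. Note that `java.util.UUID.compareTo` does not sort version 1 UUIDs by time, so compare on `timestamp()` if you need time order on the client side.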

However, the time ordering is unlikely to be useful for row primary keys: with a hash partitioner, rows are not stored in key order, so the ordering only helps if the TimeUUID is used as a clustering key. The complexity of generating a unique ID can also be a source of bugs if you roll your own.

Cassandra also supports Type 4 UUIDs through the UUID type. These are essentially just random bits (122 of the 128). There is a collision probability, but, assuming uncorrelated random number sources (which they will be if you generate the IDs in Java), it is extremely low: if you created 1 billion a second for roughly 85 years, the probability of at least one collision would be about 50%. (See http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates for more details.)
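The collision estimate can be checked with the standard birthday approximation, p ≈ 1 − exp(−n² / 2N), where N = 2^122 is the number of distinct random UUIDs. The following Java sketch is only a back-of-the-envelope calculation; the class and method names are made up for illustration.

```java
/** Hypothetical helper: birthday-bound estimates for random (Type 4) UUIDs. */
final class UuidCollisionOdds {
    // A Type 4 UUID has 122 random bits; the other 6 are fixed version/variant bits.
    static final double SPACE = Math.pow(2, 122);

    private UuidCollisionOdds() {}

    /** Approximate probability of at least one collision among n random UUIDs. */
    static double collisionProbability(double n) {
        // 1 - exp(-x), computed with expm1 to stay accurate for tiny x
        return -Math.expm1(-(n * n) / (2.0 * SPACE));
    }

    /** Approximate number of random UUIDs needed to reach collision probability p. */
    static double idsForProbability(double p) {
        return Math.sqrt(2.0 * SPACE * -Math.log1p(-p));
    }
}
```

For example, `UuidCollisionOdds.idsForProbability(0.5) / 1e9 / (365.25 * 24 * 3600)` gives the number of years needed to reach a 50% collision chance at one billion IDs per second, which comes out somewhat under a century.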


answered Oct 19 '22 15:10

Richard

You should investigate using Twitter Snowflake. From the project readme:

As we at Twitter move away from Mysql towards Cassandra, we've needed a new way to generate id numbers. There is no sequential id generation facility in Cassandra, nor should there be.

Snowflake uses an intuitive algorithm that generates longs which are both time-ordered and unique. Since your database is distributed, this service should suit your needs well.
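As a rough illustration of the idea, here is a minimal single-process Java sketch of a Snowflake-style generator. It is not Twitter's code: the class name, the merged 10-bit worker ID, and the 41/10/12 bit split are assumptions made for this example, and how each node obtains a distinct worker ID is left out.

```java
/** Hypothetical Snowflake-style generator: 41-bit time, 10-bit worker, 12-bit sequence. */
final class SnowflakeSketch {
    // Assumption: custom epoch in Unix milliseconds (the value Twitter's Snowflake is known for).
    private static final long EPOCH = 1288834974657L;
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastMillis = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId must be in [0, " + MAX_WORKER + "]");
        }
        this.workerId = workerId;
    }

    /** Returns a long that is unique per worker and increases over time. */
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastMillis) {
            // Refuse to issue IDs rather than risk duplicates after a clock step backwards.
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: busy-wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastMillis) {
                    // spin
                }
            }
        } else {
            sequence = 0L;
        }
        lastMillis = now;
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

Each node would create one `SnowflakeSketch` with its own worker ID and call `nextId()` for every new row. Uniqueness across the cluster then depends entirely on no two nodes sharing a worker ID.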

answered Oct 19 '22 15:10


Tags: uuid, cassandra, cql, cql3

user2090879
