
Why will timeuuid not have any collisions?

Tags:

cassandra

I was reading the Datastax CQL reference:

Collisions that would potentially overwrite data that was not intended to be overwritten cannot occur.

Can someone explain to me why a collision will never occur? Is it impossible or "highly" unlikely?

cool breeze asked Mar 10 '23 18:03



2 Answers

Cassandra's timeuuid is a Version 1 UUID which is based on the time and the MAC address of the machine generating the UUID.

The timestamp has a resolution of 100 ns, so the chance of two UUIDs from the same machine sharing a timestamp is very slim (a nanosecond is a millionth of a millisecond, so every millisecond contains 10,000 such intervals).
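To see the two ingredients for yourself, here is a minimal sketch using Python's standard `uuid` module. This is not Cassandra's own generator, just another Version 1 implementation; the helper name `describe_v1` is made up for the example, and note that Python may fall back to a random node value when it cannot read a MAC address.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 timestamps count 100 ns intervals from this date.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)

def describe_v1(u: uuid.UUID) -> dict:
    """Hypothetical helper: pull the time and node parts out of a v1 UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    ticks = u.time  # number of 100 ns intervals since 1582-10-15
    # 10 ticks make one microsecond; sub-microsecond precision is dropped here.
    timestamp = GREGORIAN_EPOCH + timedelta(microseconds=ticks // 10)
    return {"ticks": ticks, "timestamp": timestamp, "node": f"{u.node:012x}"}

info = describe_v1(uuid.uuid1())
print(info["timestamp"].isoformat())
print(info["node"])  # MAC address as 12 hex digits (or a random stand-in)
```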

Samyel answered Mar 13 '23 07:03



Cassandra's timeuuid is a Version 1 UUID (type 1 UUID), which is based on:

  1. A timestamp consisting of a count of 100-nanosecond intervals since 15 October 1582 (the date of Gregorian reform to the Christian calendar).
  2. A version (which should have a value of 1).
  3. A variant (which should have a value of 2).
  4. A sequence number (the clock sequence), which can be a counter or a pseudo-random number.
  5. A "node", which will be the machine's MAC address (which should make the UUID unique across machines).
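The five parts listed above can be packed into a UUID by hand. The following is a rough sketch with Python's standard `uuid` module, assuming the usual RFC 4122 bit layout (60-bit timestamp, 14-bit sequence number, 48-bit node); `build_v1` and the sample values are invented for illustration and are not how Cassandra does it internally.

```python
import uuid

def build_v1(ticks: int, clock_seq: int, node: int) -> uuid.UUID:
    """Hypothetical helper: assemble a version 1 UUID from its parts."""
    time_low = ticks & 0xFFFFFFFF                             # low 32 bits of the timestamp
    time_mid = (ticks >> 32) & 0xFFFF                         # middle 16 bits
    time_hi_version = ((ticks >> 48) & 0x0FFF) | (1 << 12)    # high 12 bits + version 1
    clock_seq_low = clock_seq & 0xFF                          # low 8 bits of the sequence
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80   # high 6 bits + variant bits "10"
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))

# Sample values, chosen arbitrarily for the example.
u = build_v1(ticks=139_000_000_000_000_000, clock_seq=1234, node=0x001122334455)
print(u.version)    # 1
print(u.clock_seq)  # 1234
print(hex(u.node))  # 0x1122334455
```

Two UUIDs built this way are equal only if the timestamp, the sequence number and the node all match, which is the whole basis of the uniqueness argument.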

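The sequence number is 14 bits wide, so it can take 16,384 distinct values. If it is chosen pseudo-randomly, two generators that share the same node and the same timestamp have a 1 in 16,384 chance of picking the same value and so producing the same UUID.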
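The 16,384 figure is just the number of values a 14-bit field can hold. A quick check in Python (the variable names are mine):

```python
CLOCK_SEQ_BITS = 14  # width of the sequence number in a version 1 UUID

clock_seq_values = 2 ** CLOCK_SEQ_BITS
same_value_probability = 1 / clock_seq_values  # two independent random picks agree

print(clock_seq_values)                  # 16384
print(f"{same_value_probability:.6f}")   # 0.000061
```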

If you generate more than 10,000 UUIDs per millisecond on one machine, they may collide.

1 ms = 10^6 ns

With a nanosecond-level timestamp you could have 10^6 distinct timestamps per millisecond, but the timestamp is a count of 100 ns intervals, so there are at most 10^6 / 100 = 10,000 unique timestamps in one millisecond.

Generating more than that on a single machine (which has the same MAC address throughout) forces timestamps to repeat. Uniqueness then depends on the sequence number alone, so there is a chance of a collision.
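The arithmetic, spelled out as a small Python sketch (the constant and function names are made up for the example):

```python
NS_PER_MS = 10**6   # nanoseconds in one millisecond
TICK_NS = 100       # one timestamp unit is 100 ns

TICKS_PER_MS = NS_PER_MS // TICK_NS  # distinct timestamps available per millisecond

def repeated_timestamps(uuids_per_ms: int) -> int:
    """Minimum number of UUIDs in one millisecond that must reuse a timestamp."""
    return max(0, uuids_per_ms - TICKS_PER_MS)

print(TICKS_PER_MS)                  # 10000
print(repeated_timestamps(10_000))   # 0
print(repeated_timestamps(12_000))   # 2000
```

Anything above 10,000 per millisecond has to share timestamps, and only the sequence number separates those UUIDs.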

If your application generates more than 10,000 UUIDs per millisecond, add another column to form a compound key, which helps to avoid collisions.
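To illustrate why the extra column helps, here is a toy model in Python where a dict stands in for a table and its key for the primary key. The column name `source_id` and the writer names are assumptions for the example, not anything Cassandra prescribes.

```python
import uuid

# Pretend two writers ended up with the very same timeuuid.
shared_id = uuid.uuid1()
writes = [("writer-a", "first row"), ("writer-b", "second row")]

table_single = {}    # key: timeuuid only
table_compound = {}  # key: (timeuuid, source_id)

for source_id, payload in writes:
    table_single[shared_id] = payload                  # second write overwrites the first
    table_compound[(shared_id, source_id)] = payload   # both rows survive

print(len(table_single))    # 1
print(len(table_compound))  # 2
```

With the timeuuid alone as the key, the second write silently replaces the first; with the compound key, a collision in the timeuuid no longer means a collision in the key.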

Samarendra answered Mar 13 '23 09:03
