I would like to use a cryptographically secure primary key for sensitive data in a database - this cannot be guessable/predictable and it cannot be generated by the database (I need the key before the object is persisted).
I understand Java uses a type 4 UUID with a cryptographically secure random number generator. However, I know the UUID isn't completely random, so my question is: how safe is it to assume that UUIDs cannot be predicted from a set of existing ones?
As the name implies, UUIDs should be unique for practical purposes and ideally hard to guess. In certain scenarios, however, some of which are discussed later in this post, an attacker in possession of UUIDs previously generated by a system might be able to predict future ones.
UUIDs are supposed to be non-sequential, so that nobody can predict the next value. If you need a sequence, then a UUID is not the right choice.
Well, the source code shows that UUID.randomUUID uses SecureRandom. As you can see, you can use either, but in a secure UUID you have 6 non-random bits, which can be considered a disadvantage if you are picky.
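To illustrate the "either" in that answer, here is a minimal sketch of the alternative: drawing all 128 bits straight from SecureRandom instead of going through UUID.randomUUID. The class name RandomKey and the helper newKey are made up for this example, and the hex encoding is just one possible choice for turning the bytes into a key string.

```java
import java.security.SecureRandom;

public class RandomKey {
    // One shared generator; SecureRandom is safe to use from multiple threads.
    private static final SecureRandom RNG = new SecureRandom();

    // Hypothetical helper: returns 16 random bytes (128 bits) as 32 lowercase hex characters.
    static String newKey() {
        byte[] bytes = new byte[16];
        RNG.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(32);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(newKey());
    }
}
```

Unlike a type 4 UUID, no bits are overwritten here, so the key keeps all 128 random bits; the trade-off is that the value is no longer a valid UUID and cannot be stored in a UUID column type.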
Java on Linux can read from /dev/random, which, unlike /dev/urandom, may block and be rather slow if there is not much activity (e.g. user input or network traffic) on that system. The claim was 350 ms to generate a single UUID. Our default systems are MacBook Pros running Mountain Lion. Production systems run the latest version of CentOS.
Well, if you want to know how random a UUID is, you have to look at the source.
The following code section is taken from OpenJDK7 (and it is identical in OpenJDK6):
public static UUID randomUUID() {
    SecureRandom ng = numberGenerator;
    if (ng == null) {
        numberGenerator = ng = new SecureRandom();
    }
    byte[] randomBytes = new byte[16];
    ng.nextBytes(randomBytes);
    randomBytes[6] &= 0x0f; /* clear version */
    randomBytes[6] |= 0x40; /* set to version 4 */
    randomBytes[8] &= 0x3f; /* clear variant */
    randomBytes[8] |= 0x80; /* set to IETF variant */
    return new UUID(randomBytes);
}
As you can see, only 2 of the 16 bytes are not completely random. In the byte at index 6 you lose 4 of 8 bits to the version, and in the byte at index 8 you lose 2 bits to the variant.
Therefore you get a 128-bit value with 122 bits of randomness.
The only problem that may arise from this manipulation is that, with high probability, your data can be identified as a UUID. Therefore, if you want to hide it among other random data, this will not work.
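The fixed bits are easy to observe from the outside, which is what makes a random UUID recognizable. The following is a small sketch using only the public java.util.UUID accessors; the class name UuidBits and its two helper methods are invented for this illustration.

```java
import java.util.UUID;

public class UuidBits {
    // True if the UUID carries the fixed bits set by randomUUID:
    // version 4 and the IETF variant (reported as 2 by UUID.variant()).
    static boolean hasFixedV4Bits(UUID id) {
        return id.version() == 4 && id.variant() == 2;
    }

    // Bits left for randomness: 128 total minus 4 version bits and 2 variant bits.
    static int randomBitCount() {
        return 128 - 4 - 2;
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        System.out.println(id + " version=" + id.version() + " variant=" + id.variant());
        System.out.println("random bits: " + randomBitCount());
    }
}
```

In the textual form, the version shows up as the character '4' at the start of the third group, and the first character of the fourth group is always one of 8, 9, a or b, so anyone looking at the value can tell it is most likely a type 4 UUID.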