I'm working on a system that generates around 2 billion unique UUIDs a day. The UUIDs are generated using JavaScript \ Flash (AS3) on the client.
We recently noticed that our UUIDs are no where near unique. We have around 20% (!) of daily duplicates, most of which (relative to traffic volume) are coming from chrome.
I did some reading and learned that the pseudo-random generation (PRNG) algorithm implementation on most browsers, and specifically chrome, is flawed. Chromium and Node.js use the V8 javaScript engine, which implements an algorithm called MWC1616.
In theory, UUIDs generated using a good PRNG should have a 2132 probability for collision, but with MWC1616, under some very realistic scenarios, this probability is around 1:30000.
To solve the problem, I considered the following options:
Since I prefer to keep things on the client and I don't want to re-invent the wheel and modify the UUID creation logic, I want to stick with option 3.
The good news are that on newer browsers, there is the getRandomValues api. Unfortunately, I need to support older browsers.
So my questions are:
crypto.getRandomValues()
(that doesn't use Math.random internally)?
Does AS3 Math.random() use the browser's Math.random()? Does it implement the same algorithm itself?
Does flash.crypto.generateRandomBytes() use Math.random()? Does it use crypto.getRandomValues()? If not, which algorithm does it implement and will it be a good solution for the same issue in AS3? If not, which AS3 crypto library would you recommend?
P.S. I highly recommend the articles I mentioned -1- -2- -3-. I was aware of the issues with Math.random() for many years, but this article truly made it clear to me for good.
After spending more than a week researching this - my conclusion is: NEVER GENERATE UUIDs on the client. Just don't. Especially if you intend to scale.
For years, I knew that the browser's Math.random implementation was poor, but I didn't understand how bad was it, until we reached the scale of billions of events per day.
I decided to go with the easiest technical solution and moved UUID generation to the server. The percentage of duplicate IDs when down from ~25% a day to ~0.0008%.
P.S. Our server is implemented in Go. Node.js uses JavaScript V8 engine and might have the same issues. Though it seems like if you're using the latest Node.js, you should be ok.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With