Get random double (floating point) value from random byte array between 0 and 1 in C#?

Tags: c#, random

Assume I have an array of bytes which are truly random (e.g. captured from an entropy source).

byte[] myTrulyRandomBytes = MyEntropyHardwareEngine.GetBytes(8);

Now, I want to get a random double-precision floating point value, but between 0 and positive 1 (like Random.NextDouble() returns).

Simply passing an array of 8 random bytes into BitConverter.ToDouble() yields strange results: most importantly, the results will rarely land between 0 and 1.
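For concreteness, the naive reinterpretation I mean is just:

double d = BitConverter.ToDouble(myTrulyRandomBytes, 0);
// usually astronomically large or small, occasionally infinity or NaN,
// because the random bits land in the sign, exponent, and significand alike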

I am fine with bit manipulation, but the layout of floating point numbers has always been mysterious to me. I tried applying randomness to many combinations of bits and always found that the numbers were either just over 1, VERY close to 0, or very large.

Can someone explain which bits of a double should be made random in order to make it random within the range 0 to 1?

asked Apr 21 '17 by fdmillion

2 Answers

Though working answers have been given, I'll give another one that looks worse but isn't:

long asLong = BitConverter.ToInt64(myTrulyRandomBytes, 0);
// Clear the sign bit, then scale the non-negative long into [0, 1].
double number = (double)(asLong & long.MaxValue) / long.MaxValue;

The issue with casting from a ulong to a double is that the conversion is not directly supported in hardware, so it compiles to this:

 vxorps      xmm0,xmm0,xmm0 
 vcvtsi2sd   xmm0,xmm0,rcx   ; interpret ulong as long and convert it to double
 test        rcx,rcx         ; add fixup if it was "negative"
 jge         000000000000001D 
 vaddsd      xmm0,xmm0,mmword ptr [00000060h] 
 vdivsd      xmm0,xmm0,mmword ptr [00000068h] 

Whereas with my suggestion it will compile more nicely:

 vxorps      xmm0,xmm0,xmm0 
 vcvtsi2sd   xmm0,xmm0,rcx 
 vdivsd      xmm0,xmm0,mmword ptr [00000060h] 

Both were tested with the x64 JIT in .NET 4, but this applies in general; there just isn't a nice way to convert a ulong to a double.

Don't worry about the bit of entropy being lost: there are only about 2^62 doubles between 0.0 and 1.0 in the first place (roughly 1023 exponent values times 2^52 significands), and most of the smaller doubles cannot be chosen here, so the number of possible results is even lower.

Note that this, as well as the presented ulong examples, can result in exactly 1.0, and the values are distributed with slightly differing gaps between adjacent results, because they don't divide by a power of two. You can change them to exclude 1.0 and get slightly more uniform spacing (but see the first plot below: there is still a bunch of different gap sizes, though they occur in a very regular pattern) like this:

long asLong = BitConverter.ToInt64(myTrulyRandomBytes, 0);
double number = (double)(asLong & long.MaxValue) / ((double)long.MaxValue + 1);

As a really nice bonus, you can now change the division into a multiplication (the reciprocal of a power of two is exactly representable as a double):

long asLong = BitConverter.ToInt64(myTrulyRandomBytes, 0);
// The constant is 2^-63, written out exactly.
double number = (double)(asLong & long.MaxValue) * 1.08420217248550443400745280086994171142578125E-19;

The same idea works for ulong, if you really want to use that.
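A minimal sketch of the ulong variant (the constant is 2^-64 written out exactly):

ulong asUlong = BitConverter.ToUInt64(myTrulyRandomBytes, 0);
// Beware: ulongs near ulong.MaxValue round up to 2^64 when converted
// to double, so this can still produce exactly 1.0, as noted above.
double number = asUlong * 5.42101086242752217003726400434970855712890625E-20;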


Since you also seemed interested specifically in how to do it with bit-level trickery on the double itself, I can show that too.

Because of the whole significand/exponent deal, it can't really be done in a super direct way (just reinterpreting the bits and calling it a day), mainly because choosing the exponent uniformly spells trouble: with a uniform exponent, the numbers are necessarily clumped preferentially near 0, since most exponents put them there.

But if the exponent is fixed, it's easy to make a double that's uniform within that exponent's range. That range cannot be 0 to 1, because it spans many exponents, but it can be 1 to 2, and then we can subtract 1.

So first mask away the bits that won't be part of the significand:

x &= (1L << 52) - 1;

Put in the exponent bits for the 1.0 to 2.0 range (excluding 2):

x |= 0x3ff0000000000000;

Reinterpret and adjust for the offset of 1:

return BitConverter.Int64BitsToDouble(x) - 1;

This should be pretty fast, too. An unfortunate side effect is that this time it really does cost a bit of entropy: only 52 random bits are used where there could have been 53, so this way always leaves the least significant bit zero (the implicit leading bit steals one).
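For convenience, here are those three steps rolled into one helper (a sketch assembled from the lines above; the method name is mine):

static double BitsToUnitDouble(long x)
{
    x &= (1L << 52) - 1;                          // keep 52 random significand bits
    x |= 0x3ff0000000000000;                      // exponent bits for the [1.0, 2.0) range
    return BitConverter.Int64BitsToDouble(x) - 1; // shift down to [0.0, 1.0)
}

double d = BitsToUnitDouble(BitConverter.ToInt64(myTrulyRandomBytes, 0));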


There were some concerns about the distributions, which I will address now.

The approach of choosing a random (u)long and dividing it by the maximum value clearly starts from a uniformly chosen (u)long, and what happens after that is actually interesting. The result can justifiably be called a uniform distribution, but if you look at it as a discrete distribution (which it actually is) it looks (qualitatively) like this (all examples use minifloats):

[plot: discrete distribution of generated floats, minifloat example; top panel uses all input bits, bottom panel limits the resolution]

Ignore the "thicker" lines and wider gaps; that's just the histogram rendering being funny. These plots used division by a power of two, so there is no spacing problem in reality, it's only plotted strangely.

The top is what happens when you use too many bits, as when dividing a complete (u)long by its max value. This gives the lower floats a better resolution, but lots of different (u)longs get mapped onto the same float in the higher regions. That's not necessarily a bad thing: if you "zoom out", the density is the same everywhere.

The bottom is what happens when the resolution is limited to the worst case (the 0.5 to 1.0 region) everywhere, which you can do by limiting the number of bits first and then doing the "scale the integer" deal, as sketched below. My second suggestion, the bit hacks, does not achieve this; it's limited to half that resolution.
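A sketch of that bit-limiting approach (my construction, following the description above): keep only the top 53 bits, which convert to a double exactly, then scale by 2^-53:

long asLong = BitConverter.ToInt64(myTrulyRandomBytes, 0);
// A 53-bit integer is exactly representable as a double, so no rounding
// can push the result up to 1.0, and all gaps are exactly 2^-53.
long top53 = (long)((ulong)asLong >> 11);
double number = top53 * (1.0 / (1L << 53));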

For what it's worth, NextDouble in System.Random scales a non-negative int into the 0.0 .. 1.0 range. The resolution of that is obviously a lot lower than it could be. It also uses an int that cannot be int.MaxValue, and therefore scales by approximately 1/(2^31 - 1) (a value that cannot be represented exactly by a double, so it is slightly rounded), so there are actually 33 slightly different gaps between adjacent possible results, though the majority of the gaps are the same size.
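For reference, the computation inside the .NET Framework Random is essentially this shape (paraphrased, not a verbatim quote of the reference source):

Random rng = new Random();
int sample = rng.Next();                  // non-negative, always below int.MaxValue
double d = sample * (1.0 / int.MaxValue); // ~4.656612875245797E-10 after rounding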

Since int.MaxValue is small compared to what can be brute-forced these days, you can easily generate all possible results of NextDouble and examine them; for example, I ran this:

const double scale = 4.6566128752458E-10; // approximately 1/int.MaxValue
double prev = 0;
Dictionary<long, int> hist = new Dictionary<long, int>();
for (int i = 0; i < int.MaxValue; i++)
{
    // Histogram the gap between adjacent possible results, keyed by its bit pattern.
    long bits = BitConverter.DoubleToInt64Bits(i * scale - prev);
    if (!hist.ContainsKey(bits))
        hist[bits] = 1;
    else
        hist[bits]++;
    prev = i * scale;
    if ((i & 0xFFFFFF) == 0)
        Console.WriteLine("{0:0.00}%", 100.0 * i / int.MaxValue); // progress report
}
answered Sep 22 '22 by harold


This is easier than you think; it's all about scaling (which is also true when going from a 0-1 range to some other range).

Basically, if you know that you have 64 truly random bits (8 bytes) then just do this:

double zeroToOneDouble = (double)(BitConverter.ToUInt64(myTrulyRandomBytes, 0) / (decimal)ulong.MaxValue);

The trouble with this kind of algorithm comes when your "random" bits aren't actually uniformly random. That's when you need a specialized algorithm, such as a Mersenne Twister.
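A minimal end-to-end sketch of that one-liner (RandomNumberGenerator is just a stand-in here for the asker's hardware entropy source):

using System;
using System.Security.Cryptography;

byte[] bytes = new byte[8];
RandomNumberGenerator.Fill(bytes); // stand-in for MyEntropyHardwareEngine.GetBytes(8)
// Divide in decimal (higher precision), then narrow the quotient to double.
double zeroToOneDouble = (double)(BitConverter.ToUInt64(bytes, 0) / (decimal)ulong.MaxValue);
Console.WriteLine(zeroToOneDouble);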

answered Sep 19 '22 by BradleyDotNET