Uniform real distributions near zero

Is there ever any call for a floating-point approximation of a continuous uniform distribution, as opposed to the (apparently more popular) discrete uniform distribution?

To produce an arbitrary-precision random value quantised to a floating-point type, I'd expect something along the lines of:

#include <math.h>      /* ldexp */
#include <stdint.h>    /* uint64_t, UINT64_C */

/* random_bit() and random_52bits() stand for some source of uniform random bits. */
double rand0to1(void)
{
    int exp = -53;
    while (random_bit() == 0) exp--;    /* geometric exponent */
    return ldexp((double)((UINT64_C(1) << 52) | random_52bits()), exp);
}

What appears to be common is:

double rand0to1(void)
{
    return ldexp((double)random_53bits(), -53);    /* multiples of 2^-53 in [0, 1) */
}

Obviously the former, being an approximation of something impossible to achieve, is a big black mark against it; but I wonder whether there are cases where the guarantee that the mantissa is always fully randomised becomes useful when the result happens to be small.

If I were implementing my own general-purpose uniform real random number generator library, what harm might I do by deviating from convention and keeping the mantissa fully randomised for small values?

My best guess is that after subsequent arithmetic, the extra precision might force a rounding condition which would bias the low-order bits. However, my intuition is that this would usually happen for arithmetic on discrete distributions as well.

asked Nov 01 '22 by sh1

1 Answer

The main difference is that your first definition—which is not quite the right thing, but close—is supported on 𝔽 ∩ [0,1), whereas your second definition is supported only on 𝔽 ∩ ({0} ∪ [2⁻⁵³, 1)).

Your first definition will return zero with probability around 2⁻¹⁰⁷⁵, which is the correct Lebesgue measure of the set of reals in [0,1) that round to zero.

Your second definition, in contrast, omits all floating-point numbers in (0, 2⁻⁵³), and returns 0 with probability 2⁻⁵³.

Why might this matter?

Suppose you want to take the logarithm of the result (for example, in an exponential or Laplace sampler), or compute any other function with an essential singularity at zero.

  • This is safe with your first definition, with no rejection sampling: a probability of 2⁻¹⁰⁷⁵ is so minute that even cryptographers consider it negligible. You are guaranteed never to trip over division by zero or handle infinities unless your random bit generator is severely broken.

  • But while you're not likely to trip over division by zero and yield −∞ during testing with the second definition, a probability of 2⁻⁵³ is not negligible—the Bitcoin network trips over an event with probability 2⁻⁵³ many times a second in its insatiable quest to burn energy for useless random math puzzle solutions. To use the second definition safely you must do rejection sampling on the outputs to avoid zero (see the sketch just below this list), even though the probability of zero in a true uniform distribution on [0,1) rounded to floating-point numbers is beyond negligible.
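For concreteness, here is a minimal sketch of such a rejection loop, reusing the question's hypothetical random_53bits() primitive (declared here with an assumed uint64_t return type); the wrapper's name is only illustrative:

#include <math.h>
#include <stdint.h>

/* The question's hypothetical source of 53 uniform random bits. */
extern uint64_t random_53bits(void);

/* With the second definition, the 2^-53 chance of zero has to be
 * rejected before taking a logarithm. */
double log_of_uniform(void)
{
    double u;
    do {
        u = ldexp((double)random_53bits(), -53);
    } while (u == 0.0);               /* retried roughly once per 2^53 calls */
    return log(u);
}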

Similarly, the true uniform distribution on [0,1) rounded to a floating-point number can also yield 1. By omitting 1 from the support, you exclude a small, but not negligible, fraction of [0,1), and effectively sample at best from [0,1 − 2⁻⁵⁴) instead of [0,1).

But there's seldom any reason to omit 1 anyway; for example, if you were going to work with log1p(−𝑈) where 𝑈 ∼ [0,1) is uniform, you can get exactly the same distribution by working with log(𝑈) where 𝑈 ∼ (0,1], which makes much more effective use of the floating-point space.

Not only does it make more effective use of the floating-point space, but it may make the difference between broken and secure differential privacy (although you may also need a correctly rounded logarithm, not just any old libm).

(What if I want an integer sampler on [0,𝑛), you ask? You already have a uniform bit sampler; given that, you are better off just doing rejection sampling on ⌈lg 𝑛⌉-bit strings than taking a circuitous detour through floating-point.)
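A minimal sketch of that integer path, assuming a hypothetical random_bits(k) primitive that returns k uniform random bits (the function name and bounds are mine, not part of the answer):

#include <stdint.h>

/* Hypothetical primitive: k uniformly random bits, 1 <= k <= 63. */
extern uint64_t random_bits(unsigned k);

/* Uniform integer on [0, n) by rejection sampling on ceil(lg n)-bit strings.
 * Assumes 2 <= n <= 2^63; each draw is accepted with probability > 1/2. */
uint64_t rand_below(uint64_t n)
{
    unsigned k = 0;
    while ((UINT64_C(1) << k) < n)    /* k = ceil(lg n) */
        k++;

    uint64_t r;
    do {
        r = random_bits(k);
    } while (r >= n);                 /* expected fewer than 2 draws */
    return r;
}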

So what's the right thing? To write a [0,1] sampler (or (0,1] sampler, by calling the event of 0 in [0,1] with probability 2⁻¹⁰⁷⁵ a won't-happen bug), draw the exponent with geometric distribution as you did, and then draw the significand with uniform distribution on more than 53 bits—and unconditionally set the least significant bit.

The least significant bit serves as a kind of sticky bit: in the true uniform distribution on real numbers, the subset with a finite 53-bit binary expansion followed by all 0 bits has measure zero, so there is “almost always” a 1 bit, which this “sticky bit” represents to break ties. This gives the correct weight to every floating-point number in [0,1].
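Here is a minimal sketch of that construction in C, assuming hypothetical random_bit() and random_64bits() primitives in the spirit of the question's; the function name and the loop cutoff are mine, not part of the answer:

#include <math.h>
#include <stdint.h>

/* Hypothetical primitives: uniform random bits. */
extern int random_bit(void);
extern uint64_t random_64bits(void);

/* [0,1] sampler as described above: geometric exponent, a 64-bit (i.e. more
 * than 53-bit) uniform significand, and the least significant bit forced to 1
 * as the sticky bit.  The conversion to double rounds to nearest, so each
 * double in [0,1] gets essentially the measure of the reals that round to it;
 * in particular 1.0 can occur, and 0.0 is the won't-happen event above. */
double uniform01(void)
{
    int exp = -64;                    /* 64-bit significand scaled into [1/2, 1) */

    /* Geometric exponent, as in the question's first definition.  Reaching the
     * cutoff takes over a thousand consecutive zero bits, and anything that far
     * down would round to zero anyway. */
    while (random_bit() == 0) {
        if (--exp < -1200)
            return 0.0;
    }

    /* Leading 1, then 62 fresh random bits, then the sticky bit. */
    uint64_t significand = (UINT64_C(1) << 63) | random_64bits() | 1;

    return ldexp((double)significand, exp);
}

With such a sampler in hand, expressions like log(uniform01()) need no rejection loop, which is the payoff described earlier in the answer.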

answered Nov 10 '22 by Floating Pundit