Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to implement a random float function so that it does not lose entropy? (PHP)

Tags:

php

random

math

I am trying to generate random floats using nothing but bytes I get from /dev/urandom. Currently my best idea is to get the platform precision doing something like:

$maximumPrecision = strlen('' . 1/3) - 2;

and then construct a string of 0-9 in a loop the number of times $maximumPrecision tells us. For examle, if the precision is 12, I would generate 12 random numbers and concatenate them. I think it's an ugly idea.

Update: Does this make sense?

$bytes =getRandomBytes(7); // Just a function that returns random bytes.
$bytes[6] = $bytes[6] & chr(15); // Get rid off the other half
$bytes .= chr(0); // Add a null byte

$parts = unpack('V2', $bytes);

$number = $parts[1] + pow(2.0, 32) * $parts[2];
$number /= pow(2.0, 52);
like image 776
Tower Avatar asked Sep 11 '10 15:09

Tower


2 Answers

PHP's float type typically is implemented as IEEE double. This format has 52 bit precision of mantissa, so in principle it should be able to generate 252 different uniform numbers in [0, 1).

Therefore, you could extract 52 bits from the /dev/urandom, interpret as an integer, and divide by 252. For example:

// assume we have 52 bits of data, little endian.
$bytes = "\x12\x34\x56\x78\x9a\xbc\x0d\x00";
//                                  ^   ^^ 12 bits of padding.

$parts = unpack('V2', $bytes);

$theNumber = $parts[1] + pow(2.0, 32) * $parts[2];  // <-- this is precise.
$theNumber /= pow(2.0, 52);                         // <-- this is precise.
like image 172
kennytm Avatar answered Oct 16 '22 20:10

kennytm


The problem here is that the IEEE double precision number is defined in terms of exponents of base 2:

n = 2^exponent * 1.mantissa

Since you want an exponent -1, and there's no integer number n such that 2^n = 0.1, it gets complicated.

This generates a number between 1 and 2. You can subtract 1, but you'll lose a tiny amount of entropy if you do that (KennyTM's answer yields a number in that range, though, and uses all entropy -- this answer attempts to create the representation directly):

$source = fopen("/dev/urandom", "rb");

//build big-endian double
//take only 32 bits because of 32-bit platforms
$byte_batch_1 = fread($source, 4); //32-bit
$byte_batch_2 = fread($source, 4); //32-bit, we only need 20

$offset = (1 << 10) -1;

$first_word = unpack("N", $byte_batch_2);
$first_word = reset($first_word);
$first_word &= 0xFFFFF; //leave only 20 lower bits
$first_word |= $offset << 20;

$str = pack("N", $first_word) . $byte_batch_1;

//convert to little endian if necessary
if (pack('s', 1) == "\x01\x00") { //little-endian
    $str = strrev($str);
}

$float = unpack("d", $str);
$float = reset($float);
like image 27
Artefacto Avatar answered Oct 16 '22 22:10

Artefacto