
Generate large number of unique random float32 numbers

I need to generate a binary file containing only unique random single-precision numbers. The purpose is to calculate the entropy of this file and divide the entropy of other datasets by it, giving the ratio entropy_file/entropy_randUnique. I call this value "randomness".

I can do this in Python with double-precision numbers by inserting them into a set() and writing them out with struct.pack, like so:

    numbers = set()
    while len(numbers) < size:
        numbers.add(struct.pack(precision,random.random()))
    for num in numbers:
        file.write(num)

But when I switch to single precision I can't just change the pack format: that produces a lot of identical numbers, so the while loop never ends, and I can't generate single-precision numbers with random. I've looked into numpy, but from what I understood its generator works the same way. How can I get 370914252 (my biggest test case) unique float32 values into a binary file? They don't even have to be random; I think a shuffled sequence would suffice.

SamGamgee asked Nov 20 '13 16:11





1 Answer

Your best bet is to generate random 32-bit integers and then convert them to floating point. You'll need to reject the bit representations of infinity and NaN as you generate the numbers.
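As a minimal sketch of that conversion: reinterpreting the 32 bits of an integer as a single-precision value can be done with struct, and the infinity and NaN patterns are exactly those whose 8 exponent bits are all ones. The helper names `bits_to_float32` and `is_finite_pattern` are made up for illustration, and the explicit `<` byte order is an assumption chosen so that packing and unpacking agree on every platform.

```python
import math
import struct

def bits_to_float32(bits):
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 single-precision value."""
    return struct.unpack('<f', struct.pack('<I', bits))[0]

def is_finite_pattern(bits):
    """Return True unless the exponent field (bits 23..30) is all ones,
    which is how both infinity and NaN are encoded."""
    return ((bits >> 23) & 0xFF) != 0xFF

# A few hand-picked patterns: 1.0, +infinity, and a quiet NaN.
for pattern in (0x3F800000, 0x7F800000, 0x7FC00000):
    print(hex(pattern), bits_to_float32(pattern), is_finite_pattern(pattern))
```

Checking the exponent bits directly gives the same result as calling math.isinf and math.isnan on the converted value, but avoids the round trip through struct for patterns that will be rejected anyway.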

You can generate your set from the integer values rather than the floating point ones, then do the conversion on output. Rather than using a set, you could use a bit map to detect which integer values have already been used; that's more likely to fit in memory, especially given the largest sample size you indicate.

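    import math
    import random
    import struct

    def random_unique_floats(n):
        # One bit per 32-bit pattern: 2**32 / 8 bytes = 512 MiB, all zero.
        used = bytearray(2**32 // 8)
        count = 0
        while count < n:
            bits = random.getrandbits(32)
            # Reinterpret the integer's bits as a single-precision float.
            value = struct.unpack('f', struct.pack('I', bits))[0]
            if not math.isinf(value) and not math.isnan(value):
                index = bits // 8              # byte holding this pattern's bit
                mask = 0x01 << (bits & 0x07)   # bit position within that byte
                if used[index] & mask == 0:
                    yield value
                    used[index] |= mask
                    count += 1

    # `size` and `file` are the sample count and the open binary file from the question.
    for num in random_unique_floats(size):
        file.write(struct.pack('f', num))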

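Note that as your number of samples approaches the number of possible floating-point values, the run time rises steeply (though not literally exponentially), because most draws land on values that have already been used. At the size in the question this is not a concern: 370914252 is under 9% of the 4278190080 finite float32 bit patterns, so collisions stay rare.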
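To put a rough number on that slowdown, the expected count of random draws can be estimated with the usual coupon-collector style approximation: drawing from all 2**32 patterns, the k-th new value takes on average 2**32 / (FINITE - k) draws, and summing that over n values is approximately 2**32 * ln(FINITE / (FINITE - n)). This is a back-of-the-envelope sketch, not a measurement; the function name `expected_draws` is made up for illustration.

```python
import math

TOTAL = 2**32             # every 32-bit pattern the generator can produce
FINITE = TOTAL - 2**24    # patterns that are neither infinity nor NaN

def expected_draws(n):
    """Approximate expected number of 32-bit draws needed to collect
    n distinct finite float32 values by rejection."""
    if not 0 <= n < FINITE:
        raise ValueError("n must satisfy 0 <= n < FINITE")
    return TOTAL * math.log(FINITE / (FINITE - n))

# Draws per accepted value for the question's size and for fuller fractions.
for n in (370914252, FINITE // 2, FINITE * 99 // 100):
    print(n, round(expected_draws(n) / n, 2))
```

For the question's largest test case the estimate comes out at only a few percent more draws than values, while the overhead per value grows quickly once most of the finite patterns have been used.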

Mark Ransom answered Sep 21 '22 15:09
