
How to generate a repeatable random number sequence?

I would like a function that generates a pseudo-random sequence of values, with that sequence being repeatable on every run. The data has to be reasonably well distributed over a given range; it doesn't have to be perfect.

I want to write some code which will have performance tests run on it, based on random data. I would like that data to be the same for every test run, on every machine, but I don't want to have to ship the random data with the tests for storage reasons (it might end up being many megabytes).

The library documentation for the random module doesn't appear to say that the same seed will always give the same sequence on any machine.

EDIT: If you're going to suggest I seed the data (as I said above), please provide the documentation that says the approach is valid and will work on a range of machines/implementations.

EDIT: CPython 2.7.1 and PyPy 1.7 on Mac OS X, and CPython 2.7.1 and CPython 2.5.2 on Ubuntu, appear to give the same results. Still, I can find no docs that stipulate this in black and white.

Any ideas?

Joe asked Jan 26 '12 18:01



2 Answers

For this purpose, I've used a repeating MD5 hash. A hash function is deterministic by specification, so the same input always produces the same output on every platform.

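import hashlib  # replaces the old md5 module, which is gone in Python 3

def repeatable_random(seed):
    # seed must be a byte string; each digest is fed back in as the next input
    digest = seed
    while True:
        digest = hashlib.md5(digest).digest()
        # bytearray yields integers 0-255 on both Python 2.6+ and Python 3
        for byte in bytearray(digest):
            yield byte

def test():
    # zip stops after 100 values because range(100) is exhausted first
    for i, v in zip(range(100), repeatable_random(b"SEED_GOES_HERE")):
        print(v)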

Output:

184 207 76 134 103 171 90 41 12 142 167 107 84 89 149 131 142 43 241 211 224 157 47 59 34 233 41 219 73 37 251 194 15 253 75 145 96 80 39 179 249 202 159 83 209 225 250 7 69 218 6 118 30 4 223 205 91 10 122 203 150 202 99 38 192 105 76 100 117 19 25 131 17 60 251 77 246 242 80 163 13 138 36 213 200 135 216 173 92 32 9 122 53 250 80 128 6 139 49 94

Essentially, the code takes your seed (any byte string) and repeatedly hashes it, feeding each digest back in as the next input. Each 16-byte MD5 digest yields 16 integers in the range 0 to 255.
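
The generator above yields single bytes, while the question asks for values spread over a given range. Here is a minimal sketch of one way to bridge that, assuming Python 3. The names `byte_stream`, `uniform_floats` and `rand_ints` are hypothetical helpers, not part of any library, and the scaling step carries a very small bias that is ignored here.

```python
import hashlib
from itertools import islice


def byte_stream(seed):
    """Yield integers 0-255 by repeatedly hashing the previous digest (seed is bytes)."""
    digest = seed
    while True:
        digest = hashlib.md5(digest).digest()
        for byte in digest:
            yield byte


def uniform_floats(seed):
    """Yield floats in [0.0, 1.0), each built from 4 consecutive stream bytes."""
    stream = byte_stream(seed)
    while True:
        chunk = bytes(islice(stream, 4))
        # 32-bit unsigned integer divided by 2**32 is always below 1.0
        yield int.from_bytes(chunk, "big") / 2.0 ** 32


def rand_ints(seed, low, high):
    """Yield integers in the inclusive range [low, high]."""
    span = high - low + 1
    for x in uniform_floats(seed):
        yield low + int(x * span)


if __name__ == "__main__":
    # Ten repeatable dice rolls; the same seed gives the same rolls every run.
    print(list(islice(rand_ints(b"SEED_GOES_HERE", 1, 6), 10)))
```

Because every step is plain hashing and integer arithmetic, the sequence depends only on the seed bytes, not on the platform's own random number generator.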

DrRobotNinja answered Sep 16 '22 14:09


There are platform differences, so if you move your code between different platforms I would go for the method that DrRobotNinja described.

Please take a look at the following example. Python on my desktop machine (64-bit Ubuntu with a Core i7, Python 2.7.3) gives me the following:

> import random
> r = random.Random()
> r.seed("test")
> r.randint(1,100)
18

But if I run the same code on my Raspberry Pi (Raspbian on ARM11), I get a different result (for the same version of Python):

> import random
> r = random.Random()
> r.seed("test")
> r.randint(1,100)
34
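
One likely explanation for the mismatch above: in Python 2, a string seed is reduced with the built-in hash(), and string hashes differ between 32-bit and 64-bit builds. Here is a minimal sketch of a workaround, assuming that is the cause: derive an integer seed from the string with a fixed hash function and pass that to random.Random. The names `stable_seed`, `make_rng` and `fingerprint` are hypothetical helpers. This only removes the seeding difference; it does not by itself guarantee identical output across Python versions.

```python
import hashlib
import random


def stable_seed(text):
    """Turn a string into an integer seed via SHA-256, independent of hash()."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return int(digest, 16)


def make_rng(text):
    """Return an independent generator seeded with the derived integer."""
    return random.Random(stable_seed(text))


def fingerprint(rng, count=1000):
    """Hash the first `count` draws so two machines can compare one short string."""
    draws = ",".join(str(rng.randint(1, 100)) for _ in range(count))
    return hashlib.sha256(draws.encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Run this on each machine and compare the printed value.
    print(fingerprint(make_rng("test")))
```

If the fingerprints match on every machine you care about, the seeded generator is behaving consistently there; if they differ, the hash-based generator is the safer choice.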

Joppe answered Sep 19 '22 14:09