
Generating random string of seedable data

Tags: python, random

I'm looking for a way to generate a random string of n bytes in Python, similar to os.urandom(), but with a way to seed the data generation.

So far I have:

import random

def genRandData(size):
    # Appends one character per iteration; this per-byte loop is the slow part.
    buf = ""
    for i in range(size):
        buf = buf + chr(random.randint(0, 255))
    return buf

However, this function is very slow: generating a megabyte of data takes about 1.8 seconds on my machine. Is there any way of improving this (or, alternatively, a way to seed os.urandom)?

Will asked Sep 01 '15 10:09



1 Answer

If you have numpy available, its numpy.random module provides a function you might consider:

numpy.random.bytes(length)

It is very fast:

$ python -mtimeit "import numpy" "numpy.random.bytes(1<<30)"
10 loops, best of 3: 2.19 sec per loop

That is about 2.2 seconds for 1 GiB of data.

You can seed it with numpy.random.seed before calling numpy.random.bytes, so the same seed yields the same bytes.
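As a minimal sketch of seeded generation: the helper name seeded_bytes and the standard-library fallback are my own assumptions, not part of the original suggestion. It uses a private numpy.random.RandomState rather than numpy.random.seed, so the global generator state is left alone. The seed is assumed to be an integer between 0 and 2**32 - 1.

```python
import random

try:
    import numpy as np
except ImportError:  # NumPy is treated as optional in this sketch
    np = None


def seeded_bytes(size, seed):
    """Return `size` pseudo-random bytes determined only by `seed`.

    Hypothetical helper for illustration; `seed` is assumed to be an
    integer in the range 0 .. 2**32 - 1.
    """
    if np is not None:
        # A private RandomState avoids touching the global numpy.random state.
        return np.random.RandomState(seed).bytes(size)
    # Fallback (an assumption, not from the answer): draw size * 8 random
    # bits from the standard library generator and pack them into bytes.
    if size == 0:
        return b""
    return random.Random(seed).getrandbits(size * 8).to_bytes(size, "little")


if __name__ == "__main__":
    first = seeded_bytes(1 << 20, 1234)
    second = seeded_bytes(1 << 20, 1234)
    print(len(first), first == second)
```

Unlike os.urandom, the output of either code path is not suitable for cryptographic use. It is meant for reproducible test data. The NumPy and standard-library paths produce different byte sequences for the same seed, so stick to one of them if the data has to match across machines.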

Dan D. answered Nov 08 '22 03:11