
Generate very very large random numbers

How would you generate a very very large random number? I am thinking on the order of 2^(10^9) (one billion bits). Any programming language -- I assume the solution would translate to other languages.

I would like a uniform distribution on [1,N].

My initial thoughts:

  • You could randomly generate each digit and concatenate (see the sketch after this list). Problem: even very good pseudorandom generators are likely to develop patterns over millions of digits, right?

  • You could perhaps help create large random numbers by raising random numbers to random exponents. Problem: you must make the math work so that the resulting number is still random, and you should be able to compute it in a reasonable amount of time (say, an hour).

  • If it helps, you could try to generate a possibly non-uniform distribution on a possibly smaller range (using the real numbers, for instance) and transform. Problem: this might be equally difficult.

Any ideas?

asked Mar 27 '11 by usul



2 Answers

Generate ceil(log2(N)) random bits to get a number M; M may be up to twice as large as N. Repeat until M is in the range [1, N].
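
A minimal sketch of that rejection-sampling idea in Python, using `random.getrandbits` (a Mersenne Twister source; swap in a cryptographically secure generator for serious use, as discussed below):

```python
import random

def uniform_1_to_n(n: int) -> int:
    """Draw a uniformly distributed integer in [1, n] by rejection sampling."""
    k = n.bit_length()                 # 2^(k-1) <= n, so 2^k <= 2n
    while True:
        m = random.getrandbits(k)      # uniform on [0, 2^k - 1]
        if 1 <= m <= n:
            return m                   # accepted; rejection rate is at most ~50%

# e.g. a number of roughly one billion bits:
# x = uniform_1_to_n(1 << (10**9))
```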

Now, to generate the random bits, you could use a source of true randomness, but that is expensive.

Or you could use a cryptographically secure random number generator, for example AES with a random key encrypting a counter for subsequent blocks of bits. Being cryptographically secure implies that there are no noticeable patterns.
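
One way to realize "AES encrypting a counter" is AES in CTR mode: encrypting an all-zero buffer yields the raw keystream. A sketch, assuming the third-party `cryptography` package is available:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def aes_ctr_random_bytes(n_bytes: int) -> bytes:
    key = os.urandom(32)        # random 256-bit AES key
    nonce = os.urandom(16)      # initial 128-bit counter block
    encryptor = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    return encryptor.update(b"\x00" * n_bytes)   # keystream = AES_key(counter blocks)

# One billion random bits is 125,000,000 bytes:
# m = int.from_bytes(aes_ctr_random_bytes(125_000_000), "big")
```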

answered by starblue


It depends on what you need the data for. For most purposes, a PRNG is fast and simple. But they are not perfect. For instance, I remember hearing that Monte Carlo simulations of chaotic systems are really good at revealing the underlying pattern in a PRNG.

If that is the sort of thing you are doing, though, there is a simple trick I learned in grad school for generating lots of random data. Take a large (preferably rapidly changing) file. (Some big data structures from the running kernel are good.) Compress it to concentrate the entropy. Throw away the headers. Then, for good measure, encrypt the result. If you're planning to use this for cryptographic purposes (and you don't have a perfect-entropy data set to work with), then reverse it and encrypt again.
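
A hedged sketch of that compress / strip-header / encrypt / reverse / encrypt recipe. The source path is a hypothetical placeholder and the choice of AES-CTR is illustrative, not specified by the answer; it assumes the `cryptography` package:

```python
import os
import zlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def aes_ctr(data: bytes) -> bytes:
    """Encrypt data under a fresh random AES-256 key in CTR mode."""
    cipher = Cipher(algorithms.AES(os.urandom(32)), modes.CTR(os.urandom(16)))
    return cipher.encryptor().update(data)

# Hypothetical path: any large, rapidly changing file on your system.
with open("/path/to/large_changing_file", "rb") as f:
    raw = f.read()

squeezed = zlib.compress(raw, 9)[2:-4]    # drop the 2-byte zlib header and 4-byte checksum
pool = aes_ctr(aes_ctr(squeezed)[::-1])   # encrypt, reverse, encrypt again
```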

The underlying theory is simple. Information theory tells us that there is no difference between a signal with no redundancy and pure random data. So if we pick a big file (i.e. lots of signal), remove redundancy with compression, and strip the headers, we have a pretty good random signal. Encryption does a really good job of removing remaining artifacts. However, encryption algorithms tend to work forward in blocks, so if someone could, despite everything, guess what was happening at the start of the file, that data would be more easily guessable. But reversing the file and encrypting again means they would need to know the whole file, and our encryption, to find any pattern in the data.

The reason to pick a rapidly changing piece of data is that if you run out of data and want to generate more, you can go back to the same source again. Even small changes will, after that process, turn into an essentially uncorrelated random data set.

answered by btilly