Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which is more random: human- or software-generated number?

Does it toss a coin to get random bit?
Or throw a die to get a random integer from 1 to 6?
Or take a card from a shuffled deck to get a number from 1 to 52?
.
.
.
Or can it think like us or have intelligence like us?

Obviously the above examples can't be ways of generating random data.

Then how do software libraries generate random numbers on a given range? Which is more random: human- or software-generated?

like image 700
Javed Akram Avatar asked Dec 04 '22 21:12

Javed Akram


2 Answers

(Note: This is generally about random [and pseudo-random] numbers in computing and their uses.)

You can never have true random numbers with a deterministic process, which is why computers are fairly ill-suited to generate them (as the CPU can only flip bits in a deterministic manner). Most languages, frameworks and libraries use so-called Pseudo-random number generators (PRNG). Those take a seed which is sort of an initial state vector which can be a single number or an array of numbers and generate a sequence of seemingly-random values from there. The results usually satisfy certain statistical measures, but are not totally random, as the same seed will yield the exact same sequence.

One of the easiest PRNGs is the so-called Linear Congruential Generator (LCG). It just has a single number as state (which is initially the seed). Then for each successive return value the formula loops like this:

LCG formula: x_(n+1)=a⋅x_n+b (mod⁡c )

where a, b and c are constants for the generator. c is usually a power of two, such as 232 simply because it's easy to implement (done automatically) and fast. Finding good values for a and b is hard, however. As a most trivial example, when using a = 2 and b = 0, you can see that the resulting values can never be odd. This limits the range of values the generator can yield quite severely. Generally, LCGs are a very old concept and long supplanted by much better generators, so don't use them, except in extremely limited environments (although even embedded systems can handle better generators without problem, usually) – MT19937 or its generalization, the WELL generators are usually much better for people who simply don't want to worry about the properties of their pseudo-random numbers.

One major application of PRNGs is simulation. Since PRNGs can give an estimate or guarantee of statistical properties and an experiment can be repeated exactly due to the nature of the seed they do quite well here. Imagine you're publishing a paper and want other people to replicate your results. With a hardware RNG (se below) you have no other option than including every single random number you used. For Monte-Carlo simulations which can easily use a few billion numbers or more this is ... not quite feasible.

Then there are random number generators for cryptographic applications, e.g. for securing your SSL connection. Examples here are Windows' CryptGenRandom or Unix's /dev/urandom. Those often are also a PRNG, however they use a so-called “entropy pool” for seeding which contains unpredictable values. The main point here is to generate unpredictable sequences, even though the same seed will still yield the same sequence. To minimize the effect of an attacker guessing the sequence they need to be re-seeded regularly. The entropy pool is gathered from various points in the system: events, such as input, network activity, etc. Sometimes it's initialized also with memory locations throughout the system assumed to contain garbage. However if it's done, care must be taken to ensure that the entropy pool really contains unpredictable stuff. Something that Debian got wrong in OpenSSL a few years ago.

You can also use the entropy pool directly to get random numbers, (e.g. Linux's /dev/random; FreeBSD instead uses the same algorithm for /dev/random as for /dev/urandom), but you don't get too much out of it and once it's empty it takes a while to replenish. That's why the algorithms mentioned above are usually used to extend what little entropy there is to larger volumes.

Then there are hardware-based random number generators, which use unpredictable natural processes, such as radioactive decay or electrical noise in a wire. Those are for the most demanding of applications requiring many “truly” random numbers and are able to generate a few hundred MiB of randomness per second, usually (ok, that data point is a few years old, but I doubt it can be done much faster by now).

You can emulate such things by writing a program taking images from a webcam with lens cap on (only noise remains, then) or from audio input when no actual input is present. Those are fine for a little hacking, but usually will not generate good random numbers, as they are biased, i.e. in the bit stream zeroes and ones are not represented with the same frequency (or, going further, the sequences 00, 01, 10 and 11 are not generated with the same frequency ... you can do this for larger sequences as well). So a part of an actual hardware RNG goes into making sure that the resulting values satisfy certain statistical distribution properties.

Some people actually throw dice to get random dice rolls or even take this into overdrive. And humans make very bad random number generators.

like image 55
Joey Avatar answered Jan 29 '23 03:01

Joey


A compiler does absolutely nothing with regards to requiring a random number. A compiler merely makes your code, call some other code which returns a random number. Now the "other code" which calls the random number might be:

  1. Implemented within your code.
  2. Implemented as a part of a standard library.
  3. Implemented as a part of a special library.

In cases (1) and (2), it is mostly a pseudo-random algorithm, and the numbers you get aren't really random. If it is part of a standard library (like cmath or math.h), then one cannot say for sure that the result values are pseudo-random because the standards specify only the definition and not implementation.

EDIT: The library is stdlib.h and it is ALWAYS a psuedo-random number as pointed out by Joey and phresnel. Read comments for details on their answers. I apologize for the error and I agree that I should have known better than to reply on instinct.

Special libraries may be used which may have special implementation of other algorithms, like the Mersenne Twister algorithm. Also, they may be nothing more than drivers of Hardware that can generate random numbers. Hardware random number generators return somewhat "true" random numbers http://en.wikipedia.org/wiki/Hardware_random_number_generator.

Random number algorithms from standard libraries eventually map to system calls on the OS. So, on linux, for example, you may simply be reading from /dev/random or /dev/urandom (or you may be doing the same thing in your own code as well).

Also note that, true randomness can be achieved without using dedicated hardware or some dedicated service. /dev/random and /dev/urandom provide random numbers, which for all intents and purposes can be considered true.

EDIT: Some special libraries or your own code may even be using a network service for random numbers (many of which provide true random numbers).

like image 38
Rohan Prabhu Avatar answered Jan 29 '23 05:01

Rohan Prabhu