Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generating random boolean

I'm currently implementing Eller's Algorithm in C++ and a small detail is bothering me about the randomness of the maze.

Until now I used the following code to generate a random bool:

bool randomBool()
{
    return 0 + (rand() % (1 - 0 + 1)) == 1;
}

// In main.cpp

time_t seconds;
time(&seconds);
srand((unsigned int) seconds);

But upon debugging I often see repeated true or false being generated, sometimes up to 30 times in a row.

Is this algorithm truly random or is there any better method in C++?

like image 850
Null User Avatar asked Apr 10 '17 17:04

Null User


People also ask

How do you generate a random Boolean?

In order to generate Random boolean in Java, we use the nextBoolean() method of the java. util. Random class. This returns the next random boolean value from the random generator sequence.

How do you generate a random Boolean in C++?

Until now I used the following code to generate a random bool : bool randomBool() { return 0 + (rand() % (1 - 0 + 1)) == 1; } // In main. cpp time_t seconds; time(&seconds); srand((unsigned int) seconds);

How do you randomize a boolean in C#?

If you want to create a random boolean, use this code: var random = new Random(); var randomBool = random.

How do you randomly pick true or false in Python?

The random. random() function generated a random floating number between 0 and 1. The generated number is compared with 0 by using the if() function. If the generated number is greater than 0, the used method will return True as output, else it will return False.


2 Answers

The STL in C++11 has build in random number generation methods that are superior to rand(). You can simulate a random boolean through a random integer that is 0 or 1:

#include <iostream>
#include <random>

int main(int argc, char *argv[]) {
    auto gen = std::bind(std::uniform_int_distribution<>(0,1),std::default_random_engine());
    const unsigned int N = 100;
    unsigned int numTrue = 0;
    unsigned int numFalse = 0;
    for (int i = 0; i < 100; ++i) {
        bool b = gen();
        if (b) ++ numTrue;
        else ++numFalse;
    }
    std::cout << numTrue << " TRUE, " << numFalse << " FALSE" << std::endl;
}

You can find more details on this library in standard C++ references. For instance, if you wanted something other than a 50/50 ratio of "true" and "false" values, you could create a random floating point number between 0 and 1 and call values less than some threshold z true, otherwise false.

Why you see long streaks, I think

I haven't addressed why you get 30 values of "true" or "false" in a row with your code. Though rand() should no longer be used, and you seem to have some unnecessary addition and subtraction of ones and zeroes in your code, there shouldn't be such a problem. However, I realize now the text in your question is ambiguous. If you are running and exiting your program 30 times in a row, you should expect to see repeated values -- even with my code. Most random number generators are really pseudorandom number generators. Each time you run the program, they will produce the same sequence of random numbers; this is important for consistency of results. However, while the program is running (e.g. putting your randomBool() in a loop), you should not see streaks of such a length, as they would be highly improbable.

Improbability of long streaks

I was surprised to receive comments disagreeing with my assertion that a streak of 30 "true" or "false" random booleans is improbable (when true or false are equally likely). I realize that a common misunderstanding of probability is that "luck" tries to even things out, and so that if a coin toss has come up heads a couple times in a row, then the universe will try to correct this and make a tails more likely. Because of this misunderstanding, people underestimate the likelihood of getting streaks of all heads and all tails, and I think the motivations of the comments on this answer and the main question were to correct this common mistake.

However, there is a real reason that long streaks (especially as long as 30) are increasingly unlikely. Using the language of random unbiased coin tosses, each IID (independent and identically distributed) coin toss has only 50% chance of being the same as the previous. Thus, the probability of a long streak decreases exponentially with the length of the streak. For a streak of length L, the probability of a streak of all heads is 1 in 2^L; the probability of a streak of either type is 2 in 2^L or 1 in 2^(L-1). Here is some code to demonstrate:

#include <iostream>
#include <random>
#include <map>

bool randomBool() {
    static auto gen = std::bind(std::uniform_int_distribution<>(0,1),std::default_random_engine());
    return gen();
}

int main(int argc, char *argv[]) {

    const unsigned int N = 1e8;
    std::map<unsigned int,unsigned int> histogram;
    bool current = randomBool();
    unsigned int currentLength = 1;
    for (int i = 0; i < N; ++i) {
        bool b = randomBool();
        if (b == current) {
            ++currentLength;
        } else {
            auto it = histogram.find(currentLength);
            if (it != histogram.end())
                it->second += 1;
            else
                histogram.insert(std::make_pair(currentLength,1));
            currentLength = 1;
        }
        current = b;
    }

    for (auto pair : histogram) 
        std::cout << "STREAK LENGTH " << pair.first << " OCCURS " << pair.second << " TIMES" << std::endl;
}

The output histogram is:

STREAK LENGTH 1 OCCURS 25011106 TIMES
STREAK LENGTH 2 OCCURS 12503578 TIMES
STREAK LENGTH 3 OCCURS 6249056 TIMES
STREAK LENGTH 4 OCCURS 3125508 TIMES
STREAK LENGTH 5 OCCURS 1560812 TIMES
STREAK LENGTH 6 OCCURS 781206 TIMES
STREAK LENGTH 7 OCCURS 390143 TIMES
STREAK LENGTH 8 OCCURS 194748 TIMES
STREAK LENGTH 9 OCCURS 97816 TIMES
STREAK LENGTH 10 OCCURS 48685 TIMES
STREAK LENGTH 11 OCCURS 24327 TIMES
STREAK LENGTH 12 OCCURS 12176 TIMES
STREAK LENGTH 13 OCCURS 6149 TIMES
STREAK LENGTH 14 OCCURS 3028 TIMES
STREAK LENGTH 15 OCCURS 1489 TIMES
STREAK LENGTH 16 OCCURS 811 TIMES
STREAK LENGTH 17 OCCURS 383 TIMES
STREAK LENGTH 18 OCCURS 193 TIMES
STREAK LENGTH 19 OCCURS 104 TIMES
STREAK LENGTH 20 OCCURS 43 TIMES
STREAK LENGTH 21 OCCURS 20 TIMES
STREAK LENGTH 22 OCCURS 14 TIMES
STREAK LENGTH 23 OCCURS 4 TIMES
STREAK LENGTH 24 OCCURS 3 TIMES

It is difficult to compute the expected number of streaks of length L in a number of flips N, since there are many overlapping stretches of length L where such a streak could exist. However, note that this histogram follows a roughly exponential distribution, with each entry approximately half the preceding entry.

The maximum streak is 24 [note: a bug in the previous version counted this as 23]. The probability of a streak of this length in any independent string of 24 tosses is 1 in 2^(24-1), or about 1 in 8 million. Since in 1e8 tosses there are about 1e8/24 ~ 4.3 million such separate stretches, we expect a small number of such streaks, so this seems about right [with my above caveat that calculating the exact expectation is difficult]. A streak of length 30, meanwhile, has a probability of 1 in 537 million in any independent stretch of 30 flips, and is much less likely even than a streak of length 24.

like image 174
jwimberley Avatar answered Oct 16 '22 20:10

jwimberley


bool randomBool() {
    return 0 + (rand() % (1 - 0 + 1)) == 1;
}

This is perhaps the worst possible way to convert the output of rand() into a boolean. On many implementations, the lower order bits are much less random than the upper order bits.

Ideally, you'd use something else entirely, but if you must use rand(), try:

bool randomBool() {
   return rand() > (RAND_MAX / 2);
}
like image 3
David Schwartz Avatar answered Oct 16 '22 20:10

David Schwartz