Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Probability Simulation Error not Converging

In an interview, I was given the following problem to solve initially using pen/paper, then via a program to verify the result.

The question is as follows:

There are three people A,B and C. Each person is capable of hitting a target with a probability of 6/7, 4/5 and 3/4 respectively. What is the probability that if they were to each fire one shot that exactly two of them will hit the target?

The answer is:

P(...) = P(A)*P(B)*(1-P(C)) +
         P(B)*P(C)*(1-P(A)) +
         P(C)*P(A)*(1-P(B))
       = 27.0/70.0
       = 38.57142857142857142857142857142857142857....%

Below is my solution to the problem:

#include <cstdio>
#include <cctype>
#include <ctime>
#include <random>


int main()
{
   std::mt19937 engine(time(0));

   engine.discard(10000000);

   std::uniform_real_distribution<double> uniform_real(0.0,1.0);

   double prA = (6.0 / 7.0);
   double prB = (4.0 / 5.0);
   double prC = (3.0 / 4.0);

   std::size_t trails = 4000000000;
   std::size_t total_success = 0;

   for (std::size_t i = 0; i < trails; ++i)
   {
      int current_success = 0;
      if (uniform_real(engine) < prA) ++current_success;
      if (uniform_real(engine) < prB) ++current_success;
      if (uniform_real(engine) < prC) ++current_success;

      if (current_success == 2)
         ++total_success;

      double prob = (total_success * 1.0) / (i+1);

      if ((i % 1000000) == 0)
      {
         printf("%05d Pr(...) = %12.10f  error:%15.13f\n",
                i,
                prob,
                std::abs((27.0/70.0) - prob));
      }
   }

   return 0;
}

The problem is as follows, regardless of how large a number of trials I run, the probability flat-lines at around roughly 0.3857002101. Is there something wrong in the code?

The interviewer said it is trivial to get the result to converge to around 9 decimal place accuracy within 1 million trials, regardless of the seed.

Any ideas on where the bug is in my code?

UPDATE 1: I've tried the above code with the following generators, they all seem to platau at around the same time roughly trial 10^9.

  1. std::mt19937_64
  2. std::ranlux48_base
  3. std::minstd_rand0

UPDATE 2: Thinking about the problem, I've gone down the following track. The ratio 27/70 comprised of 27 and 70 which are both coprime and where factors of 70 under 4x10^9 is roughly 57x10^6 or about 1.4% of all the numbers. Hence the probability of getting the "exact" ratio of 27/70 out of two numbers selected at random between [0,4x10^9] is roughly 1.4% (as there are more factors of 27 within 4x10^9) - So getting the exact ratio is very low, and that number will be constant regardless of the number of trials.

Now if one where to talk about thick bounds - ie: numbers in the range of factors of 70 +/5, that increases the probability of choosing a pair of numbers at random within the range [0,4x10^9] that would give a ratio within specified/related tolerance to about roughly 14%, but with this technique the best we can get will be on average roughly 5 decimal places accurate when compared to the exact value. Is this way of reasoning correct?

like image 320
Switzy Avatar asked Nov 23 '12 03:11

Switzy


2 Answers

Firstly, some elementary math shows that it is not possible to get 9 places of precision with only a million trials. Given our probability is 27/70, we can calculate x/1000000 = 27/70 which gives x = 385714.28571. If we had a very, very accurate uniform random number generator which generated exactly 385714 correct trials, this would give us an error of approximately abs(385714/1000000 - 0.38571428571428573) = 2.857142857304318e-07 which is well off the wanted 9 places of precision.

I don't think your analysis is correct. Given a very accurate distribution, it is certainly possible to get the required precision. However, -any- skewness from uniformity in the distribution will severely hamper the precision. If we do 1 billion trials, the best precision we can hope for is around 2.85 * 10^-10. If the distribution is skewed by even 100, this will be knocked down to about 1 * 10^-7. I'm not sure of the accuracy of most PRNG distributions, but the problem will be having one which is accurate to this degree. Having a quick play with std::uniform_real_distribution<double>(0.0, 1.0), it looks certain to have more variance than this.

like image 102
Yuushi Avatar answered Oct 16 '22 18:10

Yuushi


The interviewer said it is trivial to get the result to converge to around 9 decimal place accuracy within 1 million trials, regardless of the seed.

Well, that's just obviously ridiculous. You cannot get an estimate within one in a thousand million with a million trials. If the total was only one different from the theoretical value, you'd be off by one in a million, which is a thousand times bigger than "9 decimal places".

By the way, c++11 comes with a perfectly good uniform_int_distribution function, which actually handles round-off correctly: it divides the total range of the uniform generator into an exact multiple of the desired range and a remainder, and discards values generated in the remainder, so the values generated are not biased by round-off. I made a slight modification to your test program, and it does converge to six digits in a billion trials, which is about what I'd expect:

int main() {
  std::mt19937 engine(time(0));

  std::uniform_int_distribution<int> a_distr(0,6);
  std::uniform_int_distribution<int> b_distr(0,4);
  std::uniform_int_distribution<int> c_distr(0,3);

  std::size_t trials = 4000000000;
  std::size_t total_success = 0;

  for (std::size_t i = 1; i <= trials; ++i) {
    int current_success = 0;
    if (a_distr(engine)) ++current_success;
    if (b_distr(engine)) ++current_success;
    if (c_distr(engine)) ++current_success;

    if (current_success == 2) ++total_success;

    if ((i % 1000000) == 0) {
      printf("%05d Pr(...) = %12.10f  error:%15.13f\n",
             i,
             double(total_success) / i,
             std::abs((27.0/70.0) - double(total_success) / i));
    }
  }
}

return 0;

like image 8
rici Avatar answered Oct 16 '22 19:10

rici