In an interview, I was given the following problem to solve initially using pen/paper, then via a program to verify the result.
The question is as follows:
There are three people A,B and C. Each person is capable of hitting a target with a probability of 6/7, 4/5 and 3/4 respectively. What is the probability that if they were to each fire one shot that exactly two of them will hit the target?
The answer is:
P(...) = P(A)*P(B)*(1-P(C)) +
         P(B)*P(C)*(1-P(A)) +
         P(C)*P(A)*(1-P(B))
       = 27.0/70.0
       = 38.57142857142857142857142857142857142857....%
Below is my solution to the problem:
#include <cstdio>
#include <cctype>
#include <ctime>
#include <random>
int main()
{
   std::mt19937 engine(time(0));
   engine.discard(10000000);
   std::uniform_real_distribution<double> uniform_real(0.0,1.0);
   double prA = (6.0 / 7.0);
   double prB = (4.0 / 5.0);
   double prC = (3.0 / 4.0);
   std::size_t trails = 4000000000;
   std::size_t total_success = 0;
   for (std::size_t i = 0; i < trails; ++i)
   {
      int current_success = 0;
      if (uniform_real(engine) < prA) ++current_success;
      if (uniform_real(engine) < prB) ++current_success;
      if (uniform_real(engine) < prC) ++current_success;
      if (current_success == 2)
         ++total_success;
      double prob = (total_success * 1.0) / (i+1);
      if ((i % 1000000) == 0)
      {
         printf("%05d Pr(...) = %12.10f  error:%15.13f\n",
                i,
                prob,
                std::abs((27.0/70.0) - prob));
      }
   }
   return 0;
}
The problem is as follows, regardless of how large a number of trials I run, the probability flat-lines at around roughly 0.3857002101. Is there something wrong in the code?
The interviewer said it is trivial to get the result to converge to around 9 decimal place accuracy within 1 million trials, regardless of the seed.
Any ideas on where the bug is in my code?
UPDATE 1: I've tried the above code with the following generators, they all seem to platau at around the same time roughly trial 10^9.
UPDATE 2: Thinking about the problem, I've gone down the following track. The ratio 27/70 comprised of 27 and 70 which are both coprime and where factors of 70 under 4x10^9 is roughly 57x10^6 or about 1.4% of all the numbers. Hence the probability of getting the "exact" ratio of 27/70 out of two numbers selected at random between [0,4x10^9] is roughly 1.4% (as there are more factors of 27 within 4x10^9) - So getting the exact ratio is very low, and that number will be constant regardless of the number of trials.
Now if one where to talk about thick bounds - ie: numbers in the range of factors of 70 +/5, that increases the probability of choosing a pair of numbers at random within the range [0,4x10^9] that would give a ratio within specified/related tolerance to about roughly 14%, but with this technique the best we can get will be on average roughly 5 decimal places accurate when compared to the exact value. Is this way of reasoning correct?
Firstly, some elementary math shows that it is not possible to get 9 places of precision with only a million trials. Given our probability is 27/70, we can calculate x/1000000 = 27/70 which gives x = 385714.28571. If we had a very, very accurate uniform random number generator which generated exactly 385714 correct trials, this would give us an error of approximately abs(385714/1000000 - 0.38571428571428573) = 2.857142857304318e-07 which is well off the wanted 9 places of precision. 
I don't think your analysis is correct. Given a very accurate distribution, it is certainly possible to get the required precision. However, -any- skewness from uniformity in the distribution will severely hamper the precision. If we do 1 billion trials, the best precision we can hope for is around 2.85 * 10^-10. If the distribution is skewed by even 100, this will be knocked down to about 1 * 10^-7. I'm not sure of the accuracy of most PRNG distributions, but the problem will be having one which is accurate to this degree. Having a quick play with std::uniform_real_distribution<double>(0.0, 1.0), it looks certain to have more variance than this. 
The interviewer said it is trivial to get the result to converge to around 9 decimal place accuracy within 1 million trials, regardless of the seed.
Well, that's just obviously ridiculous. You cannot get an estimate within one in a thousand million with a million trials. If the total was only one different from the theoretical value, you'd be off by one in a million, which is a thousand times bigger than "9 decimal places".
By the way, c++11 comes with a perfectly good uniform_int_distribution function, which actually handles round-off correctly: it divides the total range of the uniform generator into an exact multiple of the desired range and a remainder, and discards values generated in the remainder, so the values generated are not biased by round-off. I made a slight modification to your test program, and it does converge to six digits in a billion trials, which is about what I'd expect:
int main() {
  std::mt19937 engine(time(0));
  std::uniform_int_distribution<int> a_distr(0,6);
  std::uniform_int_distribution<int> b_distr(0,4);
  std::uniform_int_distribution<int> c_distr(0,3);
  std::size_t trials = 4000000000;
  std::size_t total_success = 0;
  for (std::size_t i = 1; i <= trials; ++i) {
    int current_success = 0;
    if (a_distr(engine)) ++current_success;
    if (b_distr(engine)) ++current_success;
    if (c_distr(engine)) ++current_success;
    if (current_success == 2) ++total_success;
    if ((i % 1000000) == 0) {
      printf("%05d Pr(...) = %12.10f  error:%15.13f\n",
             i,
             double(total_success) / i,
             std::abs((27.0/70.0) - double(total_success) / i));
    }
  }
}
return 0;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With