In an interview, I was given the following problem to solve initially using pen/paper, then via a program to verify the result.
The question is as follows:
There are three people A,B and C. Each person is capable of hitting a target with a probability of 6/7, 4/5 and 3/4 respectively. What is the probability that if they were to each fire one shot that exactly two of them will hit the target?
The answer is:
P(...) = P(A)*P(B)*(1-P(C)) +
         P(B)*P(C)*(1-P(A)) +
         P(C)*P(A)*(1-P(B))
       = 24/140 + 12/140 + 18/140
       = 27/70
       = 0.3857142857142857... (about 38.57%)
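A quick way to verify the closed form without any randomness is to enumerate all 2^3 hit/miss outcomes and add up the probabilities of those with exactly two hits. A minimal sketch:

#include <cstdio>

int main()
{
    const double p[3] = {6.0 / 7.0, 4.0 / 5.0, 3.0 / 4.0};
    double prob = 0.0;
    for (int mask = 0; mask < 8; ++mask) // bit i set => shooter i hits
    {
        int hits = 0;
        double q = 1.0;
        for (int i = 0; i < 3; ++i)
        {
            if (mask & (1 << i)) { q *= p[i]; ++hits; }
            else                 { q *= 1.0 - p[i]; }
        }
        if (hits == 2) prob += q;
    }
    // Should print the same value as 27/70.
    printf("P(exactly two hits) = %.17f (27/70 = %.17f)\n", prob, 27.0 / 70.0);
    return 0;
}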
Below is my Monte Carlo solution to the problem:
#include <cstdio>
#include <cmath>
#include <ctime>
#include <random>

int main()
{
    std::mt19937 engine(std::time(nullptr));
    engine.discard(10000000); // warm up the generator
    std::uniform_real_distribution<double> uniform_real(0.0, 1.0);

    double prA = 6.0 / 7.0;
    double prB = 4.0 / 5.0;
    double prC = 3.0 / 4.0;

    std::size_t trials = 4000000000;
    std::size_t total_success = 0;
    for (std::size_t i = 0; i < trials; ++i)
    {
        // One round: each shooter fires once, independently.
        int current_success = 0;
        if (uniform_real(engine) < prA) ++current_success;
        if (uniform_real(engine) < prB) ++current_success;
        if (uniform_real(engine) < prC) ++current_success;
        if (current_success == 2)
            ++total_success;

        if ((i % 1000000) == 0)
        {
            double prob = (total_success * 1.0) / (i + 1);
            printf("%10zu Pr(...) = %12.10f error:%15.13f\n",
                   i,
                   prob,
                   std::abs((27.0 / 70.0) - prob));
        }
    }
    return 0;
}
The problem is as follows: regardless of how large a number of trials I run, the probability flat-lines at roughly 0.3857002101. Is there something wrong in the code?
The interviewer said it is trivial to get the result to converge to around 9 decimal place accuracy within 1 million trials, regardless of the seed.
Any ideas on where the bug is in my code?
UPDATE 1: I've tried the above code with the following generators; they all seem to plateau at around the same point, roughly trial 10^9.
UPDATE 2: Thinking about the problem, I've gone down the following track. The ratio 27/70 is made up of 27 and 70, which are coprime, and the number of multiples of 70 below 4x10^9 is roughly 57x10^6, or about 1.4% of all those numbers. Hence the probability of getting the "exact" ratio 27/70 from two numbers selected at random in [0, 4x10^9] is roughly 1.4% (there are more multiples of 27 within 4x10^9, so the multiples of 70 are the limiting factor). So the chance of hitting the exact ratio is very low, and it stays constant regardless of the number of trials.
Now if one were to talk about thicker bounds, i.e. numbers within +/-5 of a multiple of 70, the probability of choosing a pair of numbers at random within [0, 4x10^9] whose ratio falls within the specified tolerance of the exact value rises to roughly 14%; but with this technique the best we can expect is, on average, roughly 5 decimal places of accuracy compared to the exact value. Is this way of reasoning correct?
Firstly, some elementary math shows that it is not possible to get 9 places of precision with only a million trials. Given that our probability is 27/70, we can solve x/1000000 = 27/70, which gives x = 385714.28571. Even if a very, very accurate uniform random number generator produced exactly 385714 successful trials, this would give an error of approximately abs(385714/1000000 - 0.38571428571428573) = 2.857142857304318e-07, which is well off the wanted 9 places of precision.
I don't think your analysis is correct. Given a very accurate distribution, it is certainly possible to get the required precision. However, any skew away from uniformity in the distribution will severely hamper the precision. If we do 1 billion trials, the best precision we can hope for is around 2.85 * 10^-10. If the success count is skewed by even 100, that is knocked down to about 1 * 10^-7. I'm not sure of the accuracy of most PRNG distributions, but the problem will be having one which is accurate to this degree. Having a quick play with std::uniform_real_distribution<double>(0.0, 1.0), it looks certain to have more variance than this.
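To make the arithmetic concrete, here is a small sketch (using the same trial counts as above) that computes the quantization floor, i.e. the distance from 27/70 to the nearest achievable estimate k/N, together with the extra error introduced by a miscount of 100 trials:

#include <cstdio>
#include <cmath>

int main()
{
    const double target = 27.0 / 70.0;
    const double Ns[] = {1e6, 1e9}; // trial counts discussed above
    for (double n : Ns)
    {
        double k = std::round(target * n);            // best possible success count
        double floor_err = std::fabs(k / n - target); // quantization floor
        double skew_err = 100.0 / n;                  // extra error from a 100-count skew
        printf("N=%.0e best error=%.3e skew-100 error=%.3e\n",
               n, floor_err, skew_err);
    }
    return 0;
}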
The interviewer said it is trivial to get the result to converge to around 9 decimal place accuracy within 1 million trials, regardless of the seed.
Well, that's just obviously ridiculous. You cannot get an estimate within one in a thousand million with a million trials. If the total were off by just one from the theoretical value, you'd be off by one in a million, which is a thousand times bigger than "9 decimal places".
By the way, C++11 comes with a perfectly good uniform_int_distribution class template, which actually handles round-off correctly: it divides the total range of the underlying generator into an exact multiple of the desired range plus a remainder, and discards values that fall in the remainder, so the generated values are not biased by round-off. I made a slight modification to your test program, and it converges to six digits in a billion trials, which is about what I'd expect:
#include <cstdio>
#include <cmath>
#include <ctime>
#include <random>

int main()
{
    std::mt19937 engine(std::time(nullptr));
    // A nonzero draw counts as a hit, so the hit probabilities are
    // exactly 6/7, 4/5 and 3/4 -- no floating-point comparison needed.
    std::uniform_int_distribution<int> a_distr(0, 6);
    std::uniform_int_distribution<int> b_distr(0, 4);
    std::uniform_int_distribution<int> c_distr(0, 3);
    std::size_t trials = 4000000000;
    std::size_t total_success = 0;
    for (std::size_t i = 1; i <= trials; ++i) {
        int current_success = 0;
        if (a_distr(engine)) ++current_success;
        if (b_distr(engine)) ++current_success;
        if (c_distr(engine)) ++current_success;
        if (current_success == 2) ++total_success;
        if ((i % 1000000) == 0) {
            printf("%10zu Pr(...) = %12.10f error:%15.13f\n",
                   i,
                   double(total_success) / i,
                   std::abs((27.0 / 70.0) - double(total_success) / i));
        }
    }
    return 0;
}