I have two functions, in c++ and python, that determine how many times an event with a certain probability will occur over a number of rolls.
Python version:
import random

def get_loot(rolls):
    drops = 0
    for i in range(rolls):
        # getting a random float with 2 decimal places
        roll = random.randint(0, 10000) / 100
        if roll < 0.04:
            drops += 1
    return drops

for i in range(0, 10):
    print(get_loot(1000000))
Python output:
371
396
392
406
384
392
380
411
393
434
c++ version:
#include <cstdlib>
#include <ctime>
#include <iostream>
using namespace std;

int get_drops(int rolls){
    int drops = 0;
    for(int i = 0; i < rolls; i++){
        // getting a random float with 2 decimal places
        float roll = (rand() % 10000)/100.0f;
        if (roll < 0.04){
            drops++;
        }
    }
    return drops;
}

int main()
{
    srand(time(NULL));
    for (int i = 0; i <= 10; i++){
        cout << get_drops(1000000) << "\n";
    }
}
c++ output:
602
626
579
589
567
620
603
608
594
610
626
The code looks identical (at least to me). Both functions simulate the occurrence of an event with a probability of 0.04% over 1,000,000 rolls. However, the output of the Python version is about 30% lower than that of the C++ version. How are these two versions different, and why do they have different outputs?
In C++, rand() "Returns a pseudo-random integral number in the range between 0 and RAND_MAX."
RAND_MAX "is library-dependent, but is guaranteed to be at least 32767 on any standard library implementation."
Let's assume RAND_MAX is 32,767.
Reducing a value from [0, 32767] with % 10000 skews the distribution, because the 32,768 possible values do not divide evenly into 10,000 residues.
The values 0-2,767 each occur 4 times after % 10000, for example:
Value | Calculation | Result |
---|---|---|
1 | 1 % 10000 | 1 |
10001 | 10001 % 10000 | 1 |
20001 | 20001 % 10000 | 1 |
30001 | 30001 % 10000 | 1 |
Whereas the values 2,768-9,999 each occur only 3 times after % 10000:
Value | Calculation | Result |
---|---|---|
2768 | 2768 % 10000 | 2768 |
12768 | 12768 % 10000 | 2768 |
22768 | 22768 % 10000 | 2768 |
This makes each of the values 0-2,767 about 33% more likely to occur than each of the values 2,768-9,999, since 4 occurrences against 3 is a ratio of 4/3 (assuming rand() does, in fact, produce an even distribution between 0 and RAND_MAX).
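The counting argument above can be checked by brute force. This is a minimal sketch, assuming RAND_MAX is 32,767 as in the text; the names `residue_counts`, `p_biased` and `p_uniform` are made up for illustration. It tallies how many raw values land on each residue and compares the chance of hitting one of the residues 0-3 with the chance under a perfectly even spread.

```python
from collections import Counter

RAND_MAX = 32767   # assumed: the minimum the C++ standard guarantees
MODULUS = 10000

def residue_counts(rand_max=RAND_MAX, modulus=MODULUS):
    """Count how many raw values in [0, rand_max] map to each residue."""
    return Counter(value % modulus for value in range(rand_max + 1))

counts = residue_counts()
print(counts[1], counts[2767], counts[2768], counts[9999])  # 4 4 3 3

# Chance that a single call lands on one of the residues 0, 1, 2, 3
p_biased = sum(counts[r] for r in range(4)) / (RAND_MAX + 1)
p_uniform = 4 / MODULUS
print(round(p_biased * 1_000_000), round(p_uniform * 1_000_000))  # 488 400
```

Under that assumption, the modulo step alone would raise the expected count from about 400 to about 488 per million rolls.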
Python's randint, on the other hand, produces an even distribution between start and end, as randint is an "Alias for randrange(a, b+1)", and randrange (in Python 3.2 and newer) produces evenly distributed values:
Changed in version 3.2: randrange() is more sophisticated about producing equally distributed values. Formerly it used a style like int(random()*n) which could produce slightly uneven distributions.
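Because randint is evenly distributed, the expected Python result can be worked out by enumerating the possible outcomes instead of sampling. The sketch below is only an illustration (the helper name `expected_drops` is hypothetical); it relies on the documented fact that randint(0, 10000) includes both endpoints, so there are 10,001 equally likely integers.

```python
def expected_drops(rolls=1_000_000):
    """Expected number of drops for the Python loop in the question."""
    outcomes = range(0, 10000 + 1)  # randint(0, 10000) includes both endpoints
    hits = sum(1 for n in outcomes if n / 100 < 0.04)
    return rolls * hits / len(outcomes)

print(round(expected_drops()))  # 400
```

Only 0, 1, 2 and 3 pass the test, so the expectation is 4/10001 per roll, which is consistent with the Python output of roughly 370 to 430 per million.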
There are several approaches to generating random numbers in C++. Perhaps the most similar to Python is the Mersenne Twister engine (the same algorithm Python uses, though with some differences), used via uniform_int_distribution with mt19937:
#include <iostream>
#include <random>
#include <chrono>

int get_drops(int rolls) {
    // Seed the Mersenne Twister engine from the clock
    std::mt19937 e{
        static_cast<unsigned int> (
            std::chrono::steady_clock::now().time_since_epoch().count()
        )
    };
    // Evenly distributed integers in [0, 9999]
    std::uniform_int_distribution<int> d{0, 9999};
    int drops = 0;
    for (int i = 0; i < rolls; i++) {
        float roll = d(e) / 100.0f;
        // Compare against a float literal: 0.04f is slightly below the
        // double 0.04, so `roll < 0.04` would also count a roll of 0.04f
        if (roll < 0.04f) {
            drops++;
        }
    }
    return drops;
}

int main() {
    for (int i = 0; i <= 10; i++) {
        std::cout << get_drops(1000000) << "\n";
    }
}
Note that the underlying implementations of the two engines, as well as their seeding and distributions, are all slightly different; however, this will be much closer to Python.
Alternatively, as Matthias Fripp suggests, scale up rand() and divide by RAND_MAX:
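int get_drops(int rolls) {
    int drops = 0;
    for (int i = 0; i < rolls; i++) {
        // Scale in double: 10000 * rand() overflows int when RAND_MAX is large
        float roll = static_cast<int>(10000.0 * rand() / RAND_MAX) / 100.0f;
        // Float literal, so a roll of exactly 0.04f is not counted
        if (roll < 0.04f) {
            drops++;
        }
    }
    return drops;
}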
This is also much closer to the python output (again with some differences in the way random numbers are generated in the underlying implementations).
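One further difference is worth checking in the C++ loop from the question: roll is a float, while 0.04 is a double literal. The nearest float to 0.04 is slightly smaller than the nearest double, so a roll of 4 / 100.0f still compares as less than 0.04. The sketch below emulates 32-bit floats in Python with the struct module; the helper names `to_float32` and `hits` are hypothetical.

```python
import struct

def to_float32(x):
    """Round a Python float (a C double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def hits(threshold, residues=range(10)):
    """Residues n whose float32 value n / 100 compares below threshold."""
    return [n for n in residues if to_float32(n / 100) < threshold]

double_literal = hits(0.04)             # mimics `roll < 0.04`
float_literal = hits(to_float32(0.04))  # mimics `roll < 0.04f`
print(double_literal)  # [0, 1, 2, 3, 4]
print(float_literal)   # [0, 1, 2, 3]
```

If RAND_MAX really is 32,767 on the asker's platform, five residues that are each produced by four raw values give 20/32768, or about 610 per million, which is in line with the C++ output shown in the question.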
The results are skewed because rand() % 10000 is not the correct way to achieve a uniform distribution. (See also rand() Considered Harmful by Stephan T. Lavavej.) In modern C++, prefer the pseudo-random number generation library provided in header <random>. For example:
#include <iostream>
#include <random>

int get_drops(int rolls)
{
    std::random_device rd;
    std::mt19937 gen{ rd() };
    std::uniform_real_distribution<> dis{ 0.0, 100.0 };
    int drops{ 0 };
    for(int roll{ 0 }; roll < rolls; ++roll)
    {
        if (dis(gen) < 0.04)
        {
            ++drops;
        }
    }
    return drops;
}

int main()
{
    for (int i{ 0 }; i <= 10; ++i)
    {
        std::cout << get_drops(1000000) << '\n';
    }
}
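For a like-for-like comparison, the Python side can draw from the same continuous range instead of rounding to two decimal places. This is a rough counterpart rather than a drop-in replacement: the `rng` parameter and the fixed seed are assumptions added here so the run is repeatable, and the exact counts will not match the C++ program because the engines are seeded and sampled differently.

```python
import random

def get_loot(rolls, rng=random):
    """Count rolls where a uniform draw from [0, 100] falls below 0.04."""
    drops = 0
    for _ in range(rolls):
        if rng.uniform(0.0, 100.0) < 0.04:
            drops += 1
    return drops

rng = random.Random(12345)  # fixed seed, only so the run is repeatable
print(get_loot(1_000_000, rng))
```

With a per-roll probability of 0.0004, both programs should then report values scattered around 400 per million rolls.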