Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Different Results of Monte Carlo Integration Due To Output

I've just recently stumbled upon a C++ bug/feature, that I can't fully understand and was hoping someone with a better knowledge of C++ here could point me in the right direction.

Below you will find my attempt at finding out the area under the Gaussian curve using Monte Carlo integration. The recipe is:

  1. Generate a large number of normally distributed random variables (with mean of zero and standard deviation of one).
  2. Square these numbers.
  3. Take the average of all the squares. The average will be a very close estimation of the area under the curve (in the case of the Gaussian, it is 1.0).

The code below consists of two simple functions: rand_uni, which returns a random variable, uniformly distributed between zero and one, and rand_norm, which is a (rather poor, but 'good enough for government work') approximation of a normally distributed random variable.

main runs through a loop one billion times, calling rand_norm each time, squaring it with pow and adding to the accumulating variable. After this loop the accumulated result is just divided by the number of runs and printed to the terminal as Result=<SOME NUMBER>.

The problem lies in very quirky behaviour of the code below: when each generated random variable is printed to cout (yes, one billion times), then the end result is correct regardless of the compiler used (1.0015, which is pretty close, to what I want). If I don't print the random variable in each loop iteration, I get inf under gcc and 448314 under clang.

Frankly, this just boggles the mind and since it is my first(-ish) encounter with C++, I don't really know, what the problem might be: is it something with pow? Does cout act weird?

Any hint would be much appreciated!

The Source

// Monte Carlo integration of the Gaussian curve

#include <iostream>
#include <cstdlib>
#include <cmath>


using namespace std;


enum {
  no_of_runs = 1000000
};


// uniform random variable
double rand_uni() {
  return ((double) rand() / (RAND_MAX));
};


// approximation of a normaly distributed random variable
double rand_norm() {
  double result;
  for(int i=12; i > 0; --i) {
    result += rand_uni();
  }
  return result - 6;
};


int main(const int argc,
         const char** argv) {

  double result = 0;
  double x;

  for (long i=no_of_runs; i > 0; --i) {
    x = pow(rand_norm(), 2);
    #ifdef DO_THE_WEIRD_THING
    cout << x << endl;      // MAGIC?!
    #endif
    result += x;
  }

  // Prints the end result
  cout << "Result=" 
       << result / no_of_runs
       << endl << endl;

}

The Makefile

CLANG=clang++
GCC=g++
OUT=normal_mc

default: *.cpp
    $(CLANG) -o $(OUT).clang.a *.cpp
    $(CLANG) -o $(OUT).clang.b -DDO_THE_WEIRD_THING *.cpp
    $(GCC) -o $(OUT).gcc.a *.cpp
    $(GCC) -o $(OUT).gcc.b -DDO_THE_WEIRD_THING *.cpp
like image 712
jst Avatar asked Mar 31 '13 09:03

jst


People also ask

What type of result can be generated by Monte Carlo algorithm?

Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle.

What is the output of Monte Carlo simulation?

The result of a Monte Carlo simulation is a range – or distribution – of possible outcome values. This data on possible results enables you to calculate the probabilities of different outcomes in your forecasts, as well as perform a wide range of additional analyses.

What will affect the calculation of function in Monte Carlo integration is?

If we take a random point x_i between a and b, we can multiply f(x_i) by (b-a) to get the area of a rectangle of width (b-a) and height f(x_i). The idea behind Monte Carlo integration is to approximate the integral value (gray area on figure 1 ) by the averaged area of rectangles computed for random picked x_i.


2 Answers

The result in double rand_norm() is not initialised to 0 before you start += random values to it.

All the cout's use and reset some part of the stack memory that is later used by the call of rand_norm, "fixing" your problem when you do the cout's.

Change the first line to double result = 0.0; to fix it.

double rand_norm() {
  double result = 0.0;
  for(int i=12; i > 0; --i) {
    result += rand_uni();
  }
  return result - 6;
};

Also, you should seed your random number generator, you will always get the exact same sequence of pseudo random numbers without it, add a

srand(time(NULL));

before your first call to rand(), or look at some better random number generators, e.g. Boost.Random

like image 146
Pieter Avatar answered Oct 19 '22 02:10

Pieter


In your rand_norm() function result is not initialized, that is your bug. Just in case you are wondering why it worked when you printed the values, its because an unitialized variable will have the value that the memory region it is had, in your code it is the stack, so calling cout<< changed your stack values. Here is the fixed version of your function:

double rand_norm() {
  double result = 0.0;
  for(int i=12; i > 0; --i) {
    result += rand_uni();
  }
  return result - 6;
};
like image 40
fbafelipe Avatar answered Oct 19 '22 02:10

fbafelipe