Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Floating point math rounding weird in C++ compared to mathematica

The following post is solved,the problem occurred because of miss interpretation of the formula on http://www.cplusplus.com/reference/random/piecewise_constant_distribution/ The reader is strongly encouraged to consider the page: http://en.cppreference.com/w/cpp/numeric/random/piecewise_constant_distribution

I have the following strange phenomenon which puzzles me!:

I have a piecewise constant probability density given as

using RandomGenType = std::mt19937_64;
RandomGenType gen(51651651651);

using PREC = long double;
std::array<PREC,5> intervals {0.59, 0.7, 0.85, 1, 1.18};
std::array<PREC,4> weights {1.36814, 1.99139, 0.29116, 0.039562};

 // integral over the pdf to normalize:
PREC normalization =0;
for(unsigned int i=0;i<4;i++){
    normalization += weights[i]*(intervals[i+1]-intervals[i]);
}
std::cout << std::setprecision(30) << "Normalization: " << normalization << std::endl;
// normalize all weights (such that the integral gives 1)!
for(auto & w : weights){
    w /= normalization;
}

std::piecewise_constant_distribution<PREC>
distribution (intervals.begin(),intervals.end(),weights.begin());

When I draw n random numbers (radius of sphere in millimeters) from this distribution and compute the mass of the sphere and sum them up like:

unsigned int n = 1000000;
double density = 2400;
double mass = 0;

for(int i=0;i<n;i++){
    auto d = 2* distribution(gen) * 1e-3;
    mass += d*d*d/3.0*M_PI_2*density;
}

I get mass = 4.3283 kg (see LIVE here)

Doing the EXACT identical thing in Mathematica like:

Graphic

Gives the assumably correct value of 4.5287 kg. (see mathematica)

Which is not the same, also with different seeds , C++ and Mathematica never match! ? Is that numeric inaccuracy, which I doubt it is...? Question : What the hack is wrong with the sampling in C++?

Simple Mathematica Code:

pdf[r_] = 2*Piecewise[{{0, r < 0.59}, {1.36814, 0.59 <= r <= 0.7}, 
           {1.99139, Inequality[0.7, Less, r, LessEqual, 0.85]}, 
           {0.29116, Inequality[0.85, Less, r, LessEqual, 1]}, 
           {0.039562, Inequality[1, Less, r, LessEqual, 1.18]}, 
           {0, r > 1.18}}];

pdfr[r_] = pdf[r] / Integrate[pdf[r], {r, 0, 3}];(*normalize*)

Plot[pdf[r], {r, 0.4, 1.3}, Filling -> Axis]

PDFr = ProbabilityDistribution[pdfr[r], {r, 0, 1.18}]; 
(*if you put 1.18=2 then we dont get 4.52??*)

SeedRandom[100, Method -> "MersenneTwister"]
dataR = RandomVariate[PDFr, 1000000, WorkingPrecision -> MachinePrecision];
Fold[#1 + (2*#2*10^-3)^3  Pi/6 2400 &, 0, dataR] 

(*Analytical Solution*)

PDFr = ProbabilityDistribution[pdfr[r], {r, 0, 3}];
1000000 Integrate[ 2400 (2 InverseCDF[PDFr, p] 10^-3)^3 Pi/6, {p, 0, 1}]

Update: I did some analysis:

  1. Read in the numbers (64bit doubles) generated from Mathematica into C++ -> calculated the sum and it gives the same as Mathematica
    Mass computed by reduction: 4.52528010260687096888432279229

  2. Read in the numbers generated from C++ (64bit double) into Mathematica -> calculated the sum and it gives the same 4.32402

  3. I almost conclude the sampling with std::piecewise_constant_distribution is inaccurate (or as accurate as it gets with 64bit floats) or has a bug... OR there is something wrong with my weights?

  4. Densities are calculated wrongly std::piecewise_constant_distribution in http://coliru.stacked-crooked.com/a/ca171bf600b5148f ===> It seems to be a bug!

Histogramm Plot of CPP Generated values compared to the wanted Distribution: Histogramm

file = NotebookDirectory[] <> "numbersCpp.bin";
dataCPP = BinaryReadList[file, "Real64"];
Hpdf = HistogramDistribution[dataCPP];
h = DiscretePlot[  PDF[ Hpdf, x], {x, 0.4, 1.2, 0.001}, 
   PlotStyle -> Red];
Show[h, p, PlotRange -> All]

The file is generated here: Number generation CPP

like image 585
Gabriel Avatar asked Mar 04 '15 19:03

Gabriel


People also ask

What causes floating point rounding errors?

Because floating-point numbers have a limited number of digits, they cannot represent all real numbers accurately: when there are more digits than the format allows, the leftover ones are omitted - the number is rounded.

What is rounding in floating point?

In floating point arithmetic, two extra bits are used to the far right of the significand, called the guard and round bits. At the end of the arithmetic calculation, these bits are rounded off. We always round towards the closer digit (i.e. 0.00-‐0.49 → 0 and 0.51-‐0.99 → 1).


2 Answers

It seems that the formula for the probabilities is wrongly written for std::piecewise_constant_distribution on http://www.cplusplus.com/reference/random/piecewise_constant_distribution/

The summation of the weights is done without the interval lengths multiplied!

The correct formula is: http://en.cppreference.com/w/cpp/numeric/random/piecewise_constant_distribution

This solves every stupid quirk previously discovered as bug/floating point error and so on!

like image 111
Gabriel Avatar answered Oct 16 '22 05:10

Gabriel


[The following paragraph was edited for correctness. --Editor's note]

Mathematica may or may not use IEEE 754 floating point numbers. From the Wolfram documentation:

The Wolfram Language has sophisticated built-in automatic numerical precision and accuracy control. But for special-purpose optimization of numerical computations, or for studying numerical analysis, the Wolfram Language also allows detailed control over precision and accuracy.

and

The Wolfram Language handles both integers and real numbers with any number of digits, automatically tagging numerical precision when appropriate. The Wolfram Language internally uses several highly optimized number representations, but nevertheless provides a uniform interface for digit and precision manipulation, while allowing numerical analysts to study representation details when desired.

like image 2
Paul Evans Avatar answered Oct 16 '22 05:10

Paul Evans