
Standard Normal Distribution z-value function in C#

I have been looking at the recent blog post by Jeff Atwood on Alternate Sorting Orders. I tried to convert the code in the post to C# but ran into an issue: as far as I know, there is no function in .NET that returns the z-value given the percentage of area under the standard normal curve. The recommended values to use for the algorithm are 95% and 97.5%, which you can look up in the z-value table in any statistics book.

Does anyone know how to implement such a function for all values of z, or at least out to 6 standard deviations from the mean? One way would be to hard-code the values into a dictionary and use a lookup, but there has to be a way of calculating the exact value. My attempt at solving this was to take a definite integral of the standard normal curve function:

y = (1 / (sqrt(2 * PI))) * e^(-(1/2) * x^2)

This gives me the area under the curve between two x values, but then I am stuck… Maybe I am way off base and this is not how you would do it?

Thanks.
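The integral approach described above can be carried through numerically. The following is a minimal sketch, not a definitive implementation: it integrates the density with composite Simpson's rule to get the cumulative area, then finds the z-value by bisection. The class name `NormalByIntegration`, the method names, the step count of 2000, and the search bracket of [-8, 8] are all assumptions chosen for illustration.

```csharp
using System;

public static class NormalByIntegration
{
    // Standard normal density: (1 / sqrt(2 * PI)) * e^(-(1/2) * x^2).
    public static double Pdf(double x)
    {
        return Math.Exp(-0.5 * x * x) / Math.Sqrt(2.0 * Math.PI);
    }

    // Area under the density from -infinity to z, computed as
    // 0.5 + (integral from 0 to z) with composite Simpson's rule.
    public static double Cdf(double z)
    {
        const int n = 2000;        // number of subintervals (assumed; must be even)
        double h = z / n;          // h is negative for negative z, so the integral is too
        double sum = Pdf(0.0) + Pdf(z);
        for (int i = 1; i < n; i++)
        {
            sum += Pdf(i * h) * (i % 2 == 1 ? 4.0 : 2.0);
        }
        return 0.5 + sum * h / 3.0;
    }

    // Inverts Cdf by bisection on the assumed bracket [-8, 8].
    // Requires 0 < p < 1.
    public static double ZValue(double p)
    {
        if (!(p > 0.0 && p < 1.0))
            throw new ArgumentOutOfRangeException(nameof(p), "p must be strictly between 0 and 1.");

        double lo = -8.0, hi = 8.0;
        for (int i = 0; i < 60; i++)
        {
            double mid = 0.5 * (lo + hi);
            if (Cdf(mid) < p) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }
}
```

With these settings, `NormalByIntegration.ZValue(0.975)` comes out at approximately 1.96, matching the table value. Each call performs 60 numerical integrations, so this is slow compared with a closed-form approximation, and accuracy degrades in the extreme tails where the remaining area is comparable to the integration error.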

Lukasz asked Nov 02 '09 18:11



2 Answers

Here's some code for the normal distribution written in Python, but it could easily be translated to C# by adding some punctuation. It's just about 15 lines of code.
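To illustrate how compact such a routine can be in C#, here is a sketch based on the rational approximation in Abramowitz and Stegun, formula 26.2.23, whose stated absolute error is below 4.5e-4. This is not necessarily the code the answer refers to; the class name `NormalApprox` and the method names are hypothetical.

```csharp
using System;

public static class NormalApprox
{
    // Abramowitz and Stegun formula 26.2.23.
    // For t = sqrt(-2 ln p) with 0 < p <= 0.5, returns x such that the
    // area in the upper tail beyond x is approximately p.
    private static double RationalApproximation(double t)
    {
        const double c0 = 2.515517, c1 = 0.802853, c2 = 0.010328;
        const double d1 = 1.432788, d2 = 0.189269, d3 = 0.001308;
        return t - ((c2 * t + c1) * t + c0) /
                   (((d3 * t + d2) * t + d1) * t + 1.0);
    }

    // Approximate z-value for a lower-tail area p, with 0 < p < 1.
    public static double NormalCdfInverse(double p)
    {
        if (!(p > 0.0 && p < 1.0))
            throw new ArgumentOutOfRangeException(nameof(p), "p must be strictly between 0 and 1.");

        if (p < 0.5)
        {
            // Lower half: use symmetry, z(p) = -z(1 - p).
            return -RationalApproximation(Math.Sqrt(-2.0 * Math.Log(p)));
        }
        return RationalApproximation(Math.Sqrt(-2.0 * Math.Log(1.0 - p)));
    }
}
```

About three correct decimal places is likely enough for looking up the 95% and 97.5% values mentioned in the question, but not for applications that need full double precision.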

John D. Cook answered Sep 23 '22 07:09


Here's a C# translation of the normal quantile C code used in the stats program R.

/// <summary>
/// Quantile function (Inverse CDF) for the normal distribution.
/// </summary>
/// <param name="p">Probability.</param>
/// <param name="mu">Mean of normal distribution.</param>
/// <param name="sigma">Standard deviation of normal distribution.</param>
/// <param name="lower_tail">If true, probability is P[X &lt;= x], otherwise P[X &gt; x].</param>
/// <param name="log_p">If true, probabilities are given as log(p).</param>
/// <returns>The quantile x such that P[X &lt;= x] = p (or P[X &gt; x] = p when lower_tail is false), where X ~ N(mu,sigma^2).</returns>
/// <remarks>See https://svn.r-project.org/R/trunk/src/nmath/qnorm.c</remarks>
public static double QNorm(double p, double mu, double sigma, bool lower_tail, bool log_p)
{
  if (double.IsNaN(p) || double.IsNaN(mu) || double.IsNaN(sigma)) return (p + mu + sigma);
  double ans;
  bool isBoundaryCase = R_Q_P01_boundaries(p, double.NegativeInfinity, double.PositiveInfinity, lower_tail, log_p, out ans);
  if (isBoundaryCase) return (ans);
  if (sigma < 0) return (double.NaN);
  if (sigma == 0) return (mu);

  double p_ = R_DT_qIv(p, lower_tail, log_p);
  double q = p_ - 0.5;
  double r, val;

  if (Math.Abs(q) <= 0.425)  // 0.075 <= p <= 0.925
  {
    r = .180625 - q * q;
    val = q * (((((((r * 2509.0809287301226727 +
               33430.575583588128105) * r + 67265.770927008700853) * r +
             45921.953931549871457) * r + 13731.693765509461125) * r +
           1971.5909503065514427) * r + 133.14166789178437745) * r +
         3.387132872796366608)
    / (((((((r * 5226.495278852854561 +
             28729.085735721942674) * r + 39307.89580009271061) * r +
           21213.794301586595867) * r + 5394.1960214247511077) * r +
         687.1870074920579083) * r + 42.313330701600911252) * r + 1.0);
  }
  else
  {
    r = q > 0 ? R_DT_CIv(p, lower_tail, log_p) : p_;
    r = Math.Sqrt(-((log_p && ((lower_tail && q <= 0) || (!lower_tail && q > 0))) ? p : Math.Log(r)));

    if (r <= 5)              // <==> min(p,1-p) >= exp(-25) ~= 1.3888e-11
    {
      r -= 1.6;
      val = (((((((r * 7.7454501427834140764e-4 +
              .0227238449892691845833) * r + .24178072517745061177) *
            r + 1.27045825245236838258) * r +
           3.64784832476320460504) * r + 5.7694972214606914055) *
         r + 4.6303378461565452959) * r +
        1.42343711074968357734)
       / (((((((r *
                1.05075007164441684324e-9 + 5.475938084995344946e-4) *
               r + .0151986665636164571966) * r +
              .14810397642748007459) * r + .68976733498510000455) *
            r + 1.6763848301838038494) * r +
           2.05319162663775882187) * r + 1.0);
    }
    else                     // very close to  0 or 1 
    {
      r -= 5.0;
      val = (((((((r * 2.01033439929228813265e-7 +
              2.71155556874348757815e-5) * r +
             .0012426609473880784386) * r + .026532189526576123093) *
           r + .29656057182850489123) * r +
          1.7848265399172913358) * r + 5.4637849111641143699) *
        r + 6.6579046435011037772)
       / (((((((r *
                2.04426310338993978564e-15 + 1.4215117583164458887e-7) *
               r + 1.8463183175100546818e-5) * r +
              7.868691311456132591e-4) * r + .0148753612908506148525)
            * r + .13692988092273580531) * r +
           .59983220655588793769) * r + 1.0);
    }
    if (q < 0.0) val = -val;
  }

  return (mu + sigma * val);
}

Some helper methods:

private static bool R_Q_P01_boundaries(double p, double _LEFT_, double _RIGHT_, bool lower_tail, bool log_p, out double ans)
{
  if (log_p)
  {
    if (p > 0.0)
    {
      ans = double.NaN;
      return (true);
    }
    if (p == 0.0)
    {
      ans = lower_tail ? _RIGHT_ : _LEFT_;
      return (true);
    }
    if (p == double.NegativeInfinity)
    {
      ans = lower_tail ? _LEFT_ : _RIGHT_;
      return (true);
    }
  }
  else
  {
    if (p < 0.0 || p > 1.0)
    {
      ans = double.NaN;
      return (true);
    }
    if (p == 0.0)
    {
      ans = lower_tail ? _LEFT_ : _RIGHT_;
      return (true);
    }
    if (p == 1.0)
    {
      ans = lower_tail ? _RIGHT_ : _LEFT_;
      return (true);
    }
  }
  ans = double.NaN;
  return (false);
}

private static double R_DT_qIv(double p, bool lower_tail, bool log_p)
{
  return (log_p ? (lower_tail ? Math.Exp(p) : -ExpM1(p)) : R_D_Lval(p, lower_tail));
}

private static double R_DT_CIv(double p, bool lower_tail, bool log_p)
{
  return (log_p ? (lower_tail ? -ExpM1(p) : Math.Exp(p)) : R_D_Cval(p, lower_tail));
}

private static double R_D_Lval(double p, bool lower_tail)
{
  // 0.5 - p + 0.5 computes 1 - p, written as in the R source
  return lower_tail ? p : 0.5 - p + 0.5;
}

private static double R_D_Cval(double p, bool lower_tail)
{
  return lower_tail ? 0.5 - p + 0.5 : p;
}

private static double ExpM1(double x)
{
  // exp(x) - 1; a second-order Taylor expansion avoids cancellation for small |x|
  if (Math.Abs(x) < 1e-5)
    return x + 0.5 * x * x;
  else
    return Math.Exp(x) - 1.0;
}

In your case, you want mu=0.0, sigma=1.0, lower_tail=true, log_p=false.
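Since the routine above is a hand translation with many long constants, it may be worth checking it against a few table values before relying on it. The sketch below is one way to do that; the class name `QuantileCheck` and the method `MaxAbsError` are hypothetical, and the reference z-values are standard table entries rounded to six decimals.

```csharp
using System;

public static class QuantileCheck
{
    // Reference pairs (p, z) for the standard normal distribution,
    // with z rounded to six decimal places.
    private static readonly double[,] Reference =
    {
        { 0.5,    0.0      },
        { 0.95,   1.644854 },
        { 0.975,  1.959964 },
        { 0.025, -1.959964 },
        { 0.99,   2.326348 },
        { 0.999,  3.090232 }
    };

    // Returns the largest absolute difference between the supplied
    // quantile function and the reference z-values.
    public static double MaxAbsError(Func<double, double> quantile)
    {
        double worst = 0.0;
        for (int i = 0; i < Reference.GetLength(0); i++)
        {
            double err = Math.Abs(quantile(Reference[i, 0]) - Reference[i, 1]);
            if (err > worst) worst = err;
        }
        return worst;
    }
}
```

Assuming the check is called from code that can see `QNorm`, for example `QuantileCheck.MaxAbsError(p => QNorm(p, 0.0, 1.0, true, false))`, a correct translation should give a result on the order of 1e-6 or smaller, limited by the rounding of the reference values.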

Richie Cotton answered Sep 19 '22 07:09