This is a very basic question, but I can't seem to find a good answer. What exactly does scipy calculate for
scipy.stats.norm(50,10).pdf(45)
I understand that the probability of a particular value like 45 in a gaussian with mean 50 and std dev 10 is 0. So what exactly is pdf calculating? Is it the area under the gaussian curve, and if so, what is the range of values on the x axis?
Percent point function (inverse of cdf) at q of the given RV.
The normal, or Gaussian, distribution is a continuous probability distribution with a bell-shaped probability density function (the Gaussian function), informally called a bell curve. The frequency distribution plot of Table 9.2 and Fig.
The scipy.stats.norm object in SciPy is used to work with the normal distribution and to compute its distribution functions (pdf, cdf, ppf, and so on) through its methods.
In probability theory, a probability density function (PDF) describes the probability that a random variable falls within a given range of values, as opposed to taking on any one value. For the normal distribution, the PDF is fully determined by the mean and the standard deviation.
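As a minimal sketch of what "probability within a range" means, the snippet below computes P(40 <= X <= 60) for the N(50, 10) distribution from the question in two ways: as a difference of cdf values, and by numerically integrating the pdf. The interval endpoints 40 and 60 are assumptions chosen only for illustration.

```python
from scipy import stats
from scipy.integrate import quad

dist = stats.norm(50, 10)   # mean 50, standard deviation 10

low, high = 40, 60          # illustrative interval, not from the original question

# Probability as a difference of cumulative distribution values
p_cdf = dist.cdf(high) - dist.cdf(low)

# The same probability as the area under the pdf between low and high
p_area, abs_err = quad(dist.pdf, low, high)

print(p_cdf)
print(p_area)
```

Both numbers should agree closely (roughly 0.68, the familiar one-standard-deviation rule), which shows that the pdf is the curve you integrate, not the probability itself.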
The probability density function of the normal distribution expressed in Python is
from math import exp, pi, sqrt
from scipy import stats

def normal_pdf(x, mu, sigma):
    # Density of the normal distribution with mean mu and std dev sigma, evaluated at x
    return 1.0 / (sigma * sqrt(2.0 * pi)) * exp(-1.0 * (x - mu)**2 / (2.0 * sigma**2))
(compare that to the Wikipedia definition). And this is exactly what scipy.stats.norm().pdf() computes: the value of the pdf at point x for a given mu, sigma.
Note that this is not a probability (= area under the pdf) but rather the value of the pdf at the point x you pass to pdf(x) (and that value can very well be greater than 1.0!). You can see that, for example, for N(0, 0.1) at x = 0:
val = stats.norm(0, 0.1).pdf(0)
print(val)
val = normal_pdf(0, 0, 0.1)
print(val)
which gives the output
3.98942280401
3.989422804014327
Not at all a probability = area under the curve!
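To see that a density above 1 is still consistent with a total area of 1, here is a minimal sketch that approximates the area under the N(0, 0.1) pdf with a simple Riemann sum. The grid range of -1 to 1 (ten standard deviations on each side) and the number of grid points are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

dist = stats.norm(0, 0.1)   # mean 0, standard deviation 0.1

# Evenly spaced grid covering +/- 10 standard deviations (illustrative choice)
xs, dx = np.linspace(-1.0, 1.0, 20001, retstep=True)
densities = dist.pdf(xs)

peak = densities.max()          # height of the curve near x = 0
area = np.sum(densities) * dx   # Riemann-sum approximation of the total area

print(peak)
print(area)
```

The peak is well above 1 because the curve is very narrow, yet the area under it still comes out at approximately 1.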
Note that this doesn't contradict the statement that the probability of a particular value like x = 0 is 0 because, formally, the area under the pdf for a point (i.e., an interval of length 0) is zero (if f is a continuous function on [a, b] and F is its antiderivative on [a, b], then the definite integral of f over [a, b] = F(b) - F(a). Here, a = b = x, hence the value of the integral is F(x) - F(x) = 0).
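Applied to the original question, the following sketch (with interval widths picked arbitrarily for illustration) shrinks an interval centred on 45 for N(50, 10). The probability of landing in the interval goes to 0, while the probability divided by the interval width approaches the value returned by pdf(45).

```python
from scipy import stats

dist = stats.norm(50, 10)   # mean 50, standard deviation 10
x = 45
density = dist.pdf(x)       # value of the pdf at x, not a probability

widths = [10, 1, 0.1, 0.001]   # illustrative interval widths
probs = []
ratios = []
for width in widths:
    # Probability that X falls in [x - width/2, x + width/2]
    prob = dist.cdf(x + width / 2) - dist.cdf(x - width / 2)
    probs.append(prob)
    ratios.append(prob / width)   # probability per unit length
    print(width, prob, prob / width)

print(density)
```

In other words, for a small interval of width dx around x, the probability is approximately pdf(x) * dx, which is why pdf(x) is read as a density rather than as a probability.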