
Why does scipy.stats.norm.pdf sometimes give PDF > 1? How to correct it?

Given mean and variance of a Gaussian (normal) random variable, I would like to compute its probability density function (PDF).


I referred to this post: Calculate probability in normal distribution given mean, std in Python,

Also the scipy docs: scipy.stats.norm

But when I plot the PDF curve, the probability exceeds 1! Refer to this minimal working example:

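import numpy as np
import matplotlib.pyplot as plt  # missing in the original snippet; needed for plt
import scipy.stats as stats

# Evaluate the normal PDF with mean 1.075 and standard deviation 0.2
x = np.linspace(0.3, 1.75, 1000)
plt.plot(x, stats.norm.pdf(x, 1.075, 0.2))
plt.show()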

This is what I get:

[Image: Gaussian PDF curve]

How is it even possible to have a 200% probability of getting the mean, 1.075? Am I misinterpreting anything here? Is there any way to correct this?

Ébe Isaac asked Jul 01 '16 09:07



1 Answer

It's not a bug, and it's not an incorrect result either. The value of a probability density function at a specific point is not a probability; it measures how dense the distribution is around that point. For a continuous random variable, the probability of any single exact value is zero. Instead of p(X = x), we calculate the probability of falling between two points, p(x1 < X < x2), which equals the area under the density curve between x1 and x2. Only the total area under the curve has to equal 1, so the density itself can very well exceed 1. Here the standard deviation is 0.2, so the peak is 1 / (0.2 * sqrt(2 * pi)) ≈ 1.99, and the peak grows without bound as the standard deviation shrinks toward zero.
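As a rough check of the point above, here is a minimal sketch using the mean and standard deviation from the question. The variable names (`dist`, `peak`, `area`, `p_interval`) and the choice to integrate over ±10 standard deviations are illustrative assumptions, not part of the original post. It shows that the density peaks near 2 while the area under the curve is still 1, and that an actual probability comes from the CDF over an interval.

```python
from scipy import stats
from scipy.integrate import quad

mu, sigma = 1.075, 0.2                  # values taken from the question
dist = stats.norm(loc=mu, scale=sigma)  # frozen normal distribution

# Density at the mean: 1 / (sigma * sqrt(2 * pi)), which is above 1 here
peak = dist.pdf(mu)

# Area under the density over +/- 10 standard deviations (assumed range,
# wide enough that the tails outside it are negligible)
area, _ = quad(dist.pdf, mu - 10 * sigma, mu + 10 * sigma)

# An actual probability: P(mu - sigma < X < mu + sigma), from the CDF
p_interval = dist.cdf(mu + sigma) - dist.cdf(mu - sigma)

print(f"peak density       = {peak:.4f}")        # about 1.9947
print(f"area under curve   = {area:.6f}")        # about 1.000000
print(f"P(within 1 sigma)  = {p_interval:.4f}")  # about 0.6827
```

So there is nothing to correct in the plot: to get a probability, take a difference of `cdf` values over an interval rather than reading a value off `pdf`.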

ayhan answered Oct 28 '22 15:10