Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

One-sided truncated normal distribution in scipy

In scipy, is there any way to sample from a normal distribution that has only been truncated on one side?

Say I have a standard normal distribution, with the domain (-inf, 0].

The scipy.stats.truncnorm class provides utilities for distributions with a specific lower and upper bound, but is there a good way to do this if you only have one or the other, short of scipy.stats.truncnorm(a=-9999999, b=0, loc=0, scale=1)?

like image 441
Michael Delgado Avatar asked Apr 06 '16 01:04

Michael Delgado


People also ask

How do you create a truncated normal distribution in Python?

truncnorm() is a Truncated Normal continuous random variable. It is inherited from the of generic methods as an instance of the rv_continuous class. It completes the methods with details specific for this particular distribution. size : [tuple of ints, optional] shape or random variates.

What do mean by truncated normal distribution?

In probability and statistics, the truncated normal distribution is the probability distribution derived from that of a normally distributed random variable by bounding the random variable from either below or above (or both). The truncated normal distribution has wide applications in statistics and econometrics.

What is RVS in Scipy?

rvs(*args, **kwds)[source] Random variates of given type. Parameters arg1, arg2, arg3,… The shape parameter(s) for the distribution (see docstring of the instance object for more information).

What is Loc and scale in Scipy?

The location ( loc ) keyword specifies the mean. The scale ( scale ) keyword specifies the standard deviation. As an instance of the rv_continuous class, norm object inherits from it a collection of generic methods (see below for the full list), and completes them with details specific for this particular distribution.


2 Answers

Using np.inf (or -np.inf) to set the bounds results in the distribution being unbounded on that side:

scipy.stats.truncnorm(a=-np.inf, b=0, loc=0, scale=1)

H/t to @warren-weckesser for the answer in comments.

like image 97
Michael Delgado Avatar answered Sep 27 '22 19:09

Michael Delgado


What you are describing is the Half-normal distribution (its mirror to the y-axis to be precise), a special case where the Folded Normal Distribution and the Truncated Normal Distribution are equivalent. scipy also provides that distribution: scipy.stats.foldnorm. The only reason I'm mentioning this is (not to be pedantic) because it is quite convoluted how scipy and numpy define these distributions (see below: mu & sigma vs. beta, scale, loc, a & b)

Run this in a notebook and the differences become apparent:

import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt

resolution=10000

mu, sigma = 0.0, 0.2
normdist = lambda x: 1/(sigma * np.sqrt(2 * np.pi)) * np.exp( - (x - mu)**2 / (2 * sigma**2) )
f = np.random.normal(mu, sigma, resolution)

count, bins, ignored = plt.hist(f, 'auto', density=True)
plt.plot(bins, [normdist(x) for x in bins], 'r--', linewidth=2, label='(np)norm.dist')


beta=2.0
scale=1/np.sqrt(2)*(2*sigma)
nd = st.gennorm(beta, loc=mu, scale=scale)
x = np.linspace(nd.ppf(0.01), nd.ppf(0.99), resolution)
plt.plot(x, nd.pdf(x),'g-', lw=2, alpha=0.6, label='(sc)gennorm pdf')
print(f'mean: {nd.mean()}, std: {nd.std()}')

c=0.0
fnd = st.foldnorm(c, loc=0.0, scale=scale)

x = np.arange(-1.0, 5.0, 1/resolution)
plt.plot(x, fnd.pdf(x),'k-', lw=2, alpha=0.6, label='(sc)foldnorm pdf')
# count, bins, ignored = plt.hist(fnd.rvs(size=resolution), 'auto', density=True)

print(f'mean: {fnd.mean()}, std: {fnd.std()}')

a=0
b=np.inf
tnd = st.truncnorm(a=a, b=b, loc=0, scale=scale)
plt.plot(x, tnd.pdf(x),'b-', lw=1, alpha=0.6, label='(sc)truncnorm pdf')
# count, bins, ignored = plt.hist(tnd.rvs(size=resolution), 'auto', density=True)

plt.plot(x, [normdist(x) for x in x], 'r--', lw=1, label='(np)norm.dist')

print(f'mean: {fnd.mean()}, std: {fnd.std()}')
plt.legend()
plt.show()
like image 28
Hendrik F Avatar answered Sep 27 '22 19:09

Hendrik F