
Truncating SciPy random distributions

Does anyone have suggestions for efficiently truncating the SciPy random distributions? For example, if I generate random values like so:

import scipy.stats as stats
print(stats.logistic.rvs(loc=0, scale=1, size=1000))

How would I go about constraining the output values between 0 and 1 without changing the original parameters of the distribution and without changing the sample size, all while minimizing the amount of work the machine has to do?

TimY asked Jul 15 '12 10:07


1 Answer

Your question is more of a statistics question than a SciPy question. In general, an efficient sampling method requires normalizing the density over the interval you are interested in and computing the CDF on that interval analytically, so that it can be inverted. Edit: it turns out that this is possible here, so rejection sampling is not needed:

import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

# Plot the original (untruncated) logistic density
xrng = np.arange(-10, 10, .1)
yrng = stats.logistic.pdf(xrng)
plt.plot(xrng, yrng)

# Plot the truncated density on [0, 1]:
# rescale the pdf by the probability mass inside the interval
nrm = stats.logistic.cdf(1) - stats.logistic.cdf(0)
xrng = np.arange(0, 1, .01)
yrng = stats.logistic.pdf(xrng) / nrm
plt.plot(xrng, yrng)

# Sample using the inverse CDF:
# draw uniforms on [cdf(0), cdf(1)) and map them back through ppf
yr = np.random.rand(100000) * nrm + stats.logistic.cdf(0)
xr = stats.logistic.ppf(yr)
plt.hist(xr, density=True)

plt.show()
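
The same inverse-CDF idea can be wrapped in a small helper that works for any continuous SciPy distribution exposing cdf and ppf. This is only a sketch: the name truncated_rvs and its parameters are hypothetical (not part of SciPy), and it assumes the interval carries non-negligible probability mass, since precision degrades far out in the tails where the CDF saturates.

```python
import numpy as np
import scipy.stats as stats


def truncated_rvs(dist, lower, upper, size, rng=None):
    """Draw `size` samples from `dist` restricted to [lower, upper].

    Hypothetical helper: `dist` is a frozen scipy.stats distribution.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Probability levels that correspond to the interval bounds
    cdf_lo, cdf_hi = dist.cdf(lower), dist.cdf(upper)
    # Uniform draws between those levels, mapped back through the inverse CDF
    u = rng.uniform(cdf_lo, cdf_hi, size=size)
    return dist.ppf(u)


# Same parameters and sample size as in the question, constrained to [0, 1]
samples = truncated_rvs(stats.logistic(loc=0, scale=1), 0.0, 1.0, size=1000,
                        rng=np.random.default_rng(0))
print(samples.min(), samples.max(), samples.shape)
```

Every uniform draw yields exactly one sample, so the sample size is preserved and nothing is discarded, unlike rejection sampling.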
user1149913 answered Nov 04 '22 20:11
