Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generating a random number based on a distribution function

We can generate random numbers in an interval [a,b] easily if we want to make it uniformely:

A=rand()*(b-a) + a

where rand() is a function which can generate a uniform random number between 0 and 1. so A is a random number in [a,b].

For generating a random number based on a distribution function like y=x-x^2, i faced a problem.

I would like to use a method mentioned here. But i am not interested to use the python function inverse_cdf(np.random.uniform()).

I can compute the CDF of function "y" by an integration over 0 and X and i call it "f". But when i put the rand() function(a number between 0 and 1) into the inverse function of f, i get a complex number! It means: A=f^(-1) (rand()) returns a complex number.

Is it a correct way for generating a random number based on a distribution function?

I used this website to compute the inverse of f=x^2/2 - x^3/3 and the code below is a part of the calculation that shows that the tmp1 is always negative

for i=1:10
rnd1=rand;
tmp1  = 2*sqrt(6)*sqrt(6*rnd1^2-rnd1)-12*rnd1+1
cTmp1 = tmp1^(1/3)
end
like image 935
Denis Avatar asked May 18 '26 18:05

Denis


2 Answers

The question is: "Is it a correct way for generating a random number based on a distribution function?" Thus, basically you have either continuous Probability Density Function (PDF) or discrete Probability Mass Function (PMF) with notation f(x) and you are are looking for a way to find random variate x.

There are at least two ways to do this.

  1. To use Inverse Transform Distribution.
  2. To use Rejection method

Using Inverse transform: If we know the function of probability distribution, then for some Cumulative Distribution Function (CDF) we can find the closed from of random variate. Suppose your probability function is f(x) and the CDF is F(x) then assuming you can get the inverse function, you can get random variate

x=inverse F(U)

where U is random uniform distribution

Using Rejection Method: If the CDF does not have a closed form inverse, then you can always use rejection method. The idea of rejection method is to generate a random 2D point (a pair of random number): (r1, r2) and then the point is either under the curve or over the curve of the PDF. If the point is under the curve, then we take this value as our random number, otherwise we sample another point. Suppose PDF f(x) is bounded by a maximum value M

r1 is generated within interval [a, b] of horizontal axis
r2 is generated within interval [0, M] of vertical axis
If r2 <f (r1) then accept r1 and output r1
   Else reject r1 and generate new random point

Inverse Transform method is superior to the Rejection method if you can find the inverse of the CDF because you can get the closed form. For instance, CDF of Exponential distribution with rate 1 is F(x) = 1-exp(-x). The inverse transform would be

x= inverse F(U) = -ln(1-U) = -ln(U)

because 1-U2=U2

like image 159
Kardi Teknomo Avatar answered May 21 '26 13:05

Kardi Teknomo


In the past few years, nice new tools have been added to SciPy to address this kind of problem in Python. You can easily generate samples from custom continuous or discrete univariate distributions by just providing some information about the distribution, such as the density / pdf.

Overview of the different methods: https://docs.scipy.org/doc/scipy/reference/stats.sampling.html

Tutorial: https://docs.scipy.org/doc/scipy/tutorial/stats/sampling.html

Here is an example for a custom density:

import numpy as np
from scipy.stats.sampling import NumericalInversePolynomial
from matplotlib import pyplot as plt
from scipy.integrate import quad


class MyDist:
    def __init__(self, a):
        self.a = a

    def support(self):
        # distribution restricted to 0, 5, can be changed as needed
        return (0, 5)

    def pdf(self, x):
        # this is not a proper pdf, the normalizing
        # constant is missing (does not integrate to one)
        return x * (x + np.sin(5*x) + 2) * np.exp(-x**self.a)


dist = MyDist(0.5)
gen = NumericalInversePolynomial(dist)

# compute the missing normalizing constant to plot the pdf
const_pdf = quad(dist.pdf, *dist.support())[0]

r = gen.rvs(size=50000)
x = np.linspace(r.min(), r.max(), 500)

# show histogram together with the pdf
plt.plot(x, dist.pdf(x) / const_pdf)
plt.hist(r, density=True, bins=100)
plt.show()

Generated samples together with the pdf

like image 40
Christoph Baumgarten Avatar answered May 21 '26 15:05

Christoph Baumgarten



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!