I want to specify the probability density function of a distribution and then pick up N random numbers from that distribution in Python. How do I go about doing that?

In general, you want to have the inverse cumulative probability density function. Once you have that, then generating the random numbers along the distribution is simple: <pre class="prettyprint"><code>import random def sample(n): return [ icdf(random.random()) for _ in range(n) ] </code></pre> Or, if you use NumPy: <pre class="prettyprint"><code>import numpy as np def sample(n): return icdf(np.random.random(n)) </code></pre> In both cases <code>icdf</code> is the inverse cumulative distribution function which accepts a value between 0 and 1 and outputs the corresponding value from the distribution. To illustrate the nature of <code>icdf</code>, we'll take a simple uniform distribution between values 10 and 12 as an example: <ul> <li>probability distribution function is 0.5 between 10 and 12, zero elsewhere</li> <li>cumulative distribution function is 0 below 10 (no samples below 10), 1 above 12 (no samples above 12) and increases linearly between the values (integral of the PDF)</li> <li>inverse cumulative distribution function is only defined between 0 and 1. At 0 it is 10, at 12 it is 1, and changes linearly between the values</li> </ul> Of course, the difficult part is obtaining the inverse cumulative density function. It really depends on your distribution, sometimes you may have an analytical function, sometimes you may want to resort to interpolation. Numerical methods may be useful, as numerical integration can be used to create the CDF and interpolation can be used to invert it.

Generating random numbers with a given probability density function

Q: How do you generate a random number from a CDF?

Example 0: y=1 Now, to invert the cdf, we flip x and y, and then solve for y again. It's trivially easy… , we can generate uniform random numbers, plug them into that equation as x and get y which is the actual value drawn from our PDF.

Q: How do you find the probability density function of a random variable?

The probability density function (pdf) f(x) of a continuous random variable X is defined as the derivative of the cdf F(x): f(x)=ddxF(x).

2 Answers

In general, you want to have the inverse cumulative probability density function. Once you have that, then generating the random numbers along the distribution is simple:

Click to copy

import random

def sample(n):
    return [ icdf(random.random()) for _ in range(n) ]

Or, if you use NumPy:

Click to copy

import numpy as np

def sample(n):
    return icdf(np.random.random(n))

In both cases icdf is the inverse cumulative distribution function which accepts a value between 0 and 1 and outputs the corresponding value from the distribution.

To illustrate the nature of icdf, we'll take a simple uniform distribution between values 10 and 12 as an example:

probability distribution function is 0.5 between 10 and 12, zero elsewhere
cumulative distribution function is 0 below 10 (no samples below 10), 1 above 12 (no samples above 12) and increases linearly between the values (integral of the PDF)
inverse cumulative distribution function is only defined between 0 and 1. At 0 it is 10, at 12 it is 1, and changes linearly between the values

Of course, the difficult part is obtaining the inverse cumulative density function. It really depends on your distribution, sometimes you may have an analytical function, sometimes you may want to resort to interpolation. Numerical methods may be useful, as numerical integration can be used to create the CDF and interpolation can be used to invert it.

answered Sep 19 '22 08:09

DrV

This is my function to retrieve a single random number distributed according to the given probability density function. I used a Monte-Carlo like approach. Of course n random numbers can be generated by calling this function n times.

Click to copy

    """
    Draws a random number from given probability density function.

    Parameters
    ----------
        pdf       -- the function pointer to a probability density function of form P = pdf(x)
        interval  -- the resulting random number is restricted to this interval
        pdfmax    -- the maximum of the probability density function
        integers  -- boolean, indicating if the result is desired as integer
        max_iterations -- maximum number of 'tries' to find a combination of random numbers (rand_x, rand_y) located below the function value calc_y = pdf(rand_x).

    returns a single random number according the pdf distribution.
    """
    def draw_random_number_from_pdf(pdf, interval, pdfmax = 1, integers = False, max_iterations = 10000):
        for i in range(max_iterations):
            if integers == True:
                rand_x = np.random.randint(interval[0], interval[1])
            else:
                rand_x = (interval[1] - interval[0]) * np.random.random(1) + interval[0] #(b - a) * random_sample() + a

            rand_y = pdfmax * np.random.random(1) 
            calc_y = pdf(rand_x)

            if(rand_y <= calc_y ):
                return rand_x

        raise Exception("Could not find a matching random number within pdf in " + max_iterations + " iterations.")

In my opinion this solution is performing better than other solutions if you do not have to retrieve a very large number of random variables. Another benefit is that you only need the PDF and avoid calculating the CDF, inverse CDF or weights.

answered Sep 20 '22 08:09

linksfate

Related questions
                            
                                Python Eyed3 Warning
                            
                                How can I inspect what SUDs is generating/receiving in "sudo 0.4.1 jurko 5" and newer?
                            
                                How to flatten a list with various data types (int, tuple)
                            
                                OSX Pillow Incompatible library version libtiff.5.dylib & libjpeg.8.dylib
                            
                                How do I remove \n from my python dictionary?
                            
                                Sort a pandas datetime index
                            
                                How do you iterate through a gzipped carriage-return file using python 2.7?
                            
                                Find number of zeros before non-zero in a numpy array
                            
                                How to extract 128x128 icon bitmap data from EXE in python
                            
                                Python profiling - What are the columns in runsnakerun output?
                            
                                Split DatetimeIndex into date and time MultiIndex conveniently in Pandas
                            
                                How to create an activity plot from Pandas Dataframe (like the Github contribution plot)
                            
                                Python : Adding a code routine at each line of a block of code
                            
                                How to get IP address of the launched instance with Boto
                            
                                Celery dies with DBPageNotFoundError
                            
                                sqlalchemy, mixins, foreignkeys and declared_attr
                            
                                Subtract subgroup averages from individuals without resorting to for loop
                            
                                Efficiently grouping a list of coordinates points by location in Python
                            
                                Matplotlib timelines
                            
                                ImportError: "No modules named". But modules already installed in dist-packages

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Generating random numbers with a given probability density function

Tags:

python

numpy

probability

scipy

user2445465

People also ask

2 Answers

DrV

linksfate

Recent Activity

Donate For Us