Python scipy - specify custom discrete distribution

I use various continuous distributions from scipy.stats (e.g. norm). So if I want to find P(Z < 0.5) I would do:

from scipy.stats import norm
norm(0, 1).cdf(0.5)  # Z~N(0,1)

Is there a tool (in scipy.stats, statsmodels, or elsewhere) that lets me describe a discrete distribution and then calculate its CDF, PMF, etc.? I can write the code myself, but I was wondering whether something already exists. For example:

pmf(x) = 1/3 for x = 1, 2, 3; else 0

Could I then construct two vectors, x = [1, 2, 3] and p = [1/3, 1/3, 1/3], and pass them to a library class that provides .cdf() and so on?

asked Jan 15 '17 18:01 by s5s



1 Answer

I guess you are looking for scipy.stats.rv_discrete here. From the docs:

rv_discrete is a base class to construct specific distribution classes and instances for discrete random variables. It can also be used to construct an arbitrary distribution defined by a list of support points and corresponding probabilities.

Example from docs:

import numpy as np
from scipy import stats

xk = np.arange(7)  # support points 0..6
pk = (0.1, 0.2, 0.3, 0.1, 0.1, 0.0, 0.2)  # probabilities, must sum to 1
custm = stats.rv_discrete(name='custm', values=(xk, pk))

And your example:

In [1]: import numpy as np

In [2]: from scipy import stats

In [3]: custm = stats.rv_discrete(name='custm', values=((1, 2, 3), (1./3, 1./3, 1./3)))

In [4]: custm.cdf(2.5)
Out[4]: 0.66666666666666663
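
Beyond .cdf(), the object built this way should expose the usual frozen-distribution methods. The following is a minimal sketch for the uniform distribution on {1, 2, 3} from the question; the variable names (pmf_at_2, samples, and so on) and the seed are illustrative choices, not part of the original answer.

```python
import numpy as np
from scipy import stats

# Support points and their probabilities (uniform on {1, 2, 3}).
xk = np.array([1, 2, 3])
pk = np.full(3, 1 / 3)

custm = stats.rv_discrete(name='custm', values=(xk, pk))

pmf_at_2 = custm.pmf(2)        # P(X = 2)
cdf_at_2_5 = custm.cdf(2.5)    # P(X <= 2.5)
mean = custm.mean()            # expected value
var = custm.var()              # variance

# Draw 1000 samples; random_state is fixed only to make the run repeatable.
samples = custm.rvs(size=1000, random_state=0)

print(pmf_at_2, cdf_at_2_5, mean, var)
print(np.bincount(samples)[1:])  # counts of 1, 2 and 3 in the sample
```

For this distribution the mean is 2 and the variance is 2/3, which makes a quick sanity check on the inputs.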
answered Sep 24 '22 10:09 by Andrey Sobolev