I use various continuous distributions from scipy.stats (e.g. norm). So if I want to find P(Z < 0.5) I would do:
from scipy.stats import norm
norm(0, 1).cdf(0.5) # Z~N(0,1)
Is there a tool (scipy.stats, statsmodels, or something else) that I can use to describe a discrete distribution and then calculate CDF/CMF etc. on it? I can write the code myself, but I was wondering if something already exists. For example:
pdf(x) = 1/3 for x = 1,2,3; else 0
Then I can construct 2 vectors x=[1,2,3], p = [1/3, 1/3, 1/3] and input them into a library class which will then provide .cdf() etc?
In Python, a random integer can be generated with the randint() function in the random module. This function takes two parameters, the lower limit and the upper limit, and both are inclusive. To work with the probability distribution of a discrete random variable, we can use the scipy.stats library.
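As a minimal sketch of the difference, the block below draws one integer with random.randint and then describes the matching discrete uniform distribution with scipy.stats.randint. The variable names (draw, die) and the range 1 to 3 are illustrative assumptions, chosen to match the example in the question.

```python
import random
from scipy import stats

# Standard library: one random integer, both bounds inclusive.
random.seed(0)
draw = random.randint(1, 3)

# scipy.stats.randint models the discrete uniform distribution.
# Note the upper bound is exclusive here, so (1, 4) covers {1, 2, 3}.
die = stats.randint(1, 4)
pmf_at_2 = die.pmf(2)  # P(X = 2)
cdf_at_2 = die.cdf(2)  # P(X <= 2)
```

The two functions share a name but not a convention: random.randint includes the upper limit, while scipy.stats.randint excludes it.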
For scipy.stats.norm, the location (loc) keyword specifies the mean and the scale (scale) keyword specifies the standard deviation. As an instance of the rv_continuous class, the norm object inherits from it a collection of generic methods, and completes them with details specific to this particular distribution.
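A short sketch of how loc and scale behave, assuming an arbitrary mean of 10 and standard deviation of 2 picked only for illustration:

```python
from scipy.stats import norm

# Freeze a normal distribution with mean 10 and standard deviation 2.
shifted = norm(loc=10, scale=2)

mean_value = shifted.mean()    # equals loc
std_value = shifted.std()      # equals scale
half = shifted.cdf(10)         # P(X < mean) for a symmetric distribution
```

Calling norm() with no arguments gives the standard normal, since loc defaults to 0 and scale to 1.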
The easiest way to calculate normal CDF probabilities in Python is to use the norm.cdf() function from the SciPy library. The probability that a random variable takes on a value less than 1.96 in a standard normal distribution is roughly 0.975.
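The 1.96 figure can be checked directly. This is a small sketch; the variable names are illustrative.

```python
from scipy.stats import norm

# P(Z < 1.96) for the standard normal (loc=0, scale=1 by default).
p_below = norm.cdf(1.96)

# The complementary upper-tail probability, P(Z > 1.96).
p_above = norm.sf(1.96)
```

Here p_below comes out at roughly 0.975 and p_above at roughly 0.025.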
A discrete distribution is a probability distribution whose random variable takes values in a countable set, such as the integers 1, 10, 15, etc. The set does not have to be finite or non-negative: the Poisson distribution, for example, has infinitely many possible values.
I guess you are looking for scipy.stats.rv_discrete. From the docs:
rv_discrete is a base class to construct specific distribution classes and instances for discrete random variables. It can also be used to construct an arbitrary distribution defined by a list of support points and corresponding probabilities.
Example from docs:
import numpy as np
from scipy import stats
xk = np.arange(7)  # support points 0..6
pk = (0.1, 0.2, 0.3, 0.1, 0.1, 0.0, 0.2)  # probabilities, must sum to 1
custm = stats.rv_discrete(name='custm', values=(xk, pk))
And your example:
In [1]: import numpy as np
In [2]: from scipy import stats
In [3]: custm = stats.rv_discrete(name='custm', values=((1, 2, 3), (1./3, 1./3, 1./3)))
In [4]: custm.cdf(2.5)
Out[4]: 0.66666666666666663
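Beyond .cdf(), the same object exposes the other generic methods. The following is a minimal sketch using the two vectors from the question; the variable names are illustrative, and the methods shown (pmf, cdf, mean, var, ppf, rvs) are the standard rv_discrete ones.

```python
from scipy import stats

x = [1, 2, 3]
p = [1/3, 1/3, 1/3]
custm = stats.rv_discrete(name='custm', values=(x, p))

pmf_at_2 = custm.pmf(2)      # P(X = 2)
cdf_at_2_5 = custm.cdf(2.5)  # P(X <= 2.5) = P(X = 1) + P(X = 2)
mean_value = custm.mean()    # expected value
var_value = custm.var()      # variance
median = custm.ppf(0.5)      # smallest x with cdf(x) >= 0.5

# Draw five samples; random_state makes the draw reproducible.
samples = custm.rvs(size=5, random_state=0)
```

For this distribution the mean is 2 and the variance is 2/3, and every sample falls in {1, 2, 3}.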