Is there a general way to join SciPy (or NumPy) probability distributions to create a mixture probability distribution which can then be sampled from?
I have such a distribution for display using something like:
mixture_gaussian = (norm.pdf(x_axis, -3, 1) + norm.pdf(x_axis, 3, 1)) / 2
which, when plotted, looks like:
However, I can't sample from this generated model, as it's just a list of points which will plot as the curve.
Note, this specific distribution is just a simple example. I'd like to be able to generate several kinds of distributions (including "sub"-distributions which are not just normal distributions). Ideally, I would hope there would be some way for the function to be automatically normalized (i.e. not having to do the / 2 explicitly as in the code above).
Does SciPy/NumPy provide some way of easily accomplishing this?
This answer provides a way to sample from multiple distributions, but it certainly requires a bit of handcrafting for a given mixture distribution, especially when wanting to weight different "sub"-distributions differently. This is usable, but I would hope for a method that's a bit cleaner and more straightforward if possible. Thanks!
One common method of consolidating two probability distributions is to simply average them: for every set of values A, set P(A) = (P1(A) + P2(A)) / 2. If both distributions have densities, for example, averaging the probabilities yields a probability distribution whose density is the average of the two input densities (Figure 1).
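The averaging rule described above can be checked numerically. The following is a minimal sketch, assuming the two normal distributions from the question (centered at -3 and 3); the names `averaged_pdf`, `first`, and `second`, and the interval [-1, 1] used as the set A, are illustrative choices rather than anything from the original post.

```python
from scipy.integrate import quad
from scipy.stats import norm

# The two input distributions from the question's example.
first = norm(-3, 1)
second = norm(3, 1)


def averaged_pdf(x):
    """Average of the two input densities."""
    return 0.5 * (first.pdf(x) + second.pdf(x))


# The averaged density should still integrate to (approximately) 1.
# The finite range and the `points` hint help quad find both peaks.
total_mass, _ = quad(averaged_pdf, -15, 15, points=[-3, 3])

# For a set A = [lower, upper], integrating the averaged density should
# match the average of the two probabilities P1(A) and P2(A).
lower, upper = -1.0, 1.0
mass_from_density, _ = quad(averaged_pdf, lower, upper)
mass_from_average = 0.5 * (
    (first.cdf(upper) - first.cdf(lower))
    + (second.cdf(upper) - second.cdf(lower))
)
```

Because the two halves each carry weight 1/2, no further normalization is needed; with unequal weights the same check works as long as the weights sum to 1.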
Following @PaulPanzer's pointer in the comments, I created the following subclass for easily creating mixture models from the SciPy distributions. Note that the pdf method is not required for my question, but it was nice for me to have.
import numpy as np
from scipy.stats import norm, rv_continuous


class MixtureModel(rv_continuous):
    def __init__(self, submodels, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.submodels = submodels

    def _pdf(self, x):
        # Equal-weight average of the submodel densities.
        pdf = self.submodels[0].pdf(x)
        for submodel in self.submodels[1:]:
            pdf += submodel.pdf(x)
        pdf /= len(self.submodels)
        return pdf

    def rvs(self, size):
        # Pick a submodel uniformly at random for each draw.
        submodel_choices = np.random.randint(len(self.submodels), size=size)
        # Draw a full set of samples from every submodel, then keep the chosen ones.
        # np.choose limits how many choice arrays it accepts, so this suits small mixtures.
        submodel_samples = [submodel.rvs(size=size) for submodel in self.submodels]
        rvs = np.choose(submodel_choices, submodel_samples)
        return rvs


mixture_gaussian_model = MixtureModel([norm(-3, 1), norm(3, 1)])
x_axis = np.arange(-6, 6, 0.001)
mixture_pdf = mixture_gaussian_model.pdf(x_axis)
mixture_rvs = mixture_gaussian_model.rvs(10)
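The question also asks about weighting the "sub"-distributions differently and about automatic normalization. One possible extension of the MixtureModel above is sketched here. The class name `WeightedMixtureModel`, the `weights` and `random_state` parameters, and the `_cdf` method are assumptions added for illustration and are not part of the original answer. The weights are rescaled to sum to 1, and each sample is drawn only from the submodel selected for it.

```python
import numpy as np
from scipy.stats import expon, norm, rv_continuous


class WeightedMixtureModel(rv_continuous):
    """Hypothetical weighted variant of MixtureModel (illustrative only)."""

    def __init__(self, submodels, weights=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.submodels = list(submodels)
        if weights is None:
            weights = np.ones(len(self.submodels))
        weights = np.asarray(weights, dtype=float)
        # Normalize so the caller never has to divide by hand.
        self.weights = weights / weights.sum()

    def _pdf(self, x):
        # Weighted sum of the submodel densities.
        return sum(w * m.pdf(x) for w, m in zip(self.weights, self.submodels))

    def _cdf(self, x):
        # Weighted sum of the submodel CDFs.
        return sum(w * m.cdf(x) for w, m in zip(self.weights, self.submodels))

    def rvs(self, size=1, random_state=None):
        rng = np.random.default_rng(random_state)
        # Choose a submodel index for every draw according to the weights.
        choices = rng.choice(len(self.submodels), size=size, p=self.weights)
        samples = np.empty(size)
        for index, submodel in enumerate(self.submodels):
            mask = choices == index
            if mask.any():
                # Sample only as many values as this submodel was chosen for.
                samples[mask] = submodel.rvs(size=int(mask.sum()), random_state=rng)
        return samples


# A normal and an exponential component with unnormalized weights 3:1.
weighted_model = WeightedMixtureModel([norm(-3, 1), expon(scale=2)], weights=[3, 1])
weighted_rvs = weighted_model.rvs(size=2000, random_state=0)
weighted_pdf_at_zero = weighted_model.pdf(0.0)
```

Selecting by mask avoids drawing a full sample set from every submodel and sidesteps np.choose's cap on the number of choice arrays, at the cost of a Python loop over the submodels.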