
Generating truncated negative binomial distribution in python

I am trying to generate datasets that follow a truncated negative binomial distribution, that is, samples in which no value exceeds a given maximum.

def truncated_Nbinom(n, p, max_value, size):
    import scipy.stats as sct
    temp_size = size
    while True:
        temp_size *= 2
        temp = sct.nbinom.rvs(n, p, size=temp_size)
        truncated = temp[temp <= max_value]
        if len(truncated) >= size:
            return truncated[:size]

I get results when max_value and n are small. However, when I try:

input_1= truncated_Nbinom(99, 0.3, 99, 5000).tolist()

The kernel keeps dying. I tried changing the Python port and raising the recursion limit, but neither helped. Do you have any ideas for making my code faster?

Gözde Filiz asked Jun 14 '26 22:06



1 Answer

Here is one approach. Compute the probability of each value x under the negative binomial, keep only the values from 0 to max_value, and normalize those probabilities so that they sum to one. You can then simply call np.random.choice with these probabilities, so no draws have to be thrown away.

import numpy as np
import pandas as pd
from scipy import stats


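def truncated_Nbinom2(n, p, max_value, size):
    # Values the truncated distribution can take: 0..max_value.
    support = np.arange(max_value + 1)
    # Untruncated pmf on the support, renormalized to sum to one.
    probs = stats.nbinom.pmf(support, n, p)
    probs /= probs.sum()
    return np.random.choice(support, size=size, p=probs)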
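
For context on why the rejection loop in the question struggles with n=99, p=0.3, max_value=99: under scipy's parameterization the mean is n(1-p)/p = 231, so values at or below 99 sit far in the left tail and almost every draw is discarded while the batch size keeps doubling. A minimal sketch of that check, assuming the fraction kept equals the negative binomial CDF at max_value (the helper name acceptance_rate is made up for illustration and is not part of the original answer):

```python
from scipy import stats


def acceptance_rate(n, p, max_value):
    # Probability that one untruncated draw is <= max_value,
    # i.e. the fraction of samples the rejection loop would keep.
    return stats.nbinom.cdf(max_value, n, p)


# Parameters from the illustration and from the failing call in the question.
small = acceptance_rate(9, 0.3, 9)
large = acceptance_rate(99, 0.3, 99)
print(small, large)
```

The smaller this fraction, the more untruncated draws the loop needs before it collects `size` accepted values, which is likely what exhausts memory in the failing case.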

Here is an illustration:

arr1 = truncated_Nbinom(9, 0.3, 9, 50000)
arr2 = truncated_Nbinom2(9, 0.3, 9, 50000)

df_counts = pd.DataFrame({
    "version_1": pd.Series(arr1).value_counts(),
    "version_2": pd.Series(arr2).value_counts(),
})
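
As a rough sanity check on the second version, the sampled frequencies can also be compared with the renormalized pmf directly. The sketch below repeats truncated_Nbinom2 so that it runs on its own; the variable names observed, expected and max_abs_diff are illustrative, and the size of the gap will vary from run to run because the sample is random.

```python
import numpy as np
from scipy import stats


def truncated_Nbinom2(n, p, max_value, size):
    # Same approach as in the answer: renormalize the pmf on 0..max_value.
    support = np.arange(max_value + 1)
    probs = stats.nbinom.pmf(support, n, p)
    probs /= probs.sum()
    return np.random.choice(support, size=size, p=probs)


n, p, max_value, size = 9, 0.3, 9, 50000
sample = truncated_Nbinom2(n, p, max_value, size)

# Expected truncated probabilities for each value in 0..max_value.
support = np.arange(max_value + 1)
expected = stats.nbinom.pmf(support, n, p)
expected /= expected.sum()

# Observed relative frequencies; minlength keeps values that never occurred.
observed = np.bincount(sample, minlength=max_value + 1) / size

# Largest absolute gap between observed and expected frequencies.
max_abs_diff = np.abs(observed - expected).max()
print(max_abs_diff)
```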


hilberts_drinking_problem answered Jun 17 '26 12:06



