
Is there a random number distribution that obeys Benford's Law?

Python has a number of ways to generate different distributions of random numbers; see the documentation for the random module. Unfortunately, they aren't easy to understand without the appropriate math background, especially when it comes to the required parameters.

I'd like to know if any of those methods can produce random numbers with a distribution that obeys Benford's Law, and what parameter values are appropriate. Namely, in a population of integers, the first digit should be '1' about 30% of the time, '2' about 18% of the time, etc.
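For reference, the target percentages can be computed directly. The sketch below assumes the usual statement of Benford's Law, P(d) = log10(1 + 1/d) for leading digit d; the function name `benford_expected` is purely illustrative and not part of any library.

```python
import math

def benford_expected():
    """Return {digit: probability} for leading digits 1..9,
    assuming Benford's Law in the form P(d) = log10(1 + 1/d)."""
    # 1.0 / d keeps the division floating-point on Python 2 as well.
    return {d: math.log10(1 + 1.0 / d) for d in range(1, 10)}

for digit, p in sorted(benford_expected().items()):
    print("%d: %.1f%%" % (digit, 100 * p))
```

The nine probabilities sum to 1, because the logarithms telescope to log10(10).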


Using John Dvorak's answer, I put together the following code, and it appears to work perfectly.

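import math
import random

def benfords_range_gen(stop, n):
    """ A generator that returns n random integers
    between 1 and stop-1 and whose distribution
    meets Benford's Law i.e. is logarithmic.
    """
    # log(stop) scales a uniform [0, 1) sample onto [0, log(stop))
    multiplier = math.log(stop)
    for _ in range(n):
        # exp() of that sample is log-uniform on [1, stop); int() truncates
        yield int(math.exp(multiplier * random.random()))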

>>> from collections import Counter
>>> Counter(str(i)[0] for i in benfords_range_gen(10000, 1000000))
Counter({'1': 300696, '2': 176142, '3': 124577, '4': 96756, '5': 79260, '6': 67413, '7': 58052, '8': 51308, '9': 45796})

A question has also arisen about whether this works consistently between different versions of Python. That's not a trivial question to answer, because of the nature of random numbers: you expect some variation from run to run, and sometimes between different versions of the random library. The only way to avoid that is to seed the random number generator with the same value before every run. I've added that to my test and I get the exact same results in Python 2.7.1, 3.8.6, and 3.9.1.

>>> random.seed(7919)
>>> Counter(str(i)[0] for i in benfords_range_gen(10000, 1000000))
Counter({'1': 301032, '2': 176404, '3': 125350, '4': 96503, '5': 78450, '6': 67198, '7': 58000, '8': 51342, '9': 45721})
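As a rough check of how closely that seeded run tracks the law, the counts above can be compared against the theoretical proportions log10(1 + 1/d). This is only a sketch: the `observed` dictionary is copied by hand from the output above, and the helper name `max_deviation` is made up for illustration.

```python
import math

# Counts copied from the seeded run above (seed 7919, one million samples).
observed = {'1': 301032, '2': 176404, '3': 125350, '4': 96503, '5': 78450,
            '6': 67198, '7': 58000, '8': 51342, '9': 45721}

def max_deviation(counts):
    """Largest absolute gap between an observed first-digit proportion
    and the Benford proportion log10(1 + 1/d)."""
    total = float(sum(counts.values()))
    return max(
        abs(counts[str(d)] / total - math.log10(1 + 1.0 / d))
        for d in range(1, 10)
    )

print(max_deviation(observed))
```

For these counts, every digit's proportion lands within 0.001 of the theoretical value.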
Mark Ransom asked Jan 28 '13 06:01



1 Answer

Benford's law describes the distribution of the first digits of a set of numbers when the numbers are spread over a wide range on a logarithmic scale. A log-uniform distribution over exactly one decade obeys the law as well: if U is uniform on [0, 1), then 10^U has leading digit d exactly when log10(d) <= U < log10(d + 1), an interval of width log10(1 + 1/d), which is the Benford probability for d.

This will produce the desired distribution: math.floor(10**random.random())
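A minimal sketch of how that one-liner might be wrapped and checked empirically. The function names, the `seed` parameter, and the sample size are assumptions for illustration; `int()` is used in place of `math.floor()` so the result is an integer on Python 2 as well.

```python
import math
import random
from collections import Counter

def benford_digit(rng=random):
    """Return a digit 1..9 computed as floor(10**u), with u uniform in [0, 1)."""
    return int(10 ** rng.random())

def digit_frequencies(n, seed=None):
    """Draw n digits and return {digit: observed proportion}."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    counts = Counter(benford_digit(rng) for _ in range(n))
    return {d: counts[d] / float(n) for d in range(1, 10)}

freqs = digit_frequencies(100000, seed=12345)
for d in range(1, 10):
    # observed proportion next to the theoretical log10(1 + 1/d)
    print(d, round(freqs[d], 4), round(math.log10(1 + 1.0 / d), 4))
```

Because `random.random()` never returns 1.0, `10 ** u` stays below 10 and the result is always a single digit from 1 to 9.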

John Dvorak answered Oct 15 '22 11:10
