I'm experimenting with fitting a power law to empirical data using the powerlaw module. I have created the following data that follows a power law distribution of exponent 2:
x = range(1,1000)
y = []
for i in x:
    y.append(i**(-2))
I'm expecting the fitted power law to have an exponent of 2. However, the resulting exponent deviates substantially from the theoretical value:
import powerlaw

fitted_pl = powerlaw.Fit(y)
fitted_pl.alpha
Out[115]: 1.4017584065981563
Could you please advise why this happens, or point out what I've done wrong here?
Thank you for your kind answer!
If the goal is to fit a curve through these (x, y) points, that is a regression problem, and it can be done with SciPy: the open source library provides the curve_fit() function for curve fitting via nonlinear least squares. The function takes the input and output data as arguments, along with the mapping function to fit.
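As a minimal sketch of that approach (assuming NumPy and SciPy are installed; the mapping function power_law below is a name I introduce here for illustration), fitting the question's deterministic data by least squares does recover an exponent close to 2:

import numpy as np
from scipy.optimize import curve_fit

def power_law(x, a, k):
    # mapping function y = a * x**(-k); a and k are the parameters to fit
    return a * np.power(x, -k)

x = np.arange(1, 1000)
y = x ** (-2.0)

params, _ = curve_fit(power_law, x, y, p0=[1.0, 1.0])
print(params)  # should come out close to [1.0, 2.0] for this noiseless data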
The power law describes phenomena in which a small number of items account for a disproportionately large share of some quantity, with the rest clustered at the bottom of the distribution. In other words, small occurrences are common, while large occurrences are rare.
As @DSM pointed out, the powerlaw module fits an exponent to values drawn (or generated) from a power law distribution; it does not fit a regression to (x, y) pairs. To help anyone with similar confusion, below is how one can verify the exponent fitting:
## use a proper power law random number generator (or code your own)
from networkx.utils import powerlaw_sequence

pl_sequence = powerlaw_sequence(1000, exponent=2.5)

fitted_pl = powerlaw.Fit(pl_sequence)
fitted_pl.alpha
Out[73]: 2.4709012785346314  ## close enough to the true exponent of 2.5
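For the "code your own" option, here is a minimal sketch using inverse transform sampling with NumPy (the xmin value, sample size, and seed are arbitrary choices for illustration, not anything prescribed by the powerlaw package):

import numpy as np
import powerlaw

rng = np.random.default_rng(0)
exponent = 2.5
xmin = 1.0

# For a continuous power law p(x) ~ x**(-exponent) with x >= xmin,
# inverting the CDF gives x = xmin * (1 - u)**(-1 / (exponent - 1)).
u = rng.uniform(size=10_000)
samples = xmin * (1.0 - u) ** (-1.0 / (exponent - 1.0))

fitted = powerlaw.Fit(samples)
print(fitted.alpha)  # should again land near 2.5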