I'm interested in plotting the probability distribution of a set of points that follow a power law. I would also like to use logarithmic binning to smooth out the large fluctuations in the tail. If I just use logarithmic binning and plot the result on a log-log scale, such as
pl.hist(MyList,log=True, bins=pl.logspace(0,3,50))
pl.xscale('log')
for example, then the problem is that the wider bins collect more points, i.e. the heights of my bars are not scaled by bin width.
Is there a way to use logarithmic binning and still have Python scale each height by the width of its bin? I know I can probably do this manually in some roundabout fashion, but it seems like a feature that should exist, and I can't find it. If you think histograms are fundamentally a bad way to represent my data and you have a better idea, I'd love to hear that too.
Thanks!
Matplotlib won't help you much if you have special requirements for your histograms. You can, however, easily create and manipulate a histogram with NumPy.
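import numpy as np
from matplotlib import pyplot as plt
# something random to plot
data = (np.random.random(10000)*10)**3
# log-scaled bins: 50 edges from 10**0 to 10**3 (values below 1 fall outside and are ignored)
bins = np.logspace(0, 3, 50)
widths = bins[1:] - bins[:-1]
# calculate histogram; np.histogram returns (counts, bin_edges)
counts, _ = np.histogram(data, bins=bins)
# normalize by bin width
hist_norm = counts / widths
# plot it! align='edge' places each bar's left side on its left bin edge
plt.bar(bins[:-1], hist_norm, widths, align='edge')
plt.xscale('log')
plt.yscale('log')
plt.show()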
When you present your data in a non-obvious way like this, you have to be very careful to label your y axis properly (here it is counts per unit bin width, not raw counts) and to write an informative figure caption.
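If what you want on the y axis is a probability density rather than counts per unit width, one possible variation is to let `np.histogram` do the normalization with `density=True`, which divides each count by the total count times the bin width. The sketch below is only an illustration: the helper name `log_binned_density`, the choice of bin range from the data's minimum to its maximum, and the use of geometric bin centers are my assumptions, not part of the original answer.

```python
import numpy as np

def log_binned_density(data, num_bins=50):
    """Hypothetical helper: probability density on log-spaced bins.

    Returns (centers, density, edges), where centers are the geometric
    means of adjacent bin edges.
    """
    data = np.asarray(data, dtype=float)
    data = data[data > 0]  # log-spaced bins need strictly positive values
    edges = np.logspace(np.log10(data.min()), np.log10(data.max()), num_bins + 1)
    # Pin the outer edges so rounding in logspace cannot exclude the extremes
    edges[0], edges[-1] = data.min(), data.max()
    # density=True divides each count by (total count * bin width)
    density, edges = np.histogram(data, bins=edges, density=True)
    centers = np.sqrt(edges[:-1] * edges[1:])  # geometric mean of each bin's edges
    return centers, density, edges

# Example with the same kind of random data as above
rng = np.random.default_rng(0)
data = (rng.random(10000) * 10) ** 3
centers, density, edges = log_binned_density(data, num_bins=30)
```

The result can then be drawn as points instead of bars, for instance with `plt.loglog(centers, density, 'o')`, which is a common way to show a power-law tail. Empty bins have zero density and will not appear on a log axis.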