Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Scaled logarithmic binning in python

I'm interested in plotting the probability distribution of a set of points which are distributed as a power law. Further, I would like to use logarithmic binning to be able to smooth out the large fluctuations in the tail. If I just use logarithmic binning, and plot it on a log log scale, such as

pl.hist(MyList,log=True, bins=pl.logspace(0,3,50))
pl.xscale('log')

for example, then the problem is that the larger bins account for more points, i.e. the heights of my bins are not scaled by bin size.

Is there a way to use logarithmic binning, and yet make python scale all the heights by the size of the bin? I know I can probably do this in some roundabout fashion manually, but it seems like this should be a feature that exists, but I can't seem to find it. If you think histograms are fundamentally a bad way to represent my data and you have a better idea, then I'd love to hear that too.

Thanks!

like image 449
SarthakC Avatar asked May 11 '16 18:05

SarthakC


1 Answers

Matplotlib won't help you much if you have special requirements of your histograms. You can, however, easily create and manipulate a histogram with numpy.

import numpy as np
from matplotlib import pyplot as plt

# something random to plot
data = (np.random.random(10000)*10)**3

# log-scaled bins
bins = np.logspace(0, 3, 50)
widths = (bins[1:] - bins[:-1])

# Calculate histogram
hist = np.histogram(data, bins=bins)
# normalize by bin width
hist_norm = hist[0]/widths

# plot it!
plt.bar(bins[:-1], hist_norm, widths)
plt.xscale('log')
plt.yscale('log')

Obviously when you do present your data in a non-obvious way like this, you have to be very careful about how to label your y axis properly and write an informative figure caption.

like image 132
tjollans Avatar answered Nov 13 '22 00:11

tjollans