
Is there a parameter in matplotlib/pandas to have the Y axis of a histogram as percentage?

I would like to compare two histograms by having the Y axis show the percentage of each column from the overall dataset size instead of an absolute value. Is that possible? I am using Pandas and matplotlib. Thanks

d1337 asked Jul 26 '13 06:07



1 Answer

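Passing density=True (normed=True for matplotlib < 2.2.0) returns a histogram for which np.sum(pdf * np.diff(bins)) equals 1, that is, the bar areas sum to 1 rather than the bar heights. If you want the heights themselves to sum to 1, so that each bar shows the fraction of the dataset in that bin, you can use NumPy's histogram() and normalize the results yourself.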

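    import numpy as np
    import matplotlib.pyplot as plt

    x = np.random.randn(30)

    fig, ax = plt.subplots(1, 2, figsize=(10, 4))

    # Left: density normalization, so the bar areas sum to 1
    ax[0].hist(x, density=True, color='grey')

    # Right: divide the counts by their total, so the bar heights sum to 1
    hist, bins = np.histogram(x)
    ax[1].bar(bins[:-1], hist.astype(np.float32) / hist.sum(),
              width=(bins[1] - bins[0]), align='edge', color='grey')

    ax[0].set_title('density=True')
    ax[1].set_title('hist = hist / hist.sum()')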

[Figure: the two resulting histograms side by side, density-normalized on the left and normalized by hist.sum() on the right]

By the way, there is a strange plotting glitch in the first bin of the left plot.
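As an alternative that is not part of the approach above, the same fraction-per-bin normalization can probably be obtained directly from hist() through its weights argument, and the Y axis can then be labelled in percent with matplotlib.ticker.PercentFormatter. This is only a minimal sketch: the seeded generator, the choice of 10 bins, and the axis label are assumptions made for illustration, and PercentFormatter needs a reasonably recent matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

# Sample data; the seed is an assumption, used only for reproducibility
rng = np.random.default_rng(0)
x = rng.standard_normal(30)

# Weight every sample by 1/N so each bar's height is the fraction of
# the dataset that falls in that bin; the heights then sum to 1.
weights = np.ones_like(x) / len(x)

fig, ax = plt.subplots(figsize=(5, 4))
heights, bins, _ = ax.hist(x, bins=10, weights=weights, color='grey')

# Display the fractions as percentages on the Y axis (1.0 -> 100%)
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1))
ax.set_ylabel('share of samples')
```

Call plt.show() or fig.savefig(...) afterwards to view the result. To compare two datasets of different sizes, compute a separate weights array for each one so that both histograms are expressed as a share of their own total.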

Rutger Kassies answered Sep 29 '22 10:09
