Plotting a histogram on a log scale with Matplotlib

I have a pandas DataFrame that has the following values in a Series

x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1] 

I was instructed to plot two histograms in a Jupyter notebook with Python 3.6. No sweat, right?

    x.plot.hist(bins=8)
    plt.show()

I chose 8 bins because that looked best to me. I have also been instructed to plot another histogram with the log of x.

    x.plot.hist(bins=8)
    plt.xscale('log')
    plt.show()

This histogram looks TERRIBLE. Am I not doing something right? I've tried fiddling around with the plot, but everything I've tried just seems to make the histogram look even worse. Example:

x.plot(kind='hist', logx=True) 

I was not given any instructions other than plot the log of X as a histogram.

I really appreciate any help!!!

For the record, I have imported pandas, numpy, and matplotlib and specified that the plot should be inline.

1 Answer

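Specifying bins=8 in the hist call divides the range between the minimum and maximum value into 8 bins of equal width. Bins that are equally wide on a linear scale are not equally wide on a log scale: the first bin stretches across most of the axis, and the remaining bins are squeezed together at the right-hand end.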
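To see the distortion in numbers, here is a minimal sketch that only uses NumPy. It assumes the same data as in the question and uses np.histogram_bin_edges (available in NumPy 1.15 and later) to reproduce the edges that bins=8 would give, then measures how wide each bin is on a log10 axis.

```python
import numpy as np

x = np.array([2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27,
              22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1])

# Edges of 8 equal-width bins between min(x) and max(x),
# i.e. the edges a call with bins=8 would use.
edges = np.histogram_bin_edges(x, bins=8)

# Width of each bin on a linear axis (all identical).
linear_widths = np.diff(edges)

# Width of each bin as drawn on a log10 axis (shrinks from left to right).
log_widths = np.diff(np.log10(edges))

print(linear_widths.round(3))
print(log_widths.round(3))
```

The first bin alone takes up more of the log axis than the other seven combined, which is why the plot looks so lopsided.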

What you could do is specify the bin edges of the histogram yourself, spaced so that the bins are unequal in width on a linear scale but look equal on a logarithmic scale.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7,
         19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1]
    x = pd.Series(x)

    # histogram on linear scale
    plt.subplot(211)
    hist, bins, _ = plt.hist(x, bins=8)

    # histogram on log scale.
    # Use non-equal bin sizes, such that they look equal on log scale.
    logbins = np.logspace(np.log10(bins[0]), np.log10(bins[-1]), len(bins))
    plt.subplot(212)
    plt.hist(x, bins=logbins)
    plt.xscale('log')
    plt.show()
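As a variation, the same kind of edges can be built with np.geomspace, which takes the start and stop values directly instead of their logarithms. This is a sketch under the assumption that all values are strictly positive (true for the data in the question); np.histogram is used here only to inspect the edges and counts without drawing anything.

```python
import numpy as np

x = np.array([2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27,
              22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1])

n_bins = 8

# n_bins bins need n_bins + 1 edges; geomspace spaces them evenly in log space.
# Assumes x.min() > 0, since a log scale cannot represent zero or negatives.
logbins = np.geomspace(x.min(), x.max(), n_bins + 1)

# Count how many values fall into each log-spaced bin.
counts, _ = np.histogram(x, bins=logbins)

# Each bin has the same width when measured in log10 units.
log_widths = np.diff(np.log10(logbins))

print(logbins.round(2))
print(counts)
print(log_widths.round(3))
```

The resulting logbins array can be passed as bins=logbins to plt.hist, followed by plt.xscale('log'), in the same way as in the code above.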


