I am creating histograms of data organized in a DataFrame and grouped by day. On some days the data may be identically zero. In that case, when I plot the histogram with normed=True, I would expect a single bin centered at zero with a height of 1. Instead, the height equals the number of bins. How can I fix this? I want the histogram to represent a probability density function, so the maximum value should be 1.
Code sample and output:
import matplotlib.pyplot as plt
import numpy as np
plt.rcParams['figure.figsize'] = 10, 4
data = np.zeros(1000)
l = plt.hist(data, normed=True, bins=100)

EDIT: I have now seen that the normed property is deprecated. However, if I try to use density instead, I get the error AttributeError: Unknown property density
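The plot you see is correct: with normed=True it is the area under the histogram, not the height of the bars, that is normalized to 1, and this is indeed the case in your plot. Because all values are identical, the bins span the default range from -0.5 to 0.5, so each of the 100 bins is 0.01 wide. To highlight this, I draw a vertical line at x=0.01, and you will notice that the bar runs from 0 to 0.01. Since the height of the bar is 100, the area is 100 * 0.01 = 1.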
plt.rcParams['figure.figsize'] = 10, 4
data = np.zeros((1000))
l = plt.hist(data,normed = True, bins = 100)
plt.axvline(0.01, lw=1)
plt.ylim(0, 150)
The same happens if you use density=True:
l = plt.hist(data,density = True, bins = 100)
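The area argument above can be checked numerically. This is a minimal sketch that calls np.histogram directly (plt.hist relies on it to compute the bins) so that no figure is needed; the variable names are only illustrative.

```python
import numpy as np

data = np.zeros(1000)

# density=True divides each count by (total count * bin width)
density, edges = np.histogram(data, bins=100, density=True)

# Width of each bin; the 100 bins cover the default range [-0.5, 0.5]
widths = np.diff(edges)

# Area under the histogram: sum of height * width over all bins
area = float(np.sum(density * widths))

print("bin range:", edges[0], edges[-1])
print("bin width:", widths[0])          # about 0.01
print("tallest bar:", density.max())    # about 100
print("area:", area)                    # about 1
```

The tallest bar is about 100 only because the bins are 0.01 wide; the area is what stays at 1.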

Following jdehesa's suggestion, using bins of width 1 gives the result you expect:
l = plt.hist(data,density = True, bins=np.arange(-10, 11))
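Why this works: when every bin is 1 unit wide, the density in a bin equals the fraction of the sample in that bin. A small numerical sketch of that, again using np.histogram rather than plt.hist so that it runs without a display:

```python
import numpy as np

data = np.zeros(1000)
edges_in = np.arange(-10, 11)  # bin edges -10, -9, ..., 10 (width 1)

# Density per bin and raw counts per bin on the same edges
density, edges = np.histogram(data, bins=edges_in, density=True)
counts, _ = np.histogram(data, bins=edges_in)

# Fraction of the sample in each bin
fraction = counts / len(data)

print("tallest bar:", density.max())
print("density equals fraction:", np.allclose(density, fraction))
```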

Following DavidG's suggestion, based on this answer, gives you a height of 1, but the area is no longer normalized to 1 (with the default 10 bins, each 0.1 wide, the area is 0.1).
weights = np.ones_like(data)/float(len(data))
l = plt.hist(data,weights=weights)
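To see the trade-off of the weights approach in numbers, here is a small sketch using np.histogram with 10 bins, which is what plt.hist uses by default unless rcParams have been changed. The heights add up to 1, but the area does not.

```python
import numpy as np

data = np.zeros(1000)

# Each sample contributes 1/N, so bar heights are fractions of the sample
weights = np.ones_like(data) / len(data)
heights, edges = np.histogram(data, bins=10, weights=weights)

total_height = float(heights.sum())
area = float(np.sum(heights * np.diff(edges)))

print("sum of heights:", total_height)  # about 1
print("area:", area)                    # about 0.1, since each bin is 0.1 wide
```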

Finally, if you need both a height of 1 and an area normalized to 1, the bar must have a width of 1, which you can get with a single bin:
l = plt.hist(data, density=True, bins=1)
plt.xlim(-10, 10)

As others have explained, normed=True (or density=True in recent versions of Matplotlib) makes the area under the histogram equal to 1. You can get a histogram that represents the fraction of the sample falling in each bin like this:
import matplotlib.pyplot as plt
import numpy as np
data = np.zeros((1000))
# Compute histogram
hist, bins = np.histogram(data, density=True, bins=100)
# Width of each bin
bins_w = np.diff(bins)
# Compute proportion of sample in each bin
hist_p = hist * bins_w
# Plot histogram
plt.bar(bins[:-1], hist_p, width=bins_w, align='edge')

You could also make a histogram where each bin has a width of 1, but that is a more limited solution.
EDIT: As pointed out in other answers, this is basically equivalent to giving the proper weights parameter to plt.hist.
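A quick way to convince yourself of that equivalence is to compare the two computations on the same bins. This is only a sketch; the seeded random sample is an assumption made for illustration and is not part of the original question.

```python
import numpy as np

# Illustrative sample (assumption): 1000 draws from a standard normal
rng = np.random.default_rng(0)
data = rng.normal(size=1000)

# Route 1: density histogram multiplied by the bin widths
hist, bins = np.histogram(data, density=True, bins=100)
hist_p = hist * np.diff(bins)

# Route 2: weights of 1/N per sample on the same bin edges
weights = np.ones_like(data) / len(data)
hist_w, _ = np.histogram(data, bins=bins, weights=weights)

print("same proportions:", np.allclose(hist_p, hist_w))
print("proportions sum to:", hist_p.sum())
```

Passing the same weights to plt.hist should therefore draw the same bars as the plt.bar call above.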