Is there a way to tell matplotlib to "normalize" a histogram such that its area equals a specified value (other than 1)?
The option "normed = 0" in
n, bins, patches = plt.hist(x, 50, normed=0, histtype='stepfilled')
just brings it back to a frequency distribution.
To normalize a histogram in Matplotlib, pass the density keyword argument to hist() and set it to True. The heights are then scaled so that the total area of the bars (height times bin width, summed over all bins) equals 1.
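As a quick check of the area claim, here is a minimal sketch using np.histogram, which accepts the same density argument as plt.hist. The sample data and the bin count of 50 are arbitrary choices for illustration.

```python
import numpy as np

# Illustrative data; any 1-D sample would do.
rng = np.random.default_rng(0)
x = rng.random(1000)

# density=True scales the heights so that sum(height * width) == 1.
heights, edges = np.histogram(x, bins=50, density=True)
area = float((heights * np.diff(edges)).sum())
print(area)
```

Note that it is the area that equals 1, not the sum of the bar heights; the two only coincide when every bin has width 1.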
Note that MinMaxScaler() from scikit-learn normalizes in a different sense: it rescales the data values themselves into a feature range, which is (0, 1) by default and can be changed. It does not control the area of a histogram.
Just calculate the histogram yourself with np.histogram, scale it to any value you'd like, then use bar to plot it.
On a side note, this will normalize things such that the area of all the bars is normed_value. The raw sum will not be normed_value (though it's easy to have that be the case, if you'd like).
E.g.
import numpy as np
import matplotlib.pyplot as plt
x = np.random.random(100)
normed_value = 2
hist, bins = np.histogram(x, bins=20, density=True)
widths = np.diff(bins)
hist *= normed_value
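# bins[:-1] holds the left bin edges, so align each bar on its edge
# (plt.bar centers bars on x by default in Matplotlib 2.0 and later).
plt.bar(bins[:-1], hist, widths, align='edge')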
plt.show()
So, in this case, if we were to integrate the bins (sum the height multiplied by the width), we'd get 2.0 instead of 1.0 (i.e. (hist * widths).sum() will yield 2.0).
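If you would rather have the raw sum of the bar heights equal normed_value, as mentioned in the side note, one possible sketch is to start from plain counts and rescale them. The variable names heights and total are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(100)
normed_value = 2

# Plain counts per bin (no density scaling).
counts, bins = np.histogram(x, bins=20)

# Rescale so the heights themselves, not the area, sum to normed_value.
heights = counts * (normed_value / counts.sum())
total = float(heights.sum())
print(total)
```

The result can be plotted with plt.bar in the same way as the area-normalized version.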
You can pass a weights argument to hist instead of using normed (called density in current Matplotlib versions). For example, if your bins cover the interval [minval, maxval], you have n bins, and you want to normalize the area to A, then I think
weights = np.empty_like(x)
weights.fill(A * n / (maxval-minval) / x.size)
plt.hist(x, bins=n, range=(minval, maxval), weights=weights)
should do the trick.
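To check the weights formula, here is a small sketch that uses np.histogram, which takes the same bins, range and weights arguments as plt.hist. The values of A, n, minval and maxval are arbitrary, and the data is assumed to lie entirely inside [minval, maxval].

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(500)          # all values fall in [0, 1)
A = 3.0                      # target area (illustrative)
n = 25                       # number of bins (illustrative)
minval, maxval = 0.0, 1.0

# Each sample contributes A * n / (maxval - minval) / x.size to its bin.
weights = np.full(x.shape, A * n / (maxval - minval) / x.size)
heights, edges = np.histogram(x, bins=n, range=(minval, maxval), weights=weights)

area = float((heights * np.diff(edges)).sum())
print(area)
```

The formula works because every bin has width (maxval - minval) / n, so the weights of all x.size samples together contribute a total area of A.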
EDIT: The weights argument must be the same size as x, and its effect is to make each value in x contribute the corresponding value in weights towards the bin count, instead of 1.
I think the hist function could probably do with a greater ability to control normalization, though. For example, I think as it stands, values outside the binned range are ignored when normalizing, which isn't generally what you want.
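The point about out-of-range values can be illustrated with np.histogram, on the assumption that plt.hist delegates its binning to it. The tiny data set below is made up so that two of the six values fall outside the binned range.

```python
import numpy as np

# Four values inside (0, 1) and two outside it.
x = np.array([0.1, 0.2, 0.3, 0.4, 5.0, 6.0])

# density=True normalizes by the in-range samples only, so the area is still 1.
heights, edges = np.histogram(x, bins=4, range=(0.0, 1.0), density=True)
area_density = float((heights * np.diff(edges)).sum())

# With weights based on x.size, out-of-range samples still count in the
# denominator, so the plotted area shrinks to the in-range fraction times A.
A, n = 1.0, 4
weights = np.full(x.shape, A * n / (1.0 - 0.0) / x.size)
w_heights, w_edges = np.histogram(x, bins=n, range=(0.0, 1.0), weights=weights)
area_weights = float((w_heights * np.diff(w_edges)).sum())

print(area_density, area_weights)
```

Which behaviour you want depends on whether the area should represent the whole sample or only the part that falls inside the bins.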