Plot a histogram such that the total area of the histogram equals 1 (density)

This is a follow-up question to this answer. I'm trying to plot a normed histogram, but instead of getting 1 as the maximum value on the y axis, I'm getting different numbers.

For array k=(1,4,3,1)

    import numpy as np
    import matplotlib.pyplot as plt

    def plotGraph():
        k = (1, 4, 3, 1)
        plt.hist(k, normed=1)
        plt.xticks(np.arange(10))  # 10 ticks on x axis
        plt.show()

    plotGraph()

I get this histogram, which doesn't look normed.

For a different array k=(3,3,3,3)

    import numpy as np
    import matplotlib.pyplot as plt

    def plotGraph():
        k = (3, 3, 3, 3)
        plt.hist(k, normed=1)
        plt.xticks(np.arange(10))  # 10 ticks on x axis
        plt.show()

    plotGraph()

I get this histogram, whose maximum y-value is 10.

For different k I get a different maximum y-value, even though normed=1 (or normed=True) is set.

Why does the normalization (if it works at all) change with the data, and how can I make the maximum value of y equal to 1?

UPDATE:

I am trying to implement Carsten König's answer from "plotting histograms whose bar heights sum to 1 in matplotlib" and am getting a very weird result:

    import numpy as np
    import matplotlib.pyplot as plt

    def plotGraph():
        k = (1, 4, 3, 1)
        weights = np.ones_like(k)/len(k)
        plt.hist(k, weights=weights)
        plt.xticks(np.arange(10))  # 10 ticks on x axis
        plt.show()

    plotGraph()

Result:

What am I doing wrong?

Asked Mar 07 '14 at 04:03 by user40


2 Answers

In a normalized histogram it is the total area of the bars that equals 1, not the height of the tallest bar.

    In [44]:
    import matplotlib.pyplot as plt
    from numpy import *
    k = (3, 3, 3, 3)
    x, bins, p = plt.hist(k, density=True)  # used to be normed=True in older versions
    plt.xticks(arange(10))  # 10 ticks on x axis
    plt.show()

    In [45]:
    print(bins)
    [ 2.5  2.6  2.7  2.8  2.9  3.   3.1  3.2  3.3  3.4  3.5]

In this example the bin width is 0.1 and the single occupied bin has height 10, so the area under the histogram is 0.1*10 = 1.
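
The width-times-height arithmetic can be checked numerically. This is a minimal sketch that assumes np.histogram, which (as far as I know) is what plt.hist relies on to compute bins and densities; the variable names are only illustrative.

```python
import numpy as np

k = (3, 3, 3, 3)

# density=True rescales the counts so that sum(height * bin_width) is 1
heights, bins = np.histogram(k, bins=10, density=True)
widths = np.diff(bins)            # every bin is about 0.1 wide here

area = np.sum(heights * widths)   # total area of all bars
tallest = heights.max()           # height of the single occupied bin

print(area, tallest)
```

The area comes out as 1 while the tallest bar is about 10, which matches the maximum y-value reported in the question.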

x stores the height of each bin, and p stores the individual bar objects (they are actually patches). So we just sum up x and rescale the height of each patch by that sum.

To make the heights sum to 1 instead, add the following before plt.show():

    for item in p:
        item.set_height(item.get_height()/sum(x))

Answered Oct 20 '22 at 10:10 by CT Zhu


You could use the solution outlined here:

    weights = np.ones_like(myarray)/float(len(myarray))
    plt.hist(myarray, weights=weights)
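
A self-contained sketch of the same idea, using the array from the question. It assumes np.histogram, which takes a weights argument like plt.hist does, so the bar heights can be inspected without drawing anything. The explicit float(...) matters on Python 2, where dividing an integer array by an integer length truncates every weight to 0; that is presumably what produced the odd plot in the question's update.

```python
import numpy as np

k = np.array([1, 4, 3, 1])

# Each sample contributes 1/N instead of 1, so the bar heights add up to 1.
weights = np.ones_like(k) / float(len(k))

heights, bins = np.histogram(k, bins=10, weights=weights)

print(heights)        # fraction of the samples that falls in each bin
print(heights.sum())  # total of all bar heights
```

Passing the same weights to plt.hist(k, weights=weights) should draw bars with these heights.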

Answered Oct 20 '22 at 10:10 by upceric