Un-normalized Gaussian curve on histogram

My data looks Gaussian when plotted as a histogram. I want to plot a Gaussian curve on top of the histogram to see how well the data matches one. I am using pyplot from matplotlib, and I do NOT want to normalize the histogram. I can already do the fit against a normalized histogram, but I am looking for an un-normalized fit. Does anyone here know how to do it?

Thanks! Abhinav Kumar

asked Jul 22 '13 03:07

Abhinav Kumar

2 Answers

As an example:

import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize

# Draw a sample from a standard normal distribution and histogram it (raw counts)
samples = np.random.standard_normal(10000)
counts, edges, _ = plt.hist(samples, bins=100)

# Equation for an un-normalized Gaussian: amplitude a, centre b, width c
def f(x, a, b, c):
    return a * np.exp(-(x - b)**2.0 / (2 * c**2))

# Turn the bins into a set of points: bin centres as x, bin counts as y
x = 0.5 * (edges[:-1] + edges[1:])
y = counts

popt, pcov = optimize.curve_fit(f, x, y)

x_fit = np.linspace(x[0], x[-1], 100)
y_fit = f(x_fit, *popt)

plt.plot(x_fit, y_fit, lw=4, color="r")
plt.show()

[Image: histogram of the sample with the fitted Gaussian drawn in red]

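This fits a Gaussian curve to the histogram counts. You can use pcov, the covariance matrix of the fitted parameters, to put a number on how good the fit is: the square roots of its diagonal entries are the one-standard-deviation uncertainties on a, b and c.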
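As a minimal sketch of reading uncertainties out of pcov: the snippet below repeats the fit without plotting, using np.histogram to get the counts and bin edges. The fixed seed, the name `gaussian` and the starting values `p0` are assumptions added for illustration, not part of the answer above.

```python
import numpy as np
from scipy import optimize

def gaussian(x, a, b, c):
    """Un-normalized Gaussian: amplitude a, centre b, width c."""
    return a * np.exp(-(x - b) ** 2 / (2 * c ** 2))

rng = np.random.default_rng(0)  # fixed seed, assumed here for reproducibility
samples = rng.standard_normal(10000)

# Raw (un-normalized) counts and bin edges, without drawing anything
counts, edges = np.histogram(samples, bins=100)
centres = 0.5 * (edges[:-1] + edges[1:])

# Starting values taken from the data itself (an assumption, curve_fit also has defaults)
p0 = [counts.max(), samples.mean(), samples.std()]
popt, pcov = optimize.curve_fit(gaussian, centres, counts, p0=p0)

# One-standard-deviation uncertainties on a, b and c
perr = np.sqrt(np.diag(pcov))
for name, value, err in zip("abc", popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

Because no per-bin errors are passed through `sigma`, curve_fit rescales pcov using the scatter of the residuals, so these numbers describe how tightly the parameters are pinned down rather than giving a pass/fail verdict on the Gaussian shape.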

A better way to test how well your data follows a Gaussian, or any other distribution, is the Pearson chi-squared test. It takes some practice to understand, but it is a very powerful tool.
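One way such a test could look, as a rough sketch rather than a definitive recipe: compare the observed bin counts with the counts a fitted normal distribution predicts, using scipy.stats.chisquare. The binning (20 inner bins across plus or minus 3 standard deviations, with two open-ended tail bins) and the seed are assumptions chosen so that every expected count stays comfortably above 5.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # fixed seed, assumed here for reproducibility
samples = rng.standard_normal(10000)

# Estimate the two parameters of the normal distribution from the data
mean, sd = stats.norm.fit(samples)

# Inner bin edges; everything below the first edge and above the last
# edge goes into two open-ended tail bins
inner = np.linspace(mean - 3 * sd, mean + 3 * sd, 21)

# Observed counts per bin (22 bins: 20 inner bins plus 2 tails)
idx = np.searchsorted(inner, samples)
observed = np.bincount(idx, minlength=len(inner) + 1)

# Expected counts per bin from the fitted normal CDF
cdf = stats.norm.cdf(inner, loc=mean, scale=sd)
probs = np.diff(np.concatenate(([0.0], cdf, [1.0])))
expected = len(samples) * probs

# ddof=2 accounts for the two parameters estimated from the data
chi2, p = stats.chisquare(observed, expected, ddof=2)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

A small p-value would suggest the counts are unlikely under the fitted Gaussian. The result depends on the binning, so it is worth treating the p-value as approximate.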

answered Sep 28 '22 08:09

Greg

An old post I know, but wanted to contribute my code for doing this, which simply does the 'fix by area' trick:

from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt

def PlotHistNorm(data, log=False):
    # Distribution fitting: norm.fit returns the fitted mean and standard deviation
    mean, sd = norm.fit(data)

    # Set large limits
    xlims = [-6*sd+mean, 6*sd+mean]

    # Plot histogram (raw counts); hist returns the counts and the bin edges
    counts, edges, _ = plt.hist(data, bins=12, alpha=.3, log=log)

    # Generate X points
    x = np.linspace(xlims[0], xlims[1], 500)

    # Get Y points via Normal PDF with fitted parameters
    pdf_fitted = norm.pdf(x, loc=mean, scale=sd)

    # Get bin width from two adjacent bin edges (the bins are equally wide)
    binwidth = edges[1] - edges[0]

    # Scale the fitted PDF by area of the histogram
    pdf_fitted = pdf_fitted * (len(data) * binwidth)

    # Plot PDF
    plt.plot(x, pdf_fitted, 'r-')
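
The scaling factor in the function above can be checked numerically. A count histogram with N entries and equal bin width w has total area N * w, while a normal PDF has area 1, so multiplying the PDF by N * w should give a curve with the same area as the histogram. The sketch below verifies this without plotting; the sample (mean 5, standard deviation 2, 5000 points) and the seed are made up for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)  # fixed seed, assumed here for reproducibility
data = rng.normal(loc=5.0, scale=2.0, size=5000)  # made-up example data

# Raw counts and edges, as plt.hist(data, bins=12) would produce
counts, edges = np.histogram(data, bins=12)
binwidth = edges[1] - edges[0]

# Area of the un-normalized histogram: sum of (count * bin width)
hist_area = counts.sum() * binwidth

# Fitted normal PDF scaled by N * binwidth
mean, sd = norm.fit(data)
x = np.linspace(mean - 6 * sd, mean + 6 * sd, 2001)
curve = norm.pdf(x, loc=mean, scale=sd) * len(data) * binwidth

# Area under the scaled curve by the trapezoidal rule
dx = x[1] - x[0]
curve_area = np.sum(0.5 * (curve[:-1] + curve[1:])) * dx

print(f"histogram area = {hist_area:.2f}, curve area = {curve_area:.2f}")
```

The two areas should agree closely, which is what makes the scaled curve sit on top of the un-normalized bars.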

answered Sep 28 '22 09:09

Colin O'Flynn

Tags: python, matplotlib, histogram, gaussian
