Plotting a Gaussian fit to a histogram in seaborn displot/histplot function (NOT distplot)

I've decided to give seaborn version 0.11.0 a go! Playing around with the displot function which will replace distplot, as I understand it. I'm just trying to figure out how to plot a gaussian fit on to a histogram. Here's some example code.

import seaborn as sns
import numpy as np
from scipy.stats import norm  # needed for fit=norm below
x = np.random.normal(size=500) * 0.1

With distplot I could do:

sns.distplot(x, kde=False, fit=norm)

But how to go about it in displot or histplot?

UserR6 asked Oct 31 '20 11:10


2 Answers

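Sorry I'm late to the party. One option is to skip seaborn for this plot: fit the normal distribution with scipy's norm.fit, draw a density-normalized histogram with matplotlib, and plot the fitted PDF on top. Check whether this meets your requirement.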

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

data = np.random.normal(size=500) * 0.1
mu, std = norm.fit(data)

# Plot the histogram.
plt.hist(data, bins=25, density=True, alpha=0.6, color='g')

# Plot the PDF.
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 100)
p = norm.pdf(x, mu, std)
plt.plot(x, p, 'k', linewidth=2)
plt.show()

Regi Mathew answered Sep 27 '22 23:09


I really miss the fit parameter too. It doesn't appear that the functionality was replaced when distplot was deprecated. Until that hole is plugged, I use a short function that adds a normal distribution overlay to my histplot. I paste the function at the top of a file along with the imports, and then adding the overlay takes one line.

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

def normal(mean, std, color="black"):
    # Evaluate the normal PDF over +/- 4 standard deviations around the mean.
    x = np.linspace(mean-4*std, mean+4*std, 200)
    p = stats.norm.pdf(x, mean, std)
    plt.plot(x, p, color=color, linewidth=2)

data = np.random.normal(size=500) * 0.1
ax = sns.histplot(x=data, stat="density")
normal(data.mean(), data.std())
plt.show()

If you would rather use stat="probability" instead of stat="density", you can rescale the fit curve so that its peak lines up with the top of the histogram's y-axis, with something like this:

def normal(mean, std, histmax=None, color="black"):
    x = np.linspace(mean-4*std, mean+4*std, 200)
    p = stats.norm.pdf(x, mean, std)
    if histmax:
        # Rescale so the curve's peak equals histmax (a visual match only).
        p = p*histmax/max(p)
    plt.plot(x, p, color=color, linewidth=2)

data = np.random.normal(size=500) * 0.1
ax = sns.histplot(x=data, stat="probability")
# The top of the y-axis sits slightly above the tallest bar.
normal(data.mean(), data.std(), histmax=ax.get_ylim()[1])
plt.show()

ohtotasche answered Sep 27 '22 23:09