I would like to add a density plot to my histogram. I know a little about the pdf function, but I got confused, and other similar questions were not helpful.
from scipy.stats import *
from numpy import *
from matplotlib.pyplot import *
from random import *
nums = []
N = 100
for i in range(N):
    a = randint(0, 9)  # random.randint includes both endpoints
    nums.append(a)
bars = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
alpha, loc, beta = 5, 100, 22
hist(nums, density=True, bins=bars)  # density= replaces the removed normed= argument
show()
I'm looking for something like this
A kernel density estimate (KDE) plot is a method for visualizing the distribution of observations in a dataset, analogous to a histogram. KDE represents the data using a continuous probability density curve in one or more dimensions.
A kdeplot is a kernel density estimate plot. It depicts an estimate of the probability density function of continuous data without assuming a parametric form, and it can be drawn for a single variable or for several variables together. The Python seaborn module provides kdeplot() with a range of options.
Kernel density estimation is a non-parametric way to estimate the distribution of a variable. In seaborn, jointplot() can also draw a KDE: pass 'kde' to the kind parameter.
A histogram places every sample that lies between the boundaries of a bin into that bin. It does not distinguish whether the value falls close to the left edge, the right edge, or the center of the bin. A KDE plot, on the other hand, takes each individual sample value and draws a small Gaussian bell curve over it; the curves are then added up to give the density estimate.
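To make the "one small bell curve per sample" idea concrete, here is a minimal sketch of a Gaussian KDE written with plain numpy. The function name gaussian_kde_manual and the default bandwidth (Scott's rule of thumb, the sample standard deviation times n^(-1/5)) are assumptions chosen for illustration, not something taken from the answers.

```python
import numpy as np

def gaussian_kde_manual(samples, grid, bandwidth=None):
    """Average one Gaussian bump per sample, evaluated on grid (illustrative helper)."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    if bandwidth is None:
        # Assumption: Scott's rule of thumb for the bump width
        bandwidth = samples.std(ddof=1) * n ** (-1 / 5)
    # Rows index grid points, columns index samples
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    # Averaging the bumps keeps the total area equal to 1
    return bumps.mean(axis=1)

rng = np.random.default_rng(41)
x = rng.integers(0, 9, size=100)      # integers 0..8, upper bound exclusive
xx = np.linspace(-5, 14, 2000)        # wide enough to include the tails
density = gaussian_kde_manual(x, xx)
area = density.sum() * (xx[1] - xx[0])  # simple Riemann sum of the curve
```

Plotting density against xx on top of a histogram drawn with density=True puts both on the same scale, since both integrate to 1.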
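from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(41)
N = 100
x = np.random.randint(0, 9, N)  # upper bound is exclusive: values 0..8
bins = np.arange(10)            # bin edges 0, 1, ..., 9
kde = stats.gaussian_kde(x)     # fit a Gaussian KDE to the samples
xx = np.linspace(0, 9, 1000)    # points at which to evaluate the KDE
fig, ax = plt.subplots(figsize=(8, 6))
ax.hist(x, density=True, bins=bins, alpha=0.3)  # normalized histogram
ax.plot(xx, kde(xx))            # density curve on the same axes
plt.show()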
Here's a solution using seaborn 0.11.1 and pandas 1.1.5:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np
N = 100
nums = [np.random.randint(0, 9) for _ in range(N)]  # upper bound is exclusive: values 0..8
df = pd.DataFrame(nums, columns=["value"])
fig, ax1 = plt.subplots()
sns.kdeplot(data=df, x="value", ax=ax1)  # density curve on the left y-axis
ax1.set_xlim((df["value"].min(), df["value"].max()))
ax2 = ax1.twinx()  # second y-axis for the raw counts
sns.histplot(data=df, x="value", discrete=True, ax=ax2)
plt.show()
Note how I use numpy to generate the random values because I need actual values, not generators. The discrete=True in the histplot call ensures that the bars are centered on the ticks.
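For plain matplotlib or numpy, a similar centering effect can be sketched by placing the bin edges at half-integers, so that each integer value gets its own unit-wide bin. This is only an illustration under that assumption; the names edges, heights and centers are made up for the example. It also sidesteps a pitfall in the question's code: with edges 0, 1, ..., 9 the last bin is closed on both sides, so the values 8 and 9 end up in the same bar.

```python
import numpy as np

rng = np.random.default_rng(0)
nums = rng.integers(0, 10, size=100)   # integers 0..9, upper bound exclusive

# Edges at -0.5, 0.5, ..., 9.5: one bin per integer, centered on it
edges = np.arange(11) - 0.5
heights, _ = np.histogram(nums, bins=edges, density=True)
centers = (edges[:-1] + edges[1:]) / 2  # 0, 1, ..., 9
```

The same edges can be passed as bins=edges to ax.hist, together with density=True, to draw the bars centered on the integer ticks.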