Is there a way to add the mean and median to Seaborn's displot?
import seaborn as sns

penguins = sns.load_dataset("penguins")
g = sns.displot(
    data=penguins, x='body_mass_g',
    col='species',
    facet_kws=dict(sharey=False, sharex=False)
)
Based on "Add mean and variability to seaborn FacetGrid distplots", I see that I can define a FacetGrid and map a function. Can I pass a custom function to displot?
The reason for trying to use displot directly is that the plots are much prettier out of the box, without tweaking tick label size, axis label size, etc., and are visually consistent with other plots I am making.
def specs(x, **kwargs):
    ax = sns.histplot(x=x)
    ax.axvline(x.mean(), color='k', lw=2)
    ax.axvline(x.median(), color='k', ls='--', lw=2)

g = sns.FacetGrid(data=penguins, col='species')
g.map(specs, 'body_mass_g')
Using FacetGrid directly is not recommended. Instead, use other figure-level methods like seaborn.displot.
seaborn.FacetGrid.map works with figure-level methods.
Versions used: python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2.
Use plt. instead of ax. in the mapped function. In the question, the vlines go to the ax returned by histplot, but here the figure is created before .map is called.
import seaborn as sns
import matplotlib.pyplot as plt

penguins = sns.load_dataset("penguins")
g = sns.displot(
    data=penguins, x='body_mass_g',
    col='species',
    facet_kws=dict(sharey=False, sharex=False)
)

def specs(x, **kwargs):
    plt.axvline(x.mean(), c='k', ls='-', lw=2.5)
    plt.axvline(x.median(), c='orange', ls='--', lw=2.5)

g.map(specs, 'body_mass_g')
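Alternatively, compute the statistics separately and add the lines to each axes of the figure returned by displot.
import seaborn as sns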
import pandas as pd
# load the data
pen = sns.load_dataset("penguins")
# groupby to get mean and median
pen_g = pen.groupby('species').body_mass_g.agg(['mean', 'median'])
g = sns.displot(
    data=pen, x='body_mass_g',
    col='species',
    facet_kws=dict(sharey=False, sharex=False)
)

# extract and flatten the axes from the figure
axes = g.axes.flatten()

# iterate through each axes
for ax in axes:
    # extract the species name
    spec = ax.get_title().split(' = ')[1]

    # select the data for the species
    data = pen_g.loc[spec, :]

    # print data as needed or comment out
    print(data)

    # plot the lines
    ax.axvline(x=data['mean'], c='k', ls='-', lw=2.5)
    ax.axvline(x=data['median'], c='orange', ls='--', lw=2.5)
Here you can use sns.FacetGrid.facet_data to iterate the indexes of the subplots and the underlying data.
This is close to how sns.FacetGrid.map works under the hood. sns.FacetGrid.facet_data is a generator that yields a tuple (i, j, k) of row, col, hue index and the data, which is a DataFrame that is a subset of the full data corresponding to each facet.
import seaborn as sns
import pandas as pd
pen = sns.load_dataset("penguins")
# Set our x_var for later use
x_var = "body_mass_g"
g = sns.displot(
    data=pen,
    x=x_var,
    col="species",
    facet_kws=dict(sharey=False, sharex=False),
)

for (row, col, hue_idx), data in g.facet_data():
    # Skip empty data
    if not data.values.size:
        continue

    # Get the ax for `row` and `col`
    ax = g.facet_axis(row, col)

    # Set the `vline`s using the var `x_var`
    ax.axvline(data[x_var].mean(), c="k", ls="-", lw=2.5)
    ax.axvline(data[x_var].median(), c="orange", ls="--", lw=2.5)
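This produces one histogram per species, with a solid black line at the mean and a dashed orange line at the median.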