
How to add a mean and median line to a Seaborn displot

Is there a way to add the mean and median to Seaborn's displot?

import seaborn as sns

penguins = sns.load_dataset("penguins")
g = sns.displot(
    data=penguins, x='body_mass_g',
    col='species',
    facet_kws=dict(sharey=False, sharex=False)
)

Based on "Add mean and variability to seaborn FacetGrid distplots", I see that I can define a FacetGrid and map a function. Can I pass a custom function to displot?

The reason for trying to use displot directly is that the plots are much prettier out of the box, without tweaking tick label size, axis label size, etc., and they are visually consistent with other plots I am making.

def specs(x, **kwargs):
    ax = sns.histplot(x=x)
    ax.axvline(x.mean(), color='k', lw=2)
    ax.axvline(x.median(), color='k', ls='--', lw=2)

g = sns.FacetGrid(data=penguins, col='species')
g.map(specs, 'body_mass_g')

asked May 20 '21 03:05 by a11



2 Answers

  • Using FacetGrid directly is not recommended. Instead, use a figure-level function such as seaborn.displot, which returns a FacetGrid.
    • seaborn.FacetGrid.map also works on the grid returned by a figure-level function.
    • seaborn: Building structured multi-plot grids
  • Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2

Option 1

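  • Use plt. instead of ax.
    • In the OP, the vlines are drawn on the ax that histplot returns inside the mapped function. Here, displot creates the figure before .map is called, so the function only needs to draw on the current axes with plt.
import matplotlib.pyplot as plt
import seaborn as sns

penguins = sns.load_dataset("penguins")
g = sns.displot(
    data=penguins, x='body_mass_g',
    col='species',
    facet_kws=dict(sharey=False, sharex=False)
)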

def specs(x, **kwargs):
    plt.axvline(x.mean(), c='k', ls='-', lw=2.5)
    plt.axvline(x.median(), c='orange', ls='--', lw=2.5)

g.map(specs, 'body_mass_g')

Option 2

  • This option is more verbose but more flexible: the values for the lines can come from a data source other than the one used to create the displot.
import seaborn as sns
import pandas as pd

# load the data
pen = sns.load_dataset("penguins")

# groupby to get mean and median
pen_g = pen.groupby('species').body_mass_g.agg(['mean', 'median'])

g = sns.displot(
    data=pen, x='body_mass_g',
    col='species',  
    facet_kws=dict(sharey=False, sharex=False)
)
# extract and flatten the axes from the figure
axes = g.axes.flatten()

# iterate through each axes
for ax in axes:
    # extract the species name
    spec = ax.get_title().split(' = ')[1]
    
    # select the data for the species
    data = pen_g.loc[spec, :]
    
    # print data as needed or comment out
    print(data)
    
    # plot the lines
    ax.axvline(x=data['mean'], c='k', ls='-', lw=2.5)
    ax.axvline(x=data['median'], c='orange', ls='--', lw=2.5)

Output for both options

[Image: one histogram per species, each with a solid black line at the mean and a dashed orange line at the median]

Resources

  • Also see the following questions/answers for other ways to add information to a seaborn FacetGrid
    • Draw a line at specific position/annotate a Facetgrid in seaborn
    • Overlay a vertical line on seaborn scatterplot with multiple subplots
    • How to add additional plots to a seaborn FacetGrid and specify colors
answered Oct 19 '22 00:10 by Trenton McKinney


Here you can use sns.FacetGrid.facet_data to iterate over the indexes of the subplots and the underlying data.

This is close to how sns.FacetGrid.map works under the hood. sns.FacetGrid.facet_data is a generator that yields a tuple (i, j, k) of row, column, and hue indexes together with the data for that facet, a DataFrame that is the subset of the full data belonging to it.

import seaborn as sns
import pandas as pd


pen = sns.load_dataset("penguins")

# Set our x_var for later use
x_var = "body_mass_g"

g = sns.displot(
    data=pen,
    x=x_var,
    col="species",
    facet_kws=dict(sharey=False, sharex=False),
)

for (row, col, hue_idx), data in g.facet_data():
    # Skip empty data
    if not data.values.size:
        continue

    # Get the ax for `row` and `col`
    ax = g.facet_axis(row, col)
    # Set the `vline`s using the var `x_var`
    ax.axvline(data[x_var].mean(), c="k", ls="-", lw=2.5)
    ax.axvline(data[x_var].median(), c="orange", ls="--", lw=2.5)

Which outputs: a FacetGrid with vlines overlaid for the mean and median

answered Oct 19 '22 01:10 by Alex