How to plot two violin plot series on the same graph using seaborn?

Tags: python, matplotlib, seaborn, violin-plot

Looking at the seaborn documentation on violin plots, I would like to know how to plot two series of violin plots on the same axes (point 1) in such a way that they are comparable (point 2).

Regarding point 1, I would like to reproduce this plot for each sex:

import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset("tips")
fig, ax = plt.subplots()
sns.violinplot(x="day", y="total_bill", hue="smoker",
               data=tips, split=True, ax=ax)

I could do it on two subplots:

fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(211)
sns.violinplot(x="day", y="total_bill", hue="smoker",
               data=tips[tips.sex == "Female"], split=True, ax=ax)

ax = fig.add_subplot(212)
sns.violinplot(x="day", y="total_bill", hue="smoker",
               data=tips[tips.sex == "Male"], split=True, ax=ax)

I would like to plot the two violin plot series on the same matplotlib axes.

The other point concerns the width of the violins. It is not clear to me whether the violins are normalized and, if so, how. I assume the width is computed separately for each plot: in the example above, over the Female data for the first subplot and over the Male data for the second. Can I then compare the densities directly? I suppose I can compare the shapes but not, for example, the number of male smokers on Monday with the number of female ones. Is there a way to control the normalization of the violins?

asked Nov 21 '17 09:11

Ger

2 Answers

For your first point, there is no built-in way of doing this in seaborn. Check out my comments for possible workarounds, but in brief, I don't think the time spent would be worthwhile.

For your second question, the scale and scale_hue parameters of violinplot control how the violin patch is normalized/scaled:

scale : {“area”, “count”, “width”}, optional

The method used to scale the width of each violin. If area, each violin will have the same area. If count, the width of the violins will be scaled by the number of observations in that bin. If width, each violin will have the same width.

scale_hue : bool, optional

When nesting violins using a hue variable, this parameter determines whether the scaling is computed within each level of the major grouping variable (scale_hue=True) or across all the violins on the plot (scale_hue=False).

The defaults are 'area' and True. You can see below how varying these parameters affects the violins. For example, if you want to compare between plots and represent the absolute counts truthfully, you could set scale='count' and scale_hue=False. Note that the violins would still be scaled to the maximum count within each plot (and not within the whole dataset), so in our case the biggest violin for females would represent ~40 observations while the biggest violin for males would represent ~25 observations.

import matplotlib.pyplot as plt
import seaborn as sns

tips = sns.load_dataset("tips")

# One column of plots per sex: a count plot on top, then one violin plot per scale option.
fig, axes = plt.subplots(4, 2, figsize=(10, 16), sharey='row')
axes_cols = (axes[:, 0], axes[:, 1])

for (sex_name, sex), axes_col in zip(tips.groupby('sex'), axes_cols):
    sns.countplot(x="day", hue="smoker", data=sex, ax=axes_col[0])
    for scale, ax in zip(['area', 'count', 'width'], axes_col[1:]):
        sns.violinplot(x="day", y="total_bill", hue="smoker",
                       data=sex, split=True, ax=ax, scale=scale)
        ax.set_title('scale = {}'.format(scale), y=0.95)
sns.despine()
fig.tight_layout()

[Image: one column of plots per sex, with a count plot on top followed by violin plots for scale = area, count and width]

Adding scale_hue=False:

[Image: the same grid of plots drawn with scale_hue=False]
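To make the three scale options concrete, here is a rough NumPy sketch of the rescaling each one implies. It is only an illustration of the idea described above, not seaborn's actual code: the function name scaled_violin_widths, the fixed Gaussian bandwidth and the synthetic groups are all assumptions made for this example.

```python
import numpy as np


def scaled_violin_widths(groups, scale="area", bandwidth=0.5, max_width=1.0):
    """Return (grid, {label: width curve}) under one of three scaling rules.

    `groups` maps a label to a 1-D sequence of observations.
    Hypothetical helper for illustration only.
    """
    arrays = {k: np.asarray(v, dtype=float) for k, v in groups.items()}
    lo = min(a.min() for a in arrays.values()) - 5 * bandwidth
    hi = max(a.max() for a in arrays.values()) + 5 * bandwidth
    grid = np.linspace(lo, hi, 500)

    # Gaussian kernel density estimate of every group on a common grid.
    norm = bandwidth * np.sqrt(2 * np.pi)
    density = {}
    for k, a in arrays.items():
        z = (grid[:, None] - a[None, :]) / bandwidth
        density[k] = np.exp(-0.5 * z ** 2).sum(axis=1) / (a.size * norm)

    if scale == "area":
        # One common factor for all violins, so their areas stay equal.
        peak = max(d.max() for d in density.values())
        return grid, {k: max_width * d / peak for k, d in density.items()}
    if scale == "width":
        # Every violin is stretched to the same maximum width.
        return grid, {k: max_width * d / d.max() for k, d in density.items()}
    if scale == "count":
        # Maximum width proportional to the number of observations.
        n_max = max(a.size for a in arrays.values())
        return grid, {
            k: max_width * d / d.max() * arrays[k].size / n_max
            for k, d in density.items()
        }
    raise ValueError("scale must be 'area', 'count' or 'width'")


# Synthetic data: 25 and 40 observations, echoing the group sizes quoted above.
rng = np.random.default_rng(0)
groups = {"small": rng.normal(20, 3, size=25), "large": rng.normal(20, 6, size=40)}

for scale in ("area", "count", "width"):
    grid, widths = scaled_violin_widths(groups, scale)
    print(scale, {k: round(float(w.max()), 3) for k, w in widths.items()})
```

With 'count', the peak widths are in the ratio of the group sizes, which is why that option is the one to choose when the number of observations should be readable from the plot.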

answered Oct 21 '22 01:10

joelostblom

As Joel Ostblom said in his answer, there is no way to plot several violin plot series natively in seaborn. Here is a solution that follows his hints. I also added count plots on top, as he did in his answer.

I do not really know the tips data set, so the plot below may well be meaningless.

import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="whitegrid", font_scale=1.5)

tips = sns.load_dataset("tips")
# add a new column that combines the two categorical variables you want to plot
tips["sex_time"] = tips[["sex", "time"]].apply(lambda x: "_".join(x), axis=1)

fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(8, 10), 
                         sharex=True, gridspec_kw=dict(height_ratios=(1, 3), hspace=0))

# select the order you want:
order=["Female_Lunch", "Male_Lunch", "Female_Dinner", "Male_Dinner"]

sns.countplot(
    data=tips, x="sex_time", hue="smoker", 
    order=order,
    ax=axes[0]
)

sns.violinplot(
    x="sex_time", y="total_bill", hue="smoker", data=tips, 
    split=True, scale="count", scale_hue=False, inner="stick",
    order=order,
    ax=axes[1]
)
axes[1].set_xticklabels(["Lunch (Female)", "Lunch (Male)", "Dinner (Female)", "Dinner (Male)"])
axes[1].set_xlabel("Time (Sex)")
# the count plot above already carries the legend, so drop the duplicate
axes[1].get_legend().remove()

[Image: count plot above a split violin plot, with the four sex/time groups on a shared x-axis]
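In the code above, the combined column, the order list and the tick labels are typed by hand. As a small pandas sketch, assuming you already know the level order you want (the toy frame and the variable names below are made up for illustration), all three can be derived from the same two lists so that they stay consistent:

```python
import pandas as pd

# Toy stand-in for the tips data; only the columns needed here.
df = pd.DataFrame({
    "sex": ["Female", "Male", "Male", "Female", "Male"],
    "time": ["Lunch", "Lunch", "Dinner", "Dinner", "Dinner"],
    "total_bill": [12.5, 18.0, 25.3, 21.1, 30.4],
})

# Assumed level order: time is the outer grouping, sex the inner one.
time_levels = ["Lunch", "Dinner"]
sex_levels = ["Female", "Male"]

# Combined key, built without a row-wise apply.
df["sex_time"] = df["sex"].astype(str) + "_" + df["time"].astype(str)

# Plot order and human-readable tick labels, generated from the same lists.
order = [f"{s}_{t}" for t in time_levels for s in sex_levels]
labels = [f"{t} ({s})" for t in time_levels for s in sex_levels]

print(order)
print(labels)
```

The resulting order can be passed as order=order to both countplot and violinplot, and labels to set_xticklabels.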

answered Oct 21 '22 01:10

Ger
