I have a pandas DataFrame with the following structure:
   A  B
0  1  1
1  2  1
2  3  4
3  3  7
4  6  8
How do I generate a seaborn violin plot in which each column gets its own violin, so the columns can be compared side by side?
A violin plot can be more informative than a plain box plot. A box plot shows only summary statistics (the median, the quartiles, and the whiskers), while a violin plot also draws a kernel density estimate of the data, so the shape of the distribution is visible. The difference is particularly useful when the distribution is multimodal (has more than one peak).
Seaborn's violin plot API has no parameter for removing extreme outliers, so you have to remove them from the DataFrame yourself before passing the data for plotting.
Grouping Violin Plots by Hue
If you have a categorical variable with two values (typically a true/false-style variable), you can group the plots by hue. For example, you could have a dataset of people with an employment column whose values are employed and unemployed.
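A minimal sketch of hue grouping, using an invented people dataset: the column names `region`, `employment`, and `age` and all values are assumptions for illustration.

```python
import pandas as pd
import seaborn as sns

# Hypothetical long-form data: one row per person.
people = pd.DataFrame({
    'region': ['north'] * 6 + ['south'] * 6,
    'employment': ['employed', 'unemployed'] * 6,
    'age': [25, 31, 38, 44, 52, 29, 23, 35, 41, 47, 58, 33],
})

# One group of violins per region, with a separate violin for each
# employment value within the group.
ax = sns.violinplot(data=people, x='region', y='age', hue='employment')
```

Passing `split=True` as well draws the two hue levels as the two halves of a single violin, which can make the comparison more compact.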
seaborn (at least version 0.8.1; not sure if this is new) supports what you want without messing around with your dataframe at all:
import pandas as pd
import seaborn as sns
df = pd.DataFrame({'A': [1, 2, 3, 3, 6], 'B': [1, 1, 4, 7, 8]})
sns.violinplot(data=df)
(Note that you do need to set data=df; if you just pass in df as the first argument (equivalent to setting x=df in the function call), it seems to concatenate the columns together and then make a single violin plot of all of the data.)
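If the data has to be in long form anyway (for example, to add a hue variable later), a roughly equivalent plot can be built by melting the frame first. This is a sketch of an alternative, not part of the answer above. The names `long_df`, `column`, and `value` are chosen here for illustration.

```python
import pandas as pd
import seaborn as sns

df = pd.DataFrame({'A': [1, 2, 3, 3, 6], 'B': [1, 1, 4, 7, 8]})

# Reshape from wide form (one column per violin) to long form
# (one row per observation, with the original column name as a label).
long_df = df.melt(var_name='column', value_name='value')

# One violin per distinct label in 'column'.
ax = sns.violinplot(data=long_df, x='column', y='value')
```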