When I try to replicate the example here, my violin plots (with my data) don't show the median, along with the 25th and 75th percentiles, but the original example does.
It also seems to ignore the "stick" argument I pass to it.
Here is what I tried:
sns.violinplot(df, "stick", color="pastel")
and here is what I get:

while the original looks like the following (for sns.violinplot(df, color="pastel")):

This problem does not affect boxplots.
Here is a minimal example which results in (very similar) shapes:
> df
A B
0RS0NrQDHHx NaN 19.727869
0RS232Ak80k NaN 32.552973
0RSECe1NRShE NaN 44.369213
0RSHVQNT16d NaN 11.306910
0RSO4JcoLeb NaN -7.935776
0RSOrrpKlRu NaN 39.489909
0RSVIHDWBR1 NaN 52.830051
0RSWe5CE1Hk NaN 26.913323
0RSXhLG3Kp8 -1.921543 NaN
0RSc8uRSessd 27.028029 NaN
0RScRSZoDX72 12.713600 NaN
0RSdwNiizS0 28.859158 NaN
0RSeWHWRSww3 12.537717 NaN
0RSrs6jjCsM 5.135179 NaN
0RStNwVhvO1 -55.566641 NaN
0RStQI2VH5A -15.119272 NaN
0RStWRWmH8V -2.369918 NaN
0RSukeajMJy -0.904298 NaN
0RSvJezMyrx -1.105769 NaN
0RSx5WRStDjG 0.899420 NaN
Try sns.violinplot(df, inner="stick", color="pastel"). The second positional argument is a grouping variable, so "stick" has to be passed by keyword as inner. Note, though, that inner="stick" draws each observation. If you want the 25th, 50th, and 75th percentiles, use inner="box".
Also, to handle a relatively sparse DataFrame with lots of NaNs, e.g.
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(20, 5), columns=list("ABCDE"))
for i, c in zip(range(5, 10), df.columns):
    df.loc[i, c] = np.nan
you could do
# drop the NaNs column by column, so each violin only sees real observations
plot_vals = [v.dropna() for k, v in df.items()]
sns.violinplot(plot_vals, names=df.columns)
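To check what the inner box should be summarizing, it can help to compute the 25th, 50th, and 75th percentiles for one column by hand. The following is only a rough sketch using the standard library: it takes the column B values from the question, adds two NaN entries to stand in for the rows that belong to column A, drops them, and prints the quartiles. The names column_b and values are made up for illustration, and seaborn may interpolate the percentiles slightly differently.

```python
import math
import statistics

# Column B from the question's frame; the two NaN entries stand in for
# rows that only have a value in column A.
column_b = [19.727869, 32.552973, 44.369213, 11.306910, -7.935776,
            39.489909, 52.830051, 26.913323, float("nan"), float("nan")]

# Drop missing values first, mirroring what v.dropna() does per column.
values = [x for x in column_b if not math.isnan(x)]

# method="inclusive" interpolates linearly between the sorted observations.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
print(f"25th: {q1:.3f}, 50th: {median:.3f}, 75th: {q3:.3f}")
```

If the NaNs were left in, the percentile results would no longer be meaningful, which is why the per-column dropna() step comes before plotting.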